        Mascot 2.4: Installation & Setup Manual
         Contents
1. [Figure: Mascot Cluster Topology, showing the external intranet, the Mascot master, a 100Base-T hub, and Mascot nodes 2 and 3 linked by TCP/IP.]

Hardware Requirements

All machines in a cluster should have processors of the same speed. Otherwise, the box with the slower processor(s) will become a bottleneck.

Two network ports are required on the master server: one for external access and one for communication on the private local area network (LAN) that connects the master to the nodes. The LAN for the cluster should run at 100 Mb/s or faster.

The total amount of RAM required in a cluster is a function of how many sequence databases need to be held in memory simultaneously. Mascot supports an unlimited number of databases, but only those that are searched frequently justify being locked in memory. The others can be allowed to swap in and out of memory as needed. For example, a 5-node, 8-processor cluster (non-searching head node) might have 12 GB on the master and 6 GB on each of the search nodes. Assuming memory requirements for the OS are negligible, this gives nearly 24 GB in total for searches and databases. Even though 5 or 10 searches might be running, this should be sufficient to allow the more popular databases to remain resident in
2. [Figure: directory tree. mascot/sequence contains one directory per database (SwissProt, NCBInr), each with incoming, current and old sub-directories.]

For each database, the incoming directory provides a workspace for downloading and expanding a new database file. The current directory contains the active database, and this is where Mascot Monitor creates the memory-mapped compressed files. The old directory is where the immediate past database file is archived, just in case.

In the Mascot configuration, the filename for each database must include a wild card. This is to enable the automatic recognition and exchange of an update file. For example, the filename for the SwissProt database might be defined as SwissProt_*.fasta. This would match filenames that included a release number, e.g. SwissProt_2012_03.fasta, or a date stamp, e.g. SwissProt_20120311.fasta.

Whenever Monitor sees a file in that directory which matches the database name and is not the current database, it will initiate the exchange process. This is why the wild card is important, even though you may not wish to track database dates or revision numbers.

Even if you never intend to swap a database, and have called it, say, SwissProt.fasta, you must still define it in the Mascot configuration using a wild card, as SwissProt*.fasta.

Database File Update Procedure

Mascot Database Manager can update database
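The wild card rule in excerpt 2 can be illustrated with a short Python sketch. It assumes the wild card behaves like a shell-style `*` and uses Python's `fnmatch` module as a stand-in; Monitor's own matching code is not shown in the manual, and the helper name `find_update_candidates` is hypothetical.

```python
from fnmatch import fnmatchcase


def find_update_candidates(pattern, filenames, current):
    """Return names that match the wild card pattern but are not the active file.

    Illustrative only: shell-style matching is an assumption about how
    the database filename wild card is interpreted.
    """
    return [name for name in filenames
            if fnmatchcase(name, pattern) and name != current]


files = [
    "SwissProt_2012_03.fasta",   # release-number style name
    "SwissProt_20120311.fasta",  # date-stamp style name
    "NCBInr_20120310.fasta",     # belongs to a different database
]
candidates = find_update_candidates(
    "SwissProt_*.fasta", files, current="SwissProt_2012_03.fasta")
print(candidates)

# A database that is never swapped still needs a pattern that matches it.
print(fnmatchcase("SwissProt.fasta", "SwissProt*.fasta"))
```

Only the second SwissProt file is returned, because the first is the current database and the third does not match the pattern.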
3. Important: Because Mascot frequently writes status information to nodelist.txt, you should open the file in a text editor that puts a lock on the file (e.g. vi or WordPad). This will prevent Mascot from modifying the file while it is being edited. nodelist.txt can be viewed using Mascot Status.

Node

There must be one or more node entries. Items in square brackets are optional, but the commas must always be supplied.

IP address:Port, Host name, [Number of processors], [OS], [Home dir], [Home dir from master]

IP address, port, and host name must always be specified.

The number of processors to be used on the node can be less than the number of processors available. If the total number of processors specified in all the nodes exceeds the number of licenses available, only the licensed number will be used at any one time.

If the OS is not specified, then the DefaultNodeOS is used. It must be one of the choices shown under DefaultNodeOS.

The home directory is the local path on the node to the root of the Mascot directory structure. If this is not specified, then DefaultNodeHomeDir is used.

Home directory from master is the home directory on the node as seen from the master. This parameter is only applicable to a Windows cluster and must be omitted for a Linux cluster.

Once a cluster has been started, an additional four status values will be written periodically to node
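The Node rules in excerpt 3 (required fields, optional fields that fall back to defaults, commas always present) can be sketched in Python. This is a minimal illustration, not Mascot's parser: the colon between IP address and port, the sample address and host name, and the names `NodeEntry` and `parse_node_line` are all assumptions, and any trailing status values are simply ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeEntry:
    ip: str
    port: int
    host: str
    processors: Optional[int]
    os_name: str
    home_dir: str
    home_dir_from_master: Optional[str]


def parse_node_line(line, default_os, default_home_dir):
    """Parse one hypothetical 'Node ...' line, applying defaults for empty fields."""
    keyword, _, rest = line.strip().partition(" ")
    if keyword != "Node":
        raise ValueError("line must begin with the word Node")
    fields = [field.strip() for field in rest.split(",")]
    if len(fields) < 6:
        raise ValueError("commas must be supplied even for empty optional fields")
    # Anything after the sixth field (e.g. status values) is ignored here.
    address, host, processors, os_name, home_dir, from_master = fields[:6]
    ip, _, port = address.partition(":")  # assumed ip:port separator
    if not ip or not port or not host:
        raise ValueError("IP address, port and host name are required")
    return NodeEntry(
        ip=ip,
        port=int(port),
        host=host,
        processors=int(processors) if processors else None,
        os_name=os_name or default_os,          # falls back to DefaultNodeOS
        home_dir=home_dir or default_home_dir,  # falls back to DefaultNodeHomeDir
        home_dir_from_master=from_master or None,
    )


# Hypothetical entry: OS and both home directories are left empty.
entry = parse_node_line("Node 192.168.0.2:5001, node2, 2, , ,",
                        default_os="Linux",
                        default_home_dir="/usr/local/mascotnode")
print(entry)
```

Keeping the empty fields in place is what lets the parser tell which optional value was omitted, which is why the commas cannot be dropped.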
4. If you chose to configure Mascot as a single (SMP) server, you will see a screen similar to the one above, and can proceed to start Mascot Monitor. If you chose cluster mode, refer to Chapter 11 for further configuration information.

Start Monitor at a shell prompt. (You must have root privileges for this.)

cd /usr/local/mascot/bin
./ms-monitor.exe

Then follow the hyperlink to the Database Status page to register your product key.

Step 6: Licence Registration

[Screenshot: Mascot Server Product Key Registration page (mascot_2_4_0_64, x-cgi/ms-status.exe, Show REGPRODUCTKEY), with links to view current licence information and database status. The page text follows.]

You are about to be transferred to the Matrix Science licensing website to register a new product key. When the registration process has been completed, a licence file will be sent via e-mail which should then be saved into the following directory on this server: /usr/local/mascot_2_4_0_64/config/licdb

Register Online Now

Offline Registration

If you are unable to view this page from a computer that can access the Internet, then click the button below to download a product registration file. You can then transfer this file to another computer which does have Internet access and open the URL shown below in a web browser. When prompted, select the registration file that you saved.

http://www.matrixscience.com/ licensing register

Save Registr
5.
1. Creates compressed files from the databases, checking that the FASTA database files are valid (minor errors in the files are reported as warnings; more serious errors stop the databases from being used).

2. These files can then be mapped into memory to improve search times.

3. Allows swapping and updating of databases without interruption to executing searches. This means that Mascot can be available for running searches 24/7.

4. Deletes old copies of the FASTA databases to stop the disk becoming full (only the most recent copy is kept).

5. Optionally emails a system administrator with serious errors requiring immediate attention. Configuration of email settings in the options section of mascot.dat is described in Chapter 6.

6. Optionally emails users with their results if they didn't wait for them. Configuration of email settings in the options section of mascot.dat is described in Chapter 6.

Sequence Of Events When A New Database Is Added

When a new or updated database is added to a directory, the following sequence of events takes place:

1. If the entry in the mascot.dat file indicates that there should also be a reference file containing full text entries, Monitor looks for a file with the same name as the new file but with a .ref or .dat extension instead of .fasta. If there is no such file, the swap to the new database stops.

2. Compressed index files are made from the .fasta and reference files. For example, the fo
6. Insufficient data segment size will cause a large master_results.pl script to fail and a Mascot search to fail with an error M00000, "Out of memory, malloc <number of> bytes requested".

Swap space

When all physical memory is exhausted, swap space is used. When all swap space is used, no more memory can be allocated and an error will be reported.

There is a different way of setting up swap space on each system; see system documentation.

Mascot shows free swap space for cluster nodes only.

Stack space

Has not been a problem yet.

Thread stack space

This is not normally an issue, since it is increased by all the binaries at run time to 128k.

Installation: Microsoft Windows

Release Notes

Mascot 2.4 is compiled for 32-bit and 64-bit Windows. Refer to the release notes for last-minute additions to documentation and the Matrix Science web site support page for patches and known issues.

http://www.matrixscience.com/mascot_support.html

Cluster Mode

If you have a licence to run Mascot on multiple processors, and plan to do so on a networked cluster of machines, then please familiarise yourself with the material in Chapter 11, Cluster Mode, before proceeding with the installation.

Overview

To install or upgrade Mascot, the following steps need to be performed:

1. Verify that the computer has sufficient memory and disk space.

2. Verify that the computer
7. To override this setting for a particular node, enter the directory on the node line.

DefaultNodeHomeDirFromMaster

This is the directory on the node as seen from the master. For a Windows cluster, this must be present and specified as a UNC name.

The text <host_name> will be replaced by the host name as specified in the Node line.

For a Linux cluster, this parameter must be commented out.

MascotNodeScript

This script is run for each node with the following parameters:

-i  IP address of node (required)
-t  The task to be performed (required), either "StopNode" (the script will try to stop the Mascot Node daemon or service on the specified node) or "StartNode" (the script will unconditionally update ms-mascotnode.exe, mascot.license, and mascot.dat on the specified node, then start the Mascot Node daemon or service)
-f  Full path to the node's home directory (required)
-r  Port number of node (required)
-O  The operating system running on the node (required)

For a Linux cluster, the master and search nodes must be able to communicate using either ssh (preferred) or rsh without requiring passwords or passphrases. In the case of ssh, key-based authentication is the preferred mechanism. A less secure alternative for rsh is provided by file-based authentication using .rhosts or hosts.equiv.

As shipped, load_node.pl executes ms-mascotnode.exe as root on each search node. If this is not acceptable, the
8.
6. When there are no more searches running that use the old database, the files for the old database will be unmapped from memory, and the new files are then mapped into memory.

7. Any files in the old directory for the database which have the same base name as the current files are deleted.

8. The .fasta and .ref files for the outgoing database are moved to the old directory.

9. The compressed index files for the outgoing database are deleted.

Why Memory Map and Lock the FASTA Files?

To speed up the processing of the FASTA files, they should be mapped into memory. Databases can be configured in three operational modes:

1. Without memory mapping. Do not choose this option; it will make searches very slow.

2. Memory mapping the database files, but not locking the memory. This gives the best performance in most cases. When the system gets low on memory, the files are swapped out of memory to disk. On most platforms, this will give better performance than simply relying on the system file cache.

3. Memory mapping the files, and locking the memory. This gives the best possible performance, but does require sufficient RAM for the databases, the operating system, searches, and any other applications that are to run concurrently with Mascot.

In order to reduce the amount of memory required, and to prevent memory fragmentation, the sequence strings from the FASTA database are sav
9. The file system (NFS or a local file system) needs to support file locking and memory mapping. The following files will be locked/unlocked using the fcntl(F_SETLKW) system call: mascot.job, getseq.job, mascot.control, mascotnode.control. If Mascot Daemon, Mascot Distiller or any application using the task management functions in client.pl is used, then there will be a task_id file in each data/yyyymmdd directory that will be locked/unlocked. The following files will be memory mapped for r/w: mascot.control, mascotnode.control. The location of these files can all be specified in the options section of mascot.dat so that, if necessary, they can be put on a local file system.

FASTA files greater than 2 GB are fully supported on ext2, ext3 and ext4 partitions.

System limits

Memory limits

There are several types of memory limits that can stop Mascot from running:

1. Virtual address space. When files are memory mapped, the address space required can be large; the amount of physical RAM + swap space is not an issue here.

2. The amount of memory that can be locked. On most systems, memory can only be locked by root.

3. Physical memory. It is obviously not possible to lock more memory than is physically available.

4. Data segment size. The amount of memory that an executable or Perl script has access to. The default is sometimes too small to run master_results.pl and big searches.

5. Swap space.
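The file system requirement at the start of excerpt 9 (file locking plus read/write memory mapping) can be probed before installation with a small Python sketch. It is not part of Mascot; the function name `supports_lock_and_mmap` is hypothetical, and it relies on the Unix-only `fcntl` module, whose blocking `lockf` call is implemented in CPython with `fcntl(F_SETLKW)`.

```python
import fcntl
import mmap
import os
import tempfile


def supports_lock_and_mmap(directory):
    """Try a blocking exclusive lock and a read/write mmap on a scratch file.

    Returns True if both operations succeed in `directory`, False if the
    file system rejects either of them.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"\0" * mmap.PAGESIZE)   # a mapping needs a non-empty file
        fcntl.lockf(fd, fcntl.LOCK_EX)        # blocking exclusive lock
        with mmap.mmap(fd, mmap.PAGESIZE, access=mmap.ACCESS_WRITE) as mapped:
            mapped[0:4] = b"test"             # write through the mapping
            mapped.flush()
        fcntl.lockf(fd, fcntl.LOCK_UN)        # release the lock
        return True
    except OSError:
        return False
    finally:
        os.close(fd)
        os.unlink(path)


print(supports_lock_and_mmap(tempfile.gettempdir()))
```

Running the check against the directory that will hold mascot.control and the job files, for instance an NFS mount, gives an early indication of whether those files need to be relocated to a local file system.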
10. or enter, which is only shown in ambiguous cases. Bold italic fixed-pitch font indicates a variable for which an appropriate value should be entered.

Introduction

Mascot is a software system for protein identification by matching mass spectrometry (MS) data against FASTA-format protein or nucleic acid sequence databases. This can be done in three different ways:

1. A Peptide Mass Fingerprint (PMF), in which the MS data are peptide molecular masses from the digestion of a protein by an enzyme.

2. A Sequence Query (SQ), also called a sequence tag, in which MS data are combined with amino acid sequence or composition data.

3. An MS/MS Ions Search (MIS), which uses MS/MS data from one or more peptides.

MS data are submitted to Mascot in the form of peak lists, that is, lists of centroided mass values, possibly with associated intensity values. The result of a search is a ranked list of the most closely matching proteins. Mascot uses a probability-based scoring algorithm, so that it is possible to report whether a match is statistically significant. If an exact match is not present in the database, the highest scoring matches will be those entries which exhibit the greatest homology.

Overview

This manual describes how to install, configure and administer Mascot. It is not a User Guide. Mascot includes a linked collection of HTML help pages that provide guidance and application-related reference material for end users.

Mascot
11. refers to the data system on which the Mascot search engine executes. The term "client" is used very loosely. It may refer to a data system attached to a mass spectrometer, or it may refer to any system at which a user interacts with the Mascot server via a web browser.

In a small laboratory, the server and client may be one and the same computer. This doesn't affect installing or using Mascot, but it does introduce additional considerations, such as the need to adjust system priorities to ensure that the instrument control and data acquisition software is responsive to the real-time needs of both instrument and operator.

Configuration

Mascot configuration files are structured text files. Modifications can be made using a browser-based configuration editor and take effect without a system restart.

Search Engine

The Mascot search engine accepts data and parameters on STDIN in MIME format, executes a search of the specified FASTA-format database, and outputs a structured text file containing the search results together with the input data and the complete set of search parameters.

The results file contains everything necessary to repeat the search at a later date, should the need arise. In the default configuration, a new directory is created on the server for each day's results files. If required, the contents of these results files can be parsed into an external database to be queried and analysed.

Monitor

Swapping datab
12. unigene contains sub-directories for species-specific UniGene indexes.

x-cgi is a directory for administrative CGI executables, to which access may need to be restricted. This can be achieved using either Mascot security or web server security.

Installation

Clean Installation

Create a directory for the Mascot program files. In documentation, this is assumed to be called mascot, but any name can be used. This directory should not be in a path mapped to a web server URL.

Version upgrade

If upgrading Perl, do this before upgrading Mascot.

Ensure that no one will try to use Mascot during the upgrade procedure.

Kill the ms-monitor.exe process.

You might wish to make a backup of the existing files before they are overwritten. All configuration files in the config directory, apart from mascot.dat and the security settings, will be overwritten by new files. All results files and sequence databases, apart from SwissProt, will be retained. The installation script will update mascot.dat by adding any new options, but will retain all existing sequence database configuration settings and other options.

Unpack the Mascot file system

Copy the files mascot.tar.bz2 and swissprot.tar.bz2 from the Mascot DVD into the mascot directory, and unpack the archives:

bzip2 -d mascot.tar.bz2
bzip2 -d swissprot.tar.bz2
tar xvf mascot.tar
tar xvf swissprot.tar

For 32-bit Linux, the 32-bit binaries should be u
13. For example, to add a choice of Ferns or human, add the following to the taxonomy file:

[Taxonomy file entry with Title, Include and Exclude lines for "Ferns or human"; ID shown: 3263]

And to add the choice of Not human or mice, add the following to the taxonomy file:

[Taxonomy file entry with Title, Include and Exclude lines for "Not human or mice"; ID shown: 10088]

Note that "all species" or root has the ID "1".

It is, of course, possible to accidentally specify a selection that will result in no species matching, for example include humans and exclude animals.

If you wish to include species in the taxonomy file without having them appear on the search form, the keyword "Hidden" should appear on the line following the title line.

Location and format of species lists

3701    Arabidopsis             scientific name
3701    Cardaminopsis           synonym
3702    Arabidopsis thaliana    scientific name
3702    Arbisopsis thaliana     misspelling
3702    thale cress             preferred common name
3702    thale cress             common name
3702    mouse-ear cress         common name

NCBI Files

The NCBI provides two files that list all the species for which they have one or more sequences. These files are called names.dmp and nodes.dmp. As shown above, names.dmp is a list of scientific names, synonyms and misspellings for the species. From this list, you can easily find the ID for the given species. For example

The file nodes.dmp specifies the tree structure. The first column is a taxonomy ID and the
14. Mascot will reconfigure the sub-cluster to exclude the faulty node and restart.

Configuration

mascot.dat

SubClusterSet X Y

Large clusters can be divided into sub-clusters. X is a unique integer value, 0-based, used to identify the sub-cluster. Y is the maximum number of licensed processors assigned to the sub-cluster. Since a licensed processor is good for up to 4 cores, it may be clearer to think of Y as cores/4. A single cluster must have a single entry with X set to 0.

nodelist.txt

This file is used to define the nodes that belong to the cluster. For a very large cluster, it is advisable to define a few percent of additional nodes as "spares". For example, if 51 nodes with 102 processors were available, and Mascot was configured to use 2 sub-clusters, each of 50 processors, the node with the 2 spare processors could be used to replace a failed node automatically. At start-up, ms-monitor starts each sub-cluster in turn, taking the required number of nodes from nodelist.txt in the order specified in the file. If you wish to override this behaviour, specify a sub-cluster number in nodelist.txt.

# Each line begins with the word Node, followed by a space and
# then a comma-delimited list of configuration parameters:
#   ip address:port
#   computer (host) name
#   maximum number of node CPUs to be used
#   operating system
#   local path to home directory
#   status (0 = available)
#   sub-c
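The "think of Y as cores/4" remark in excerpt 14 can be turned into a small Python calculation. This is only a sketch: rounding up when the core count is not a multiple of four is an assumption, and the helper names are hypothetical.

```python
import math

CORES_PER_LICENSED_PROCESSOR = 4  # "a licensed processor is good for up to 4 cores"


def licensed_processors_needed(cores):
    """Estimate Y for SubClusterSet from a core count (rounding up is assumed)."""
    return math.ceil(cores / CORES_PER_LICENSED_PROCESSOR)


def sub_cluster_line(sub_cluster_id, cores):
    """Format a SubClusterSet X Y entry for mascot.dat."""
    return f"SubClusterSet {sub_cluster_id} {licensed_processors_needed(cores)}"


# A single cluster uses X = 0; here it is sized for 40 cores.
print(sub_cluster_line(0, 40))
```

For 40 cores this prints `SubClusterSet 0 10`, the same form as the single-cluster entry described above.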
15. series, single charge
5   "b" series, double charge
6   "y" series, single charge
7   "y-NH3" series, single charge
8   "y" series, double charge
9   "c" series, single charge
10  "c" series, double charge
11  "x" series, single charge
12  "x" series, double charge
13  "z" series, single charge
14  "z" series, double charge
15  "a-H2O" series, single charge
16  "a-H2O" series, double charge
17  "b-H2O" series, single charge
18  "b-H2O" series, double charge
19  "y-H2O" series, single charge
20  "y-H2O" series, double charge
21  "a-NH3" series, double charge
22  "b-NH3" series, double charge
23  "y-NH3" series, double charge
25  "internal yb" series, single charge
26  "internal ya" series, single charge
27  "z+H" series, single charge
28  "z+H" series, double charge
29  high energy "d" and "d'" series, single charge
31  high energy "v" series, single charge
32  high energy "w" and "w'" series, single charge
33  "z+2H" series, single charge
34  "z+2H" series, double charge

If there are multiple tags for a query, comma-separated groups of these numbers are output for each tag.

hn_qm_drange is output for a query that includes an error tolerant sequence tag. It defines the range of positions within which an unsuspected modification has been located. For a peptide of 1
16. If this is not the case, make the necessary changes, then save mascot.dat. For a 5-node, 10-CPU cluster, typical entries might be:

Cluster
# Enable (1) or disable (0) cluster mode
Enabled 1
# MasterComputerName must be the hostname
MasterComputerName zx80
# Node defaults
DefaultNodeOS Linux
DefaultNodeHomeDir /usr/local/mascotnode
# Following line must be commented out WHEN this is a homogeneous
MascotNodeScript /usr/local/mascot/bin/load_node.pl
# Sub-cluster definition
# Syntax is SubClusterSet X Y where X is the sub-cluster number
# and Y is the maximum number of CPUs to use within the given sub
SubClusterSet 0 10
# Time outs, log files
IPCTimeout 5                    # seconds with no response before timeout
IPCLogging 0                    # no logging = 0, minimal = 1, verbose = 2
IPCLogfile ../logs/ipc.log      # relative path
CheckNodesAliveFreq 30          # seconds between node health checks
SecsToWaitForNodeAtStartup 20   # seconds to wait for node to
end

3. Open mascot/config/not.nodelist.txt in a text editor. Enter configuration information for the cluster. The parameters are fully described below in the Reference section. Save as nodelist.txt. For a 5-node, 10-CPU cluster, typical entries might be:

# Cluster node definitions
#
# Each line begins with the word Node, followed by a space and
# then a comma-delimited list of configuration parameters:
#   ip address:port
#   computer (host) n
17. If set to 1, each charge state will be searched, but only the charge state that gets the highest scoring match is saved to the result file and reported. This is the recommended setting.

Note that this switch only applies to MS/MS queries (including tags). Independent queries are always generated if multiple charge states are specified for molecular mass queries.

CacheDirectory ../data/cache/%Y/%m

Cache files are created to improve performance when viewing large search results. This option specifies the relative path from the cgi directory to the location for saving report cache files. The actual directory will be, for example:

../data/cache/2010/02/uwcuxlsxx3s524f4vnnz3btmni

where the lowest level directory is an md5sum of the .dat filename, the size and last modified date of the .dat file. The tokens are % followed by any of the conversion specifiers supported by the strftime function (http://www.cplusplus.com/reference/clibrary/ctime/strftime/). For example, %Y gets converted to the year as a decimal number including the century, %m to the month as a decimal number (range 01 to 12), and %d to the day of the month as a decimal number (range 01 to 31). The date used will be the last modified date of the .dat file, rather than the time that the search started. See also ResfileCache and ResultsCache.

CentroidWidth 0.25
CentroidWidthCount 1000

CentroidWidth is the width in Daltons of the sliding window used for
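The strftime token expansion described for CacheDirectory in excerpt 17 can be reproduced in Python, whose `strftime` accepts the same %Y, %m and %d specifiers. This sketch covers only the date part of the path; how Mascot derives the final hashed directory name is not reproduced, and the function names are hypothetical.

```python
import os
from datetime import datetime


def expand_cache_template(template, modified):
    """Expand strftime tokens in a CacheDirectory-style template."""
    return modified.strftime(template)


def cache_parent_dir(template, dat_path):
    """Use the last modified date of a results file, as the manual describes."""
    modified = datetime.fromtimestamp(os.path.getmtime(dat_path))
    return expand_cache_template(template, modified)


# A results file last modified on 15 February 2010 (illustrative date).
print(expand_cache_template("../data/cache/%Y/%m", datetime(2010, 2, 15)))
```

With that date the template expands to `../data/cache/2010/02`, matching the parent of the example directory shown above.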
18. MailTransport 2
MonitorEmailCheckFreq 300
SendmailPath /usr/lib/sendmail

Mascot can be configured to use email for two purposes:

1. When the search engine executes as a CGI application, email can be used to send the results of a search to a user who accidentally or deliberately disconnected before the search was complete. This facility can be enabled by setting EmailUsersEnabled to 1 or disabled by setting it to 0.

2. Serious error messages can be emailed to an administrator. This facility can be enabled by setting EmailErrorsEnabled to 1 or disabled by setting it to 0. Error messages that are considered serious are identified in the file errors.html. This file can be found in the root directory of the installation CD-ROM, and is displayed by clicking on the link "Error message descriptions" at the top of the database status page.

A number of parameters are used to define how email should be sent.

MailTransport should be set to one of the following values:

0 for CMC
1 for MAPI
2 for sendmail
3 for Blat

EmailService is the service name (CMC only).

EmailPassword is the password, if any, required to log onto MAPI or CMC.

EmailProfile is the MAPI profile name.

SendmailPath is the path to sendmail, or an equivalent.

EmailFromUser is the name which will appear in the "From" field of the email message.

EmailFromTextName will appear in the "Title" field of the message.

If Email
19. [Screenshot: Mascot search results report. Protein hit 4 is a 60 kDa chaperonin from Stenotrophomonas maltophilia (strain R551-3), gene groL, Mass: 57312, Score: 42, Matches: 1(1), Sequences: 1(1), with a checkbox to include the hit in an error tolerant search or archive report. The peptide match table has the columns Query, Observed, Mr(expt), Mr(calc), ppm, Miss, Score, Expect, Rank, Unique and Peptide. Below it, "Proteins matching the same set of peptides" lists further 60 kDa chaperonin entries from Stenotrophomonas maltophilia (strain K279a) and several Xanthomonas species, each with Score: 42.]
20. There are often many names for one particular species, e.g. homo sapiens, human, man.

3. Names are sometimes misspelled, e.g. homo sapeins.

4. Continual re-classification of species is taking place.

5. Some non-redundant databases only reliably give one species when several submissions from different species have identical sequences.

6. There are differences of opinion regarding the taxonomy "tree" structure.

This section describes how the Mascot taxonomy filter works and how to configure it. Most of the configuration that will be required should be simple to change: for example, the list of species displayed in the search form can be modified easily, and it is fairly simple to download updated taxonomy lists from the vendors of public web sites. However, to modify the configuration to use a new format and a different numbering system is a more complex task that may take some time.

The NCBI keeps a list of taxonomy IDs up to date, and guarantees that the ID for a given species will not change (although some of the names used for that ID may change). Mascot configurations all use the NCBI IDs, but it would be possible to configure Mascot to use a different system.

Modifying the list in the search form window

The list in the search form is taken from the taxonomy file in the mascot/config directory.

[Screenshot: search form showing the Taxonomy drop-down list with "All entries" selected.]
21. [Screenshot: Mascot Configuration, Enzymes page (x-cgi/ms-config.exe). A table of enzyme definitions with the columns Title, Sense, Cleave at, Restrict, Independent and Semispecific, and Edit and Delete links on each row. Titles listed: Trypsin, Arg-C, Asp-N, Asp-N_ambic, Chymotrypsin, CNBr, CNBr+Trypsin, Formic_acid, Lys-C, Lys-C/P, PepsinA, Tryp-CNBr, TrypChymo, Trypsin/P, V8-DE, V8-E, semiTrypsin, LysC+AspN and None, followed by an "Add new enzyme" link.]

Enzyme "None" is a special case, which cannot be modified or deleted. All the other enzyme definitions can be edited or deleted, and new ones added.

The edit page allows you to test a new enzyme definition against a protein

[Screenshot: Mascot configuration enzyme edit page in Microsoft Internet Explorer.]
22. and provided that you do these two things:

a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities. This must be distributed under the terms of the Sections above.

b) Give prominent notice with the combined library of the fact that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work.

8. You may not copy, modify, sublicense, link with, or distribute the Library except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense, link with, or distribute the Library is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

9. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Library or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Library (or any work based on the Library), you indicate your acceptance of this License to do so, and all its terms and conditions for copying, distributing or modifying the Library or works based on it.

10. Each time you redistribute the Library (or any work based on the Libra
23.   and then click Settings.

[Screenshot: Windows Firewall dialog listing the network connections (1394 Connection, Local Area Connection, Wireless Network Connection) and the Security Logging, ICMP, and Default Settings groups]

Choose ICMP Settings, check Allow incoming echo request, and choose OK.

[Screenshot: ICMP Settings dialog with "Allow incoming echo request" checked. The description reads: "Messages sent to this computer will be repeated back to the sender. This is commonly used for troubleshooting, for example, to ping a machine. Requests of this type are automatically allowed if TCP port 445 is enabled."]

Now, go to the Exceptions tab and ensure that File and Printer Sharing is che
24.   daemon   file    data F981122 dat
daemon   release MSDB 20020121 fasta
daemon   queries 8
daemon   num_hits 6
daemon   h1 1A6K 103 1 00 17004 1
daemon   h1_text myoglobin   sperm whale
daemon   reptype concise
daemon   Sigscoreprot 72
daemon   ionquery1 734 992175 from 736 000000 1
daemon   ionquery2 746 992175 from 748 000000 1
daemon   ionquery3 939 992175 from 941 000000 1
daemon   ionquery4 1515 992175 from 1517 000000 1
daemon   ionquery5 1591 992175 from 1593 000000 1
daemon   ionquery6 1853 992175 from 1855 000000 1
daemon   ionquery7 1980 992175 from 1982 000000 1
daemon   ionquery8 2111 992175 from 2113 000000 1
daemon   Selectpeptides 0

For an MS/MS ions search, the output is of the form:

daemon   file    data F981123 dat
daemon   release MSDB 20020121 fasta
daemon   queries 4
daemon   num_hits 1
daemon   h1 Q9XZI2 286 477 1 00 79480 1
daemon   h1_text HEAT SHOCK PROTEIN 70   Crassostrea gigas  Pacific  oyster
daemon   reptype peptide
daemon   Sigscoreprot 72
daemon   ionquery1 1341 784350 from 671 900000 2   query  1
daemon   score1 95 12
daemon   Sigscore1 49
daemon   ionquery2 1614 584350 from 808 300000 2   query  2
daemon   score2 74 55
daemon   Sigscore2 48
daemon   ionquery3 1945 784350 from 973 900000 2   query  3
daemon   score3 89 84
daemon   Sigscore3 47
daem
25.   distribution of files and executables is all handled when Mascot Monitor starts.

Windows

During Mascot installation on a Windows system, the following dialog will be displayed:

[Screenshot: Mascot Server Setup, Cluster Configuration. "Choose whether to use Mascot cluster mode. Your Mascot licence permits you to use cluster mode. If you wish to enable this feature, please select the option below and then click the Configure button to specify the nodes that will be in the cluster." The dialog has an "Enable Mascot cluster mode" checkbox and a Configure button, with the note "At least one node must be defined in the cluster."]

If you enable cluster mode, the Configure button invokes the following dialog:

[Screenshot: cluster node list with the columns Node Address, Port, Processors, UNC Node Path, and Node Directory]

Choose Add to configure each cluster node.

[Screenshot: Mascot cluster node dialog. "Enter the UNC path to the location on the node where Mascot will install its cluster node files. Make sure that this directory path is unique to this node entry." "Enter the equivalent of the above path as seen locally on the node." Under Node Address: "The node name or IP address can usually be determined from the UNC path above. However, you may override these values below if desired." The options are "Use this specific host name" (koala in the example) and "Use this specific IP address".]
26.   engine is run by the web server as a CGI application.

It is also possible to execute the search engine as a "console" or "command line" application. This chapter provides the information that is required to write scripts or applications which interface to the Mascot search engine and associated programs.

Mascot Search Engine

The Mascot search engine, cgi/nph-mascot.exe, accepts command line arguments and a MIME format ASCII text file on standard input (STDIN) containing search data and parameters.

nph-mascot.exe 1 [-commandline] [-f path] [--taskID number] [--sessionID string] < in.asc

The first argument is required, and is a digit, between 1 and 4, which determines the mode of operation:

1  Normal search (MS/MS data, if any, form part of the MIME format input file)
2  Monitor test mode 0
3  Monitor test mode 1
4  Repeat search (the MIME format input file contains a reference to a Mascot results file which may contain MS/MS data)

Optional argument -commandline is a flag. If present, HTML formatted output is not written to STDOUT.

Optional argument -f allows a result file path to be specified. In the absence of this argument, the result file will be written to a daily sub-directory of mascot/data and have the filename F123456.dat, where 123456 is an auto-incremented job number.

Optional argument --taskID is used to specify a unique numeric identifier. This identifier should be
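The command line described above could be assembled from a script along the following lines. This is a minimal sketch, not the vendor's own tooling: the helper names `build_search_command` and `run_search` are hypothetical, and the flag spellings (`-commandline`, `-f`, `--taskID`, `--sessionID`) are assumed to carry leading dashes as shown here.

```python
import subprocess
from typing import List, Optional


def build_search_command(
    exe: str,
    mode: int = 1,
    commandline: bool = True,
    result_path: Optional[str] = None,
    task_id: Optional[int] = None,
    session_id: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for one run of the search engine."""
    if mode not in (1, 2, 3, 4):
        raise ValueError("mode must be a digit between 1 and 4")
    argv = [exe, str(mode)]
    if commandline:
        argv.append("-commandline")  # assumed spelling: suppresses HTML on STDOUT
    if result_path is not None:
        argv += ["-f", result_path]  # explicit result file path
    if task_id is not None:
        argv += ["--taskID", str(task_id)]
    if session_id is not None:
        argv += ["--sessionID", session_id]
    return argv


def run_search(argv: List[str], mime_file: str) -> int:
    """Run the engine with the MIME format input file supplied on STDIN."""
    with open(mime_file, "rb") as stdin:
        return subprocess.run(argv, stdin=stdin).returncode
```

Building the argument list separately from running it makes it possible to log or inspect the exact command before a long search is started.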
27.   one or more accession strings, and an optional text string describing the entry. Apart from the use of the "greater than" character, the precise syntax of the title line varies from database to database. The title line is delimited from the sequence that follows by a platform dependent new line character.

The title line is followed by lines of contiguous sequence characters. Line lengths vary between databases, anything from 60 characters to a thousand or more. Mascot can handle lines up to 50,000 characters long. The end of a sequence is indicated when the following line is either a new title line or the end of the file. For example:

VYEYVRKYAEHRMLVVAEOPLHAMRKGLLDVLPKNSLEDLTAEDFRLLVNGCGEVNVOML
ISFTSFNDESGENAEKLLOFKRWFWSIVERMSMTERQDLVYFWTSSPSLPASEEGFOPMP
SITIRPPDDOQHLPTANTCISRLYVPLYSSKOILKOKLLLAIKTKNFGFV
>104K_THEPA (P15711) 104 KD MICRONEME RHOPTRY ANTIGEN
MKFLILLFNILCLFPVLAADNHGVGPOGASGVDPITFDINSNOTGPAFLTAVEMAGVKYL
OQVOHGSNVNIHRLVEGNVVIWENASTPLYTGAIVTNNDGPYMAYVEVLGDPNLOFFIKSG
DAWVTLSEHEYLAKLOETROAVHIESVFSLNMAFOQLENNKYEVETHAKNGANMVTFIPRN

Mascot doesn't search the Fasta file directly. When a new database is recognised, Mascot Monitor uses the Fasta file to create a set of compressed files. One reason for doing this is to separate the sequence string from the title line, because only the sequence string needs to be memory mapped. In the case of a database with predominantly short sequences, this greatly reduces the amount of memory r
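The Fasta layout described above (a title line starting with ">", followed by sequence lines until the next title line or the end of the file) can be read with a few lines of code. The following is a minimal sketch for illustration only; `read_fasta` is a hypothetical helper, not part of Mascot, and it simply shows the separation of title line and sequence string.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (title, sequence) pairs from the lines of a Fasta file."""
    title = None
    chunks = []
    for raw in lines:
        line = raw.rstrip("\r\n")  # tolerate any platform's new line character
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(chunks)  # a new title line ends the previous entry
            title = line[1:]
            chunks = []
        elif title is not None and line.strip():
            chunks.append(line.strip())  # sequence lines are simply concatenated
    if title is not None:
        yield title, "".join(chunks)  # end of file ends the last entry
```

Sequence lines that appear before the first title line are ignored, so a fragment that starts in the middle of an entry does not produce a bogus record.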
28.   t be more than 10 Mb.

All request parameter names are case insensitive. Any parameter value can be optionally quoted.

DB: mandatory parameter that can only appear once. If several databases are searched then ms-getseq must be called separately for each database.

ACCESSION: must appear at least once and consist of entries in the format

accession_string frameNo

Quotes around accession strings are mandatory. The frame number can be an integer from 0 to 6 and can only be specified for NA databases. Otherwise, an error will be reported. Accessions can be delimited with commas, spaces, tabs or new line characters. Several ACCESSION fields will be merged by ms-getseq.exe into one internally.

SHOWPI: can appear only once; if set to TRUE, pI values are calculated for each sequence and output.

SHOWTITLE: can appear only once; if set to TRUE, a description is output for each db entry.

SHOWLEN: can appear only once; if set to TRUE, the length of the sequence string is output for each db entry.

SHOWSEQUENCE: can appear only once; if set to TRUE, the sequence string is output for every db entry.

SHOWREFERENCE: can appear only once; if set to TRUE, reference lines are output for each db entry.

SESSIONID: an optional parameter that can appear at most once. If no session ID is supplied then ms-getseq can either process the request when security is disabled or try to retrieve the ID from cooki
29.   taxonomy id

SearchControl

Any helper application can call bin/ms-searchcontrol.exe to implement asynchronous automation of search submission. Available commands are:

--status
--result_file_name
--result_file_mime
--result_file_ini
--results
--xmlresults
--create_task_id
--mascot_job_number
--kill_job
--pause_job
--resume_job
--nice_job
--set_to_queued
--version

ms-searchcontrol.exe --status --taskID <number> [--sessionID <string>]

The "status" command will return one of the following:

unknown_id (referring to task ID)
id_assigned (referring to task ID)
error nnnn
running yy
complete
queued
searchcontrol error nnn

where error indicates an error in the search, and will be the Mascot error number or one of:

TASK_ERROR_NO_ERROR = 0
TASK_ERROR_JOB_CRASHED = -1
TASK_ERROR_JOB_KILLED = -2

And searchcontrol error indicates a problem with the ms-searchcontrol.exe program. Values will be one of:

ERR_TASKID_NOERROR = 0
ERR_TASKID_FAILOPEN = 1
ERR_TASKID_FAILCREATE = 2
ERR_TASKID_FAILREAD = 3
ERR_TASKID_FAILWRITE = 4
ERR_TASKID_FAILCLOSE = 5
ERR_TASKID_CHANGEDRECORD = 6
ERR_TASKID_INVALIDMASCOTDAT = 7
ERR_TASKID_MISSINGRESULTSFILE = 8
ERR_TASKID_FILENAMETOOLONG = 9
ERR_TASKID_SESSIONTIMEDOUT = 10
ERR_TASKID_PERMISSIONDENIED = 11

ms-searchcontrol.exe --result_file_name --taskID <
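A helper application that polls the status command needs to turn the returned text into something it can branch on. The sketch below is one way to do that under stated assumptions: `parse_status` and `SearchStatus` are hypothetical names, and because the exact separator between a keyword and its value (space, "=" or ":") is not certain, the pattern accepts any of them.

```python
import re
from typing import NamedTuple, Optional


class SearchStatus(NamedTuple):
    state: str             # e.g. "running", "complete", "error"
    detail: Optional[str]  # error number or progress value, if any


# Keywords that are followed by a value; the longer keyword is tried first.
_WITH_VALUE = re.compile(r"^(searchcontrol error|error|running)[\s=:]+(\S+)$")
# Keywords that stand alone.
_PLAIN = {"unknown_id", "id_assigned", "complete", "queued"}


def parse_status(text: str) -> SearchStatus:
    """Interpret one line returned by the status command."""
    line = text.strip()
    if line in _PLAIN:
        return SearchStatus(line, None)
    match = _WITH_VALUE.match(line)
    if match:
        return SearchStatus(match.group(1), match.group(2))
    raise ValueError("unrecognised status: %r" % line)
```

A polling loop would typically keep calling the status command while the state is "queued" or "running", and stop on "complete", "error" or "searchcontrol error".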
30.   this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. The end-user documentation included with the redistribution, if any, must include the following acknowledgment: "This product includes software developed by the Apache Software Foundation (http://www.apache.org/)." Alternately, this acknowledgment may appear in the software itself, if and wherever such third-party acknowledgments normally appear.

4. The names "Xerces" and "Apache Software Foundation" must not be used to endorse or promote products derived from this software without prior written permission. For written permission, please contact apache@apache.org.

5. Products derived from this software may not be called "Apache", nor may "Apache" appear in their name, without prior written permission of the Apache Software Foundation.

THIS SOFTWARE IS PROVIDED "AS IS" AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE APACHE SOFTWARE FOUNDATION OR ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CON
31.  01  The definition block for NCBI is number 8, and this contains the following:

# TAXONOMY FOR NCBInr using GI2TAXID
Taxonomy_8
Enabled 1   # 0 to disable it
FromRefFile 0
ErrorLevel 0
DescriptionLineSep 1   # ctrl a (hex code 1). For multiple descriptions per entry
SpeciesFiles GI2TAXID gi_taxid_prot.dmp, NCBI names.dmp
NodesFiles NCBI nodes.dmp, NCBI merged.dmp
DefaultRule GI2TAXID CHOP gi 0 9   # The gi number
Identifier NCBI protein FASTA using GI2TAXID
AccFromSpeciesLine gi 0 9
End

To turn off taxonomy for NCBInr, set the Enabled flag to 0.

FromRefFile set to 0 indicates that the taxonomy should be found in the .fasta file rather than in a reference file.

ErrorLevel, here set to zero, controls which warnings or errors are recorded when creating the taxonomy information. If this is set to 0, then an entry is put into the "NoTaxonomyMatch.txt" file for every sequence where no taxonomy information is found. If it is set to 1, then an entry is put into the file "NoTaxonomyMatch.txt" for every sequence that had any gi number without a match. Since some sequences will have up to 200 gi numbers (sources), there is a reasonable chance that some of these entries will not have species information, and this would cause the error files to become very large.

As mentioned earlier, each entry in the description line is separated by a CTRL-A, so DescriptionLin
32.  12  Mixtures in decoy results (if automatic decoy PMF)
13  Peptides (if SQ or MIS)
14  Decoy peptides (if SQ or MIS and automatic ET)
15  Error tolerant peptides (if SQ or MIS and automatic ET)
16  Proteins (if SQ or MIS)
17  Query data (one block for each query)
18  Index

General Notes

1. Values are shown in italics.
2. Label case doesn't matter.
3. Labels are used to assist readability, but kept short to minimise file size.
4. Parameters are grouped logically.
5. Order of blocks is not important except that the index block must be the last block. Presence of blank lines within the index block may cause a problem.
6. Because the MIME type is defined as an unknown application, if this file passes through a mail agent, it will be treated as an "octet-stream" and encoded "base64" for transmission.

Search parameters

--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="parameters"

USERNAME=user name in plain text
USEREMAIL=email address in plain text
SEARCH=PMF
COM=search title text
DB=SwissProt
CLE=Trypsin
MASS=Monoisotopic
MODS=Mod 1,Mod 2
RULES=1,2,5,6,8,9,13,14
--gc0p4Jq0M2Yt08jU534c0p

The Parameters section contains the complete set of parameter values from the search form apart from the contents of the uploaded data file or the query window. Labels must be unique, independent of case. Where a parameter can be multivalued, e.g. mods, the values are list
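As an illustration of the rules above (labels are case insensitive and must be unique), the parameters block could be read as follows. This is a minimal sketch, assuming each parameter line has the form LABEL=value; `parse_parameters` is a hypothetical helper and is not part of the Mascot distribution.

```python
import re
from typing import Dict

# A parameter line: a label of letters, digits or underscores, then "=", then the value.
_PARAM = re.compile(r"^([A-Za-z0-9_]+)=(.*)$")


def parse_parameters(block: str) -> Dict[str, str]:
    """Return {LABEL: value} for one parameters block of a results file."""
    params: Dict[str, str] = {}
    for line in block.splitlines():
        match = _PARAM.match(line)
        if not match:
            continue  # boundary lines, the Content-Type header and blank lines
        label = match.group(1).upper()  # label case doesn't matter
        if label in params:
            raise ValueError("duplicate label: " + label)
        params[label] = match.group(2)
    return params
```

Values are returned as plain strings; how a multivalued parameter is split is left to the caller.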
33.  2500
vmemory 1048576 kbytes
threads 1024

These values will be different for root and a normal user, and possibly different again for the owner of CGI processes (apache or www-data). Since you may not be able to log in as the CGI user, it can be hard to find out what the real values are. If a script or binary is failing in the web browser, try running from the command line as both root and a normal user.

Changing the default limits

There are different utilities and configuration files on every system. Refer to the system documentation.

Detailed Information on each memory limit

This section gives details about how the Mascot software reports errors, and tries to increase the limits where appropriate.

Virtual address space

Mascot executables are compiled as both 32 bit and 64 bit programs. If you use the 32 bit executables, the amount of virtual address space is limited to 3 or 4 GB, according to platform, and this limit cannot be exceeded. However, the default limit may be set lower than this by the operating system.

If memory cannot be mapped, the error M00048 "Failed to create memory map for <filename>. Error <detailed message>" will be displayed and put into the errorlog.txt file.

The amount of memory that can be locked

As well as the obvious limitation of physical memory, there is generally a limit set on the amount of memory that can be locked. Another frequently used term for locked is "wired".

On most systems, memory ca
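Because the limits depend on which user owns the process, it can help to print them from inside the process in question, for example from a small test script run by the web server. The following is a minimal sketch using Python's standard `resource` module (Unix only); `current_limits` is a hypothetical helper, and the choice of which limits to report is an assumption made for illustration.

```python
import resource
from typing import Dict, Tuple, Union

Limit = Union[int, str]


def _show(value: int) -> Limit:
    """Replace the platform's 'infinity' marker with a readable word."""
    return "unlimited" if value == resource.RLIM_INFINITY else value


def current_limits() -> Dict[str, Tuple[Limit, Limit]]:
    """Return (soft, hard) limits for the user running this process."""
    wanted = {
        "address_space": "RLIMIT_AS",        # virtual address space, in bytes
        "locked_memory": "RLIMIT_MEMLOCK",   # memory that can be locked, in bytes
        "open_files": "RLIMIT_NOFILE",
        "processes": "RLIMIT_NPROC",
    }
    limits = {}
    for label, name in wanted.items():
        constant = getattr(resource, name, None)
        if constant is None:
            continue  # this limit is not defined on this platform
        soft, hard = resource.getrlimit(constant)
        limits[label] = (_show(soft), _show(hard))
    return limits


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(label, "soft:", soft, "hard:", hard)
```

Running the same script as root, as a normal user, and under the web server account shows directly how the values differ between them.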
34.  [Screenshot: Mascot search results report dated 21/04/2012, listing protein hits with their peptide match tables (columns Query, Observed, Mr(expt), Mr(calc), ppm, Miss, Score, Expect, Rank, Unique, and Peptide). The visible hits include hit 2, CH60_DROME (Mass: 60885, Score: 175), 60 kDa heat shock protein, mitochondrial, Drosophila melanogaster, and hit 3, CH60_CAEEL (Mass: 60235, Score: 139), Chaperonin homolog Hsp-60, mitochondrial, Caenorhabditis elegans. Each hit has a checkbox labelled "Check to include this hit in error tolerant search or archive report".]
35.  65536
Maximum number of nodes in nodelist.txt 4096

Web Server Configuration

Mascot Directory Structure

The Mascot directory structure is described in Chapter 2, Installation: Linux.

Microsoft Internet Information Services

The Mascot installation program automatically configures Microsoft IIS 5.0 or later.

CGI Timeout

The CGI timeout is set to 1 day, and any searches running longer than this will be terminated. If you wish, you can increase this timeout.

As of Mascot 2.2, the CGI timeout value is set only on the parent node /w3svc/1/root/mascot so that it is inherited by both the cgi and x-cgi nodes. If a value is also set on /w3svc/1/root/mascot/cgi, e.g. from a previous Mascot installation or set by an administrator, then it will override any inherited value.

IIS 5.x and 6.0 (2000, XP, 2003 Server)

At the command prompt, go to the c:\Inetpub\AdminScripts directory. To get the value of the current Mascot cgi timeout:

cscript adsutil.vbs get /w3svc/1/root/mascot/cgi/cgitimeout

If you get an error message saying "not set at this node", go up one level to the mascot node:

cscript adsutil.vbs get /w3svc/1/root/mascot/cgitimeout

If you still get an error message saying "not set at this node", then you should set a value at this node. If cgitimeout was already set at this node or at the cgi node, you can change the value. The default value as set by Mascot will be 86400 se
36.  [Screenshot, continued: further protein hits, all 60 kDa chaperonin entries from Xanthomonas species (including CH60_XANOM, CH60_XANOP, and CH60_XANOR, each with Mass: 57121, Score: 42, Matches: 1(1), Sequences: 1(1)), followed by the table "Peptide matches not assigned to protein hits (no details means no match)" with the columns Query, Observed, Mr(expt), Mr(calc), ppm, Miss, Score, Expect, Rank, Unique, and Peptide.]
37.  a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this License. (Exception: if the Program itself is interactive but does not normally print such an announcement, your work based on the Program is not required to print an announcement.)

These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Program.

In addition, mere aggregation of another work not based on the Program with the Program (or with a work based on the Program) on a volume of a storage or distribution medium does not bring the other work under the scope of this Licen
38.  an empty string, "false", "FALSE", "False", "0", "off". All missing parameters are defaulted to the "false" value.

Translation table number is always output as well as taxonomy Id and scientific name.

Output format

In response to any POST request, XML format output is returned. Encoding UTF-8 is to be used for output. XML output is schema validated and schema versioned. All XML output must be XML escaped using the following substitutions:

>  &gt;
<  &lt;
&  &amp;
'  &apos;
"  &quot;

Taxonomy information is returned in the order requested. A <msgt:frame> element will only be output for an NA database.

The example input file would produce output similar to this (edited for brevity):

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<msgt:ms_gettaxonomy_out xmlns:msgt="http://www.matrixscience.com/xmlns/schema/msgettaxonomy_1" majorVersion="1" minorVersion="0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.matrixscience.com/xmlns/schema/msgettaxonomy_1 msgettaxonomy_1.xsd">
<msgt:results jobid="874">
<msgt:db_entry>
<msgt:db>SwissProt</msgt:db>
<msgt:accession_str>RL19_YEAST</msgt:accession_str>
<msgt:title>&gt;sp|P05735|RL19_YEAST 60S ribosomal protein L19 OS=Saccharomyces cerevisiae GN=RPL19A PE=1 SV=5</msgt:title>
<msgt:all_accessions>
<msgt:accession>
39.  be found in the file config/apache.conf.

Installation Script

Step 1: Web Server Operation

Launch a JavaScript aware web browser, and navigate to the URL corresponding to install.html, e.g. http://your.domain/mascot/install.html

Follow the instructions on this web page and those that follow to perform some simple system checks and create or update the Mascot configuration file (mascot.dat).

Step 2: Perl

If you get an error message, or a "File Save As" dialog box, after clicking on the "Test Perl" button, then Perl is not functioning correctly. This must be corrected before proceeding. Possible reasons for this problem are listed along with useful links.

- Perl is not installed, or was installed incorrectly.
- Perl, or a soft link to Perl, was not found at /usr/local/bin/perl
- The mascot/cgi directory is not configured for CGI execution
- JavaScript is disabled

If Perl is functioning correctly, the next page displays the Perl version number. If any of the required modules are missing, there will be a warning. Instructions for installing the required modules can be found towards the beginning of this chapter.

Step 3: GD Graphics Library

Assuming Perl and the required modules are present, click on the button to test the GD Graphics library. If GD is installed and working, the next page contains a small graphic to confirm this:

[Image: "GD OK"]

If you do not see this picture, or get an error message, refer to
40.  The Microsoft web server for Vista is IIS 7.0, which is provided as part of the standard distribution. A default installation of IIS 7.0 does not support running a CGI application such as Mascot. From the Control Panel, choose "Programs and Features". Choose "Turn Windows features on or off". Expand the node for Internet Information Services and ensure that all the checkboxes shown below are checked, in addition to any default selections. Then, choose OK.

In Vista Home Premium, the IIS 7 simultaneous request execution limit is 3. In Vista Business, Enterprise, and Ultimate Editions, the limit is 10. This will limit the number of simultaneous searches that can be run from a simple web browser form.

Server 2008 (including R2)

Mascot will run under all Server 2008 editions except for Core.

It is advisable to ensure that the latest service pack has been installed. Check the following URLs for current information:

http://msdn.microsoft.com/en-us/windowsserver/bb794698
http://support.microsoft.com/ph/1163#tab1

The Microsoft web server for Server 2008 is IIS 7.0, and IIS 7.5 for Server 2008 R2. From the Control Panel, choose Turn Windows features on or off to launch Server Manager. Select Go to Roles, scroll down to Web Server (IIS), and choose Add Role Services. Then follow the configuration notes under the Windows Vista section, above.

Windows 7

Mascot will run under all Windows 7 edit
41.  command line

This pseudo user is always used when running programs from the command line, and can perform any task without restriction. This "user" doesn't appear in the security administration utility and hence the account cannot be deleted or disabled. The userid is 3.

daemon

This user should be used to run searches in Mascot Daemon. See the Mascot Daemon help for details. The user account is disabled by default, so it will need to be enabled before use. The userid is 4.

public_searches

This is a pseudo user that is used for the example searches. This "user" doesn't appear in the security administration utility and hence the account cannot be deleted or disabled. It isn't possible to login as this user. The userid is 5.

system

The Mascot Integra system account is used to query data on the Mascot server. Do not change the name of this account or the type of the account. There is no password associated with this account since it can only be called from the secure Mascot Integra server. The userid is 6.

Types of user

Six "types" of user are available, and the appropriate type should be selected using the drop down list in the administration screen.

Standard Mascot User

The user name and password are stored by Mascot.

Mascot Integra User

The password, password expiry and timeouts for these users are set in the Mascot Integra administration screens.

The standa
42.  conforms to a client/server architecture, and the primary user interface is a JavaScript aware web browser. Searches can be submitted from web browser forms, customised for different types of searches, or from a variety of client software. Mascot Daemon is a client application, bundled with Mascot Server, for batch automation of search submission. Mascot Distiller is a powerful application, licensed separately, that can process a wide range of native file formats into peak lists, submit searches to a Mascot Server, and import the search results for examination or further processing. There are also a number of third party clients, including many mass spectrometry data systems that support search submission to Mascot.

In most cases, the Mascot search engine is executed as a CGI program. On completion of a search, it calls a Perl CGI script that reads the results file and returns an HTML report (or some other machine readable digest of the results) to the client. Links to additional CGI scripts provide more detailed views of the results.

[Figure: Mascot architecture diagram. Labels: MS Data System, Mass Spectrometer, Server, HTML & CGI scripts, Mascot search results, Mascot Search Engine, Database Management, FASTA sequence databases, FTP, Public sequence databases]

Mascot Components

In this manual, "server"
43.  [Screenshot: firewall rule list showing the Core Networking entries]

There will be two entries for Apache, one for UDP protocol and one for TCP. Double click the TCP row. On the "Protocols and Ports" tab, configure as shown:

[Screenshot: rule properties, Protocols and Ports tab, with Specific Ports set to 80]

On the Advanced tab, check all three profiles (domain, private, public) and Apply. Back in the top level dialog, with the Apache TCP row selected, choose "Enable rule".

Security

Mascot security is disabled on installation. To enable Mascot security, refer to Chapter 12.

Miscellaneous

LCQ DTA

This utility, an option on the Mascot search form selection page, makes it possible to upload a Thermo .raw file, as opposed to a peak list, w
44.  dat taxonomy definition, to specify, at a database level, which code is to be used.

For further information on genetic codes, see:
http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c

Modifying the "Taxonomy lineage" link

In the protein view, a link to taxonomy lineage is shown.

[Screenshot: Mascot Search Results, Protein View, for a match to 143E_HUMAN (14-3-3 protein epsilon). The page lists the entries this sequence is common to (143E_HUMAN from Homo sapiens, Rattus norvegicus, Mus musculus, and Ovis aries), the nominal mass of the protein (Mr) as 29155, the cleavage rule "Cleavage by Trypsin: cuts C-term side of KR unless next residue is P", and the protein sequence with matched peptides shown in bold red.]
45.  data     ppm  243501029130836  Content Disposition  form data     243501029130836  Content Disposition  form data     0 1  243501029130836    Chapter 8  I O File Formats 147    name     COM       name   DB       name   CLE       name   PFA       name     QUANTITATION       name   TAXONOMY       name   MODS       name  IT MODS       name   TOL       name   TOLU       name  PEP ISOTOPE ERROR       name  TTOL       148 Mascot  Installation and Setup    Content Disposition  form data  name   ITOLU       Da  243501029130836  Content Disposition  form data  name   CHARGE       1   243501029130836  Content Disposition  form data  name  MASS       Monoisotopic   243501029130836  Content Disposition  form data  name   FILE      filename  test_search mgf     Content Type  application octet stream    BEGIN IONS  PEPMASS 498  272888  CHARGE 1   157 096962 23 72  185 160000 26 69  286 134951 80 7  385 210000 13 49    2000 120000    3 142  2000 568167 4 108  2001 020697 2 098  2001 820000 1 103  END IONS  243501029130836    Content Disposition  form data  name   FORMAT       Mascot generic  243501029130836  Content Disposition  form data  name   PRECURSOR       243501029130836  Content Disposition  form data  name   INSTRUMENT       ESI QUAD TOF  243501029130836  Content Disposition  form data  name  REPORT       AUTO  243501029130836       Chapter 8  I O File Formats 149    Results File    The results file contains the search results together with the search  input parameters and peak li
46. files automatically to a specified schedule. This section describes how to update the files for a database if your Mascot Server is not connected to the Internet or if you choose not to use Database Manager.

When a new release of a database becomes available, it should be copied or downloaded into the incoming directory. In many cases, the downloaded file will have to be decompressed. The filename may or may not be constant from release to release.

The Fasta database should be renamed to a name that includes a version or date stamp and matches the wild card path for the database, then moved to the current directory. Never copy a large file to the current directory under its final name, because the copy takes time and the exchange process may be triggered before it is complete.

If you are using a local reference file, rename and move this file first. Otherwise, the exchange process will be triggered by the appearance of the Fasta file, but will immediately fail because the new reference file is not yet available. Note that Fasta and reference files must have identical names apart from the filename extension.

Once Mascot Monitor sees a new Fasta file that matches the wild card path for the database, it will begin the exchange process. Progress can be monitored from the Mascot Database Status page.

Obtaining Fasta files

If your Mascot Server has an Internet connection, and you are able to use Database Manager, ignore the information in this secti
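The manual update procedure can be sketched in Python. This is only an illustration under stated assumptions, not part of Mascot: the function name, the `.partial` suffix, and the directory layout are hypothetical, and the sketch assumes that a temporary name ending in `.partial` does not match the database wild card. Each file is copied into the current directory under the temporary name and then renamed, so the slow copy never happens under the final name, and the reference file is published before the Fasta file.

```python
import shutil
from pathlib import Path


def publish_database(staged_fasta, current_dir, final_name, staged_ref=None):
    """Place a prepared Fasta file (and optional reference file) in `current`.

    Files are copied under a temporary name first and renamed afterwards.
    A rename inside one directory is quick, so the final name never
    belongs to a half-written file.
    """
    current = Path(current_dir)
    pairs = []
    if staged_ref is not None:
        # Reference file goes first: same base name, different extension.
        ref_name = Path(final_name).with_suffix(Path(staged_ref).suffix).name
        pairs.append((Path(staged_ref), current / ref_name))
    pairs.append((Path(staged_fasta), current / final_name))

    published = []
    for source, target in pairs:
        # ".partial" is a hypothetical suffix assumed not to match the wild card.
        temp = target.with_name(target.name + ".partial")
        shutil.copyfile(source, temp)  # slow step, under the temporary name
        temp.rename(target)            # quick step, under the final name
        published.append(target)
    return published
```

A call such as `publish_database("incoming/uniprot_sprot.fasta", "current", "SwissProt_2012_03.fasta")` would leave only the final file name visible in the current directory once the copy is complete.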
47. for clarity). The individual columns contain the following information:

Column 1: Mascot job number. Job numbers are allocated sequentially, but will appear in the log in the order in which searches are completed. If the submitted search contained an error which prevented the search starting, there will be no entry in searches.log, but there should be an entry in errorlog.txt.

Column 2: Process ID.

Column 3: Sequence database searched.

Column 4: User name. User names are required by the (JavaScript) search forms, but not by the search engine, so this field may be empty. If an entry logs utility program activity, rather than a search, this field contains the name of the utility, e.g. TESTPARSE or GETSEQ.

Column 5: User email address. User email addresses are required by the (JavaScript) search forms, but not by the search engine, so this field may be empty.

Column 6: Search title. Empty if none supplied.

Column 7: Relative path to Mascot search results file.

Column 8: Start time in the format illustrated in the example above.

Column 9: Duration in seconds.

Column 10: Completion status, normally "User read res". If EmailUsersEnabled is set to 1, and the user disconnected before the search was complete, this entry would read "user emailed".

Column 11: Job priority. Not currently implemented.

Column 12: Type of search: PMF, SQ, or MIS.

Column 13: Enzyme. Either yes, if user selected an enzyme 
48.  freely  A line which starts with  the   character  pound in the US  hash in Europe  is a comment line     Databases    Do not modify this section if you ever use Database Manager    Databases      NCBInr c  inetpub mascot sequence NCBInr current   NCBInr_  fasta AA 1234 1411 10067 0 8   SwissProt c  inetpub mascot sequence SwissProt current   SwissProt_  fasta AA 1234 15 11 101 33 13 15 3    end    A line that is commented out with a   character at the start is an inac   tive database definition  Each line defines a database using the following  14 parameters     76 Mascot  Installation and Setup    1  Name  Each database must have a unique name  Ideally  the name  should be short and descriptive  Note that these names are case sensi   tive  and much confusion can be caused by creating  say  Sprot and  SPROT  The name does not need to be the same as or even similar to the  filename of the actual FASTA file  Allowed characters are alphanumerics  and   S  amp          2  Path  FASTA database files must be available locally  Mascot creates  its compressed files in the same directory as the original FASTA file  The  location of the FASTA file is defined in the Path field  This must be the  fully qualified path to the FASTA file  with a wild card in the filename to  allow incoming and outgoing database files with different version or date  stamps to be present in the current directory simultaneously  The  delimiters between directories must always be forward slashes  even if  Mas
49. [Screenshot: Mascot search status page (ms-status.exe), showing version and licence details, links to the search log, monitor log, and error log, the SwissProt database entry with status "In use", and a Cluster Nodes table listing each node with its IP address, OS, responding state, physical memory, swap file, and disk space]

If all is well, you will see rows of ha
50. in httpd.conf in the Mascot config directory. Also, ensure that ForkForUnixApache in the Options section of mascot.dat is set to 1. Further information on web server configuration can be found in Appendix D.

Installation is finished, but don't clear the checkbox.

Licence Registration

If you cleared the checkbox at the end of the installation wizard, from the Windows Start menu, choose Programs, Mascot, Admin, Database Status. The following screen will be displayed in your default web browser:

[Screenshot: Mascot Server Product Key Registration page. Online registration transfers you to the Matrix Science licensing website, after which a licence file is sent by e-mail and should be saved into C:\inetpub\mascot\config\licdb on the server. An offline option downloads a product registration file that can be submitted from another computer with Internet access at http://www.matrixscience.com/licensing/register]
51. included the

[Screenshot: installer licence agreement page, with the "I accept the terms in the Licence Agreement" checkbox]

[Screenshot: installer "Product Key" page, "Prepare for product key registration". It states that a product key for Mascot Server is needed before proceeding, that the key must be registered after installation to obtain a licence, and that Mascot Server will not function until registration has been completed]

This is a reminder that you will need to register a product key to create a licence file. This product key may be printed on a sticker on the CD case, or it may have been sent by email. If you cannot locate your product key, contact support@matrixscience.com for assistance. The next screen allows you to choose which components will be installed.

[Screenshot: Mascot Server Setup, "Custom Setup" page, with a feature tree that includes IIS Web Site, Apache Web Site, and Sequence Databases (SwissProt 2012_03)]
52. is there a taxonomy section in mascot.dat for that number?

When the compressed files are built, the taxonomy index has the name database_name.t00. If this file doesn't exist for the database, it may be necessary to stop Mascot Monitor, delete the *.stats file for the database, and restart Monitor.

How Mascot gets a species ID for sequences

This section contains complex configuration information. It is normally only necessary to read and understand this section when adding a new database of a different type.

When ms-monitor creates the compressed files, it also makes a file containing the taxonomy ID(s) for each sequence. To do this, it needs to follow certain rules. These rules are defined in mascot.dat. The rule number for each database is specified as the 14th parameter in the Databases section of the mascot.dat file. To help explain these rules, the following sections describe them for NCBInr, SwissProt, and EST_others.

All text searches and comparisons are case insensitive, except where stated. Taxonomy definition keywords in the mascot.dat file are also case insensitive.

Several taxonomy definition blocks are obsolete, and retained only for backwards compatibility. Only the current definitions are described below.

NCBInr

This non-identical protein database from the NCBI may contain multiple title lines for each sequence. The titles are separated by a control "A" character code
53. it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License along with this library; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Also add information on how to contact you by electronic and paper mail.

You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the library, if necessary. Here is a sample; alter the names:

Yoyodyne, Inc., hereby disclaims all copyright interest in the library "Frob" (a library for tweaking knobs) written by James Random Hacker.

signature of Ty Coon, 1 April 1990
Ty Coon, President of Vice

That's all there is to it!

Regex

Copyright 1992, 1993, 1994 Henry Spencer. All rights reserved. This software is not subject to any license of the American Telephone and Telegraph Company or of the Regents of the University of California.

Permission is granted to anyone to use this software for an
54.  mixtures found  hl_score total score for mixture 1   hl _numprot number of proteins in mixture 1  hl_nummatch number of queries matched   hl _ml accession string for protein component 1  hl _m2 accession string for protein component 2    160 Mascot  Installation and Setup    hl_mm accession string for protein component m  h2_score     hn_mm      gc0p4Tq0M2Yt 08jU534c0p  The Mixture section is only output for a peptide mass fingerprint  If any  statistically significant protein mixtures are found  the mixture compo   nents are summarised  For details of individual components  use the  accession strings to refer back to the Summary section     If this is an automatic decoy database search  a second mixture block  appears  containing the second set of results  The section name is  decoy_mixture  The syntax of the contents is identical    Peptides      gc0p4Jq0M2Yt084jU534c0p  Content Type  application x Mascot  name  peptides       ql_pl missed cleavages    1 indicates no match   peptide Mr   delta     number of ions matched   peptide string   peaks used from Ions1   variable modifications string   ions score   ion series found   peaks used from Ions2   peaks used from Ions3      accession string     data for first protein  frame number   start   end   multiplicity      accession string     data for second protein  frame number   start   end   multiplicity   etc   ql_pl_et_mods modification mass   neutral loss mass   modification description  ql_pl_et_mods master neutral loss m
55. number> [--sessionID <string>]

This will return either the results file name:

filename=<filename>

or:

searchcontrol_error=nnn

with values of "nnn" as for "--status". Note that <filename> may be empty for some states; this is not an error.

This may then be used from the command line for other applications to provide functionality that is not in ms-searchcontrol.exe. For example, a client application needs the USER name from a search. In this case, a perl script "getusername.pl" could be written that takes the passed unique task ID, finds the results file name using "ms-searchcontrol.exe --result_file_name", and then looks for the user name in the results file.

ms-searchcontrol.exe --result_file_mime --taskID <number> [--sessionID <string>]

This will return the results file as a mime format file or searchcontrol_error=nnn, with values of "nnn" as for "--status".

ms-searchcontrol.exe --result_file_ini --taskID <number> [--sessionID <string>]

This will return the results file as a Windows ".ini" format file or searchcontrol_error=nnn, with values of "nnn" as for "--status".

ms-searchcontrol.exe --results --taskID <number> [--sessionID <string>]

If the job is complete, then this will return the search results in a format recognised by Mascot Daemon.

For a peptide mass fingerprint, the output is of the form:
56.  obtained from the SearchControl utility   described later in this chapter  By specifying an identifier  progress  reports and search results can be obtained asynchronously from  SearchControl     Optional argument   sessionID is used to specify a Mascot security  session identifier   see Chapter 12      The file piped to STDIN must be a MIME format file containing the  search parameters and mass spectrometry data     Monitor test mode has a different syntax     nph mascot exe 2 3 path  number   lt  in asc    Required argument path is the path to a flag file  e g     data test   SwissProt_2011_ 06  fasta  bu253neb5renpqtv2jiiannc2y testedok  and optional argument number is the cluster number  The input file  e g      data test SwissProt asc  is created automatically from the  do_not_delete asc template     The Monitor application must be running before search engine can be  invoked  During search execution  warnings  errors  progress reports  etc   are written to standard output  STDOUT   This output is formatted as  HTML text for viewing on a web browser  If the search engine is not  being executed as a CGI application  the calling application may need to  parse the output to remove the HTML tags     When a search is complete  an HTML string is written to STDOUT   which causes the client browser to invoke the script defined in   mascot  dat for displaying a results report   master _results pl or  master results 2 p1   Ifthe search engine is not being executed as a  CGI appli
57.  or no  if  user selected enzyme type None      Column 14  User IP address    Monitor Log    Mascot Monitor activity  such as sequence database exchange  is logged  to logs monitor log  The following extract shows a typical example of  the contents        Fri Apr 20 17 21 28 2012   ms monitor 2 4 0 started   Fri Apr 20 17 21 28 2012   Locked memory for file    data mascot control   Fri Apr 20 17 21 28 2012   Waiting for valid licence   Fri Apr 20 17 30 28 2012   Licensed to  Edman University  XQ5P TFRR 3APW FB33 7H6X        Fri Apr 20 17 30 28 2012   Starting up to Checking that Mascot  Nodes exist   Fri Apr 20 17 30 28 2012   Checking that Mascot Nodes exist to Loading DB informa   tion   Fri Apr 20 17 30 28 2012   Loading DB information to Started up success   fully   Fri Apr 20 17 30 29 2012   SwissProt0O Not in use to Preparing to run  ist test   Fri Apr 20 17 30 29 2012   SwissProt0O Preparing to run lst test to Waiting   Fri Apr 20 17 30 30 2012   SwissProt0O Waiting to About to compress  files   Fri Apr 20 17 30 30 2012   SwissProt0O About to compress files to Creating compressed  files    Fri Apr 20 17 30 33 2012   Creating compressed files from  usr local mascot sequence   SwissProt current SwissProt_2012 03 fasta   Fri Apr 20 17 30 33 2012   Creating compress file  usr local mascot sequence SwissProt   current  SwissProt_2012 03 100   Fri Apr 20 17 30 33 2012   Creating compress file  usr local mascot sequence SwissProt   current  SwissProt_2012_03 s00   Fri Apr 20 
58.  proteins     EMBL EST      TAXONOMY FOR EMBL EST  Taxonomy 13    Identifier EMBL EST Fasta   Enabled 1   0 to disable it   FromRefFile 0   ErrorLevel 0   SpeciesFiles ACC2TAXID acc_to taxid mapping txt  NCBI names dmp  NodesFiles NCBI  nodes dmp  NCBI merged dmp   DefaultRule ACC2TAXID  CHOP      gt EM_EST    A Z0 9         GencodeFiles NCBI  gencode dmp   MitochondrialTranslation 0   end    The ACC2TAXID identifier is used to identify a file that contains a  simple mapping of accession to taxonomy ID  It has two values per line     Accession taxonomyID  For example     A00001 10641  A00002 9913    Chapter 9  Taxonomy 177    where A00001 and A00002 are accessions and 10641 is the NCBI tax   onomy id for    Cauliflower mosaic virus    and 9913 is the NCBI taxonomy  id for    Bos taurus       The accession and ID can be separated using any white space  The  acc_to_taxid mapping txt file from the EMBL contains entries for all the  EMBL EST databases so is very large  approximately 3Gb   The file is  created at the start of each EMBL release  every 3 months   and so does  not include the latest entries in the    cumulative    Fasta files     Performance when creating the compressed files is faster if the order of  entries in the taxonomy file is the same as the order of sequences in the  fasta file  When the ACC2TAXID file is first used or is updated  lookup  files   cdb  are created in the taxonomy directory  These files are only  used when compressing the database and are not 
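To make the ACC2TAXID file format concrete, here is a minimal Python sketch that reads "accession taxonomyID" pairs separated by any white space. It is an illustration only and not how Mascot itself reads the file; the function name is hypothetical. Because the real file is very large, the function accepts any iterable of lines (for example an open file object) so that the file can be streamed rather than loaded in one piece.

```python
def parse_acc2taxid(lines):
    """Return {accession: taxonomy_id} from 'accession taxonomyID' lines."""
    mapping = {}
    for line in lines:
        fields = line.split()   # any run of white space separates the two values
        if len(fields) != 2:
            continue            # skip blank or malformed lines
        accession, tax_id = fields
        mapping[accession] = int(tax_id)
    return mapping


# The two example entries from the text, one space-separated, one tab-separated.
example = ["A00001 10641", "A00002\t9913"]
print(parse_acc2taxid(example))  # {'A00001': 10641, 'A00002': 9913}
```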
59. re-centroiding profile data. Must be a floating point number between 0 and 10. Re-centroiding is applied whenever the number of peaks in a single scan exceeds CentroidWidthCount.

DecoyTypeNoEnzyme 3
DecoyTypeSpecific 1

These parameters determine how decoy sequences are created for Mascot Auto-decoy searches. DecoyTypeSpecific applies to MS/MS searches using fully specific or semi-specific enzymes. DecoyTypeNoEnzyme applies to MS/MS searches with no enzyme. For PMF, random protein sequences are used, whatever the settings. For NA databases, the sequences are randomized before translation. Classifications are based on G. Wang et al. (2009), "Decoy Methods for Assessing False Positives and False Discovery Rates in Shotgun Proteomics", Anal Chem 81(1), 146-159. Values supported in Mascot 2.4 are:

1  Reverse the sequence of each protein entry.

Chapter 6: Configuration & Log Files

3  For each protein entry, generate a random sequence of the same length, with the composition based on the average composition of the whole database. This is the default in Mascot 2.3 and earlier.

4  Digest each protein sequence into peptides, then generate a random sequence for each peptide, but keep the same terminal residues and don't introduce new cutting sites.

EmailErrorsEnabled 0
EmailFromTextName
EmailFromUser
EmailPassword
EmailProfile
EmailService
EmailTimeOutPeriod 120
EmailUsersEnabled 0
ErrMessageEmailTo
MailTempFile C:/TEMP/MXXXXXX
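The first two decoy methods can be illustrated with a short Python sketch. This shows the general idea only and makes no claim to reproduce Mascot's own implementation; the function names are hypothetical, and the "database" is assumed to be a simple list of protein sequence strings.

```python
import random
from collections import Counter


def reverse_decoy(sequence):
    """Method 1: reverse the sequence of a protein entry."""
    return sequence[::-1]


def random_decoy(sequence, database, rng=None):
    """Method 3: random sequence of the same length, with residues drawn
    according to the average composition of the whole database."""
    rng = rng or random.Random()
    counts = Counter()
    for entry in database:
        counts.update(entry)           # accumulate residue counts per entry
    residues = sorted(counts)
    weights = [counts[r] for r in residues]
    return "".join(rng.choices(residues, weights=weights, k=len(sequence)))


print(reverse_decoy("MDDREDLVYQ"))  # QYVLDERDDM
```

Passing a seeded `random.Random` instance as `rng` makes the random decoys reproducible, which is convenient when comparing runs.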
60.  returned task_id    e You can monitor   control the running search using ms   searchcontrol exe    A simpler     static    system could be implemented by adding a  SUBCLUSTER command to a Daemon parameter set  SwissProt   SCl1 par might contain SUBCLUSTER 1  so selecting this for a task  would direct searches to sub cluster 1  etc     Database Status    If multiple sub clusters are defined  the database status screen  ms   status exe  only shows one sub cluster at a time  An additional summary  table is shown at the bottom of the page  with links for the other sub   clusters     214 Mascot  Installation and Setup    215       Security       Overview    The security model allows a Mascot administrator to     e Prevent un authorised changes of Mascot server configuration  files using  for example  the database maintenance utility    e Restrict access to results files and sequence databases based on  group and user definitions    e Provide standard    session    support  with time outs  so users do  not need to continually re enter passwords    e Restrict access to Mascot server based utilities that allow dele   tion of searches and other job control functions    e Provide read only access to configuration files for third party  applications without requiring login    e Optionally allow submission of searches etc  for 3rd party appli   cations without a login    e Switch OS platform painlessly if Mascot or Mascot Integra  authentication is used    e Easily set up Mascot Daemon 
61. should only be run if search fulfills criteria for running Percolator.

The title string will be displayed in the search progress while the process is running. This string must not contain a comma.

The command string can include literals and also the following tags, which will be substituted at run time:

Tag                     Replaced with
resultfilepath          Relative path from the cgi directory to the results file
resultfilename          File name part of resultfilepath
percolator_pip          Relative path from the cgi directory to the Percolator input file
percolator_decoy_pop    Relative path from the cgi directory to the Percolator output file for the decoy matches
percolator_target_pop   Relative path from the cgi directory to the Percolator output file for the target matches
session_id              The session identifier of the logged in user when Mascot Security is enabled
task_id                 The task identifier assigned using client.pl when called from client applications
PercolatorExeFlags      The value of PercolatorExeFlags

Paths to executables and any paths included as arguments should use forward slashes and should not include spaces.

FeatureTableLength 30000

If a nucleic acid sequence is longer than 30000 bases, the protein view report will automatically switch to feature table mode and output the matches as a GenBank feature table. The threshold for switching to feature table mode can be altered using the parameter FeatureTableLength in the Options section of mascot.dat o
62. source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. Neither the name of the University of Chicago nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE UNIVERSITY OF CHICAGO AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE UNIVERSITY OF CHICAGO OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

II.

Copyright (c) 1995-1998 The University of Ut
63. the fragment peaks, which may give rise to spurious fragment ion matches. It is usually best if the precursor is removed before the search.

With the default arguments of -1, -1, a smart filter is created. This removes peaks within the fragment ion tolerance window about each of the precursor isotope peaks. The number of isotopes is assumed to be as follows:

Mr             Number
< 1000         3
1000 - 1999    4
2000 - 2999    5
3000 - 3999    6
4000 - 4999    7
5000 - 5999    8
6000 - 6999    9
> 7000         10

So, if the precursor m/z was 800, the charge was 2, and fragment ion tolerance was ± 0.1 Da, the filter would remove 4 notches of width:

m/z 800.0 ± 0.1
m/z 800.5 ± 0.1
m/z 801.0 ± 0.1
m/z 801.5 ± 0.1

At first sight, this may seem a strange mix of m/z and Da. The reason is that we need to avoid matches from 1+ fragment ions, whatever the charge on the precursor.

If the arguments are anything other than -1, -1, a single notch is used, where the first argument is the mass offset of the beginning of the notch and the second value is the mass offset of the end of the notch. For the precursor in the last example, if the arguments were -1, 4 then the notch would run from m/z 799.5 to m/z 802.0. However, if the precursor charge was 1, then the notch would be from m/z 799 to m/z 804.

The mascot.dat setting can be over-ridden in a search by using the search parameter CUTOUT. Note that the peaks removed by this filter are not recorded in the resul
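The notch arithmetic in the worked examples can be checked with a small Python sketch. It is an illustration under stated assumptions, not Mascot's code: the function names are hypothetical, Mr is estimated from m/z using an approximate proton mass, the isotope peaks are assumed to lie 1 Da apart in mass (so 1/charge apart in m/z), and the default arguments are taken to be -1, -1.

```python
PROTON = 1.00728  # approximate proton mass in Da (assumption, used to estimate Mr)


def isotope_count(mr):
    """Number of precursor isotope peaks assumed for a given Mr (3 up to 10)."""
    return min(3 + int(mr // 1000), 10)


def precursor_notches(mz, charge, tolerance, cutout=(-1, -1)):
    """Return the (low, high) m/z windows removed around the precursor."""
    if tuple(cutout) == (-1, -1):
        # Smart filter: one notch of +/- tolerance per assumed isotope peak.
        mr = (mz - PROTON) * charge
        spacing = 1.0 / charge
        return [(mz + i * spacing - tolerance, mz + i * spacing + tolerance)
                for i in range(isotope_count(mr))]
    # Explicit arguments: a single notch, mass offsets converted to m/z.
    start, end = cutout
    return [(mz + start / charge, mz + end / charge)]


print(precursor_notches(800.0, 2, 0.1, (-1, 4)))  # [(799.5, 802.0)]
print(precursor_notches(800.0, 1, 0.1, (-1, 4)))  # [(799.0, 804.0)]
```

With the default arguments, `precursor_notches(800.0, 2, 0.1)` yields four windows centred on m/z 800.0, 800.5, 801.0, and 801.5, matching the example in the text.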
64. the Mascot authentication. A typical case for this might be for a service lab manager running Windows and IIS with integrated authentication. This user would not typically want to create a separate Windows login account for the administrator, but would choose to log in explicitly as administrator to update configuration files etc. For an Apache server, with authentication switched on, most users would want to be set to use the authenticated login.

Users

New users are added using the Mascot security administration utility. There are 6 special "system" user accounts:

guest

The guest user is not enabled by default. If this account is enabled, then any user is automatically logged in as guest, and needs to log in explicitly as another user to gain further access rights. The guest account cannot be deleted, but the account can be disabled. The userid is 1.

Chapter 12: Security

admin

This account should be used to perform administration on the Mascot server. It is recommended that you always log in as administrator to perform security and other administration rather than assign administrator rights to another user. The administrator account cannot be deleted or disabled and the admin user cannot be removed from the administrators group. By default, the administrator can access all the administrator screens, but cannot submit searches. The userid is 2. The initial password for admin is admin, but this must be changed on first login.
65. the add_user.pl script in the mascot bin directory can be used.

Usage: add_user.pl
  -u username
  -p password
  -x password expiry
  -f fullname
  -e email address
  -g group to which user should belong

The password expiry should be 0 for never expires or 1 to force the user to change the password when they first log in.

Resetting the administrator password

If the admin user password is lost, the easiest way to reset it is to re-run "enable_security.pl" from the command line as described above. This will not affect any existing groups or users, but will just reset the password.

User ID

The user ID for each search is saved in the results file. If security is disabled, then the user ID will be set to zero. Special user IDs are listed above. Other users will have automatically assigned IDs starting at 1000.

Basic Regular Expressions

Sequence database parsing in Mascot is defined using rules which conform to Basic Regular Expression (BRE) notation as defined in standard ISO/IEC 9945-2:1993. BRE notation is widely used in Unix, e.g. in the grep command, but it may be less familiar to those from a DOS or Windows background. Man pages containing a rigorous definition of BRE notation can be found on most Unix systems.

The following description is much simplified, and is intended to provide just enough information to understand the existing rules in mascot 
66. the first part of this chapter for information on installing GD.pm.

Step 4: Configuration

[Screenshot: Mascot Installation page (install2.pl). A blue rectangle containing the text "GD OK" in red confirms that the GD library is working; if the picture is missing, the GD library is not functioning correctly. Under "Step 4: Configuration", two options are offered, "Configure Mascot for single server operation" and "Configure Mascot for master node of a cluster", followed by a "Configure Mascot" button]

Indicate whether you plan to configure Mascot as a single (SMP) server or a cluster and choose "Configure Mascot". If this is a version upgrade, the main configuration file, mascot.dat, will be updated. If it is a clean install, a new mascot.dat will be created.

Chapter 2: Installation, Linux

Step 5: Start Mascot Monitor

[Screenshot: Mascot Installation page, "Step 5: Start Mascot Monitor". It instructs you, as root, to change to the bin directory of the Mascot installation and run ms-monitor.exe, then to follow the link to the Mascot Database Status page. Unless there is already a valid licence file in place for this version of Mascot, you will be asked to register your product key. The system will be ready for searches when all databases show as "In Use"]
67. this section if you ever use Database Manager.

The WWW section defines where CGI scripts look for the information needed to compile a results report.

At least one line is required for each database, to define the source from which the sequence string of a database entry can be obtained. A second line can optionally define the source from which the full text report of an entry can be obtained. The syntax is very similar in both cases, independent of whether the information originates locally or on a remote system.

Sequence strings can always be retrieved locally, because the FASTA file must be present on a local disk. The Mascot utility ms-getseq.exe is normally used to retrieve a sequence string.

If full text for an entry is available locally and the database has been defined as including a ref file (Column 10 in the Database section of mascot.dat), ms-getseq.exe can be used to retrieve the full text. Otherwise, a utility or URL must be identified which can accept an accession string and return the report text in a parseable format. An example of a suitable external URL for full annotation text is shown in the example for Trembl, below.

Each line in the WWW section contains 5 columns:

    WWW
    Trembl_SEQ "8" "localhost" "80" "c:/inetpub/mascot/x-cgi/ms-getseq.exe Trembl #ACCESSION# seq"
    Trembl_REP "23" "www.uniprot.org" "80" "/uniprot/#ACCESSION#.txt"
    end
68. to main status page

From which, links allow details of any specific search to be displayed.

[Screenshot: Mascot Job status page for job 1276 (database SwissProt, process ID 6547, task ID 0, user ID 0, search title "MS/MS Example", 27% complete, start time Sun Apr 22 13:42:04 2012, job status Searching, IP address 192.168.8.3), with links to change the priority, to Kill, Pause or Resume the job, and to go back to the main status page.]

Status can also be used to print Mascot configuration and result files to STDOUT. This provides a method to display these files in a browser. For example:

    http://your_server/mascot/x-cgi/ms-status.exe?Show=MS_ENZYMES

where the argument to Show determines the file to be displayed:

    MS_ENZYMES               enzymes
    MS_FRAGMENTATIONRULES    fragmentation_rules
    MS_MASCOT_DAT            mascot.dat
    MS_MASSES                masses
    MS_MOD_FILE              mod_file
    MS_QUANTITATIONXML       quantitation.xml
    MS_SUBSTITUTIONS         substitutions
    MS_TAXONOMY              taxonomy
    MS_UNIMODXML             unimod.xml
    MS_USERS                 users.xml

The above files are all displayed as plain text, without any formatting. If Show=RESULTFILE, then a results file from any directory under mascot/data can be returned, with HTML forma
69. to the list of Inbound Rules.

Windows 7

On each search node, log in as a user with local administrator rights. Go to Control Panel, Network and Sharing Center and ensure the network connection to the master node is described as Work. If it shows as Public, click on the hyperlink to change it. Choose Change Advanced sharing settings and ensure File and Printer Sharing is enabled.

[Screenshot: Windows 7 Control Panel, Network and Sharing Center, showing the active network "Network 2" as a Work network on Local Area Connection, with the "Change advanced sharing settings" link in the left-hand pane.]
70. wish to avoid the danger that redistributors of a free program will individually obtain patent licenses, in effect making the program proprietary. To prevent this, we have made it clear that any patent must be licensed for everyone's free use or not licensed at all.

The precise terms and conditions for copying, distribution and modification follow.

GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION

0. This License applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any derivative work under copyright law: that is to say, a work containing the Program or a portion of it, either verbatim or with modifications and/or translated into another language. (Hereinafter, translation is included without limitation in the term "modification".) Each licensee is addressed as "you".

Activities other than copying, distribution and modification are not covered by this License; they are outside its scope. The act of running the Program is not restricted, and the output from the Program is covered only if its contents constitute a work based on the Program (independent of having been made by running the Program). Whether that is true depends on what the Program does.

1.
71.  you will need to change the  shebang lines of all scripts to something similar to       c  perl bin perl exe    User authentication     Apache provides several ways to restrict access to directories or files   One method is to limit access to clients from a range of IP addresses or a  particular domain  Another method is to require a username and pass   word  which may be a convenient way for a system administrator to limit  access to the x cgi directory     Setting up user authentication takes two steps  firstly  creating a file   containing the usernames and passwords  Secondly  telling the server  what resources are to be protected and which users are allowed  after  entering a valid password  to access them     Creating a User Database    A list of users and passwords needs to be created in a file  For security  reasons  this file should not be under the document root  This example  assumes the file is called  usr local mascot config  passwd     The file will consist of a list of usernames and a password for each  The  format is similar to the standard Unix password file  with the username  and password being separated by a colon  However you cannot just type  in the usernames and passwords because the passwords are stored in an  encrypted format     238 Mascot  Installation and Setup    The program htpasswd is used to add create a user file and to add or  modify users  This can be found in the bin directory of the Apache  distribution  To create a new user file and add 
72. 0-23), day of the month (1-31), month of the year (1-12), day of the week (0-6 with 0=Sunday). Each of these patterns may be an asterisk (meaning all legal values), a range of integers or a list of comma separated integers.

An element is either a number or two numbers separated by a minus sign (meaning an inclusive range). Note that days may be specified in two different ways (day of the month and day of the week). If both are specified as a list of elements, both are adhered to. For example:

    0 0 1,15 * 1

would run a command on the first and fifteenth of each month, as well as on every Monday. To specify days by only one field, the other field should be set to * (for example, 0 0 * * 1 would run a command only on Mondays).

The sixth field is a string that is executed by the shell (command prompt) at the specified times. The string must be on a single line. The entire string, up to the end of the line, is passed to the command prompt for execution. The part of the string up to the first space must be the fully qualified path to an executable. The remainder of the line will be passed to the command as parameters.

Log files

Mascot maintains several log files, which are described below. When troubleshooting, it can also be useful to inspect the web server log files. Errors in Perl scripts, for example, will appear in the web server error log, not the Mascot error log.

Error Log

All errors are logg
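The day-matching rule described in this chunk (when both day fields are restricted, a match on either one runs the command) can be sketched in Python. This is a minimal illustration under stated assumptions, not Mascot's own scheduler: the function names parse_field and matches are hypothetical, and the sketch assumes the first two fields are minute (0-59) and hour (0-23), which this chunk does not show in full.

```python
from datetime import datetime


def parse_field(text, low, high):
    """Expand one schedule field into the set of integers it allows."""
    if text == "*":
        return set(range(low, high + 1))
    values = set()
    for element in text.split(","):
        if "-" in element:
            # Two numbers separated by a minus sign: an inclusive range.
            start, end = element.split("-")
            values.update(range(int(start), int(end) + 1))
        else:
            values.add(int(element))
    return values


def matches(schedule, when):
    """Return True if the five-field schedule string fires at datetime `when`.

    Assumed field order: minute, hour, day of month, month, day of week.
    """
    minute, hour, dom, month, dow = schedule.split()
    if when.minute not in parse_field(minute, 0, 59):
        return False
    if when.hour not in parse_field(hour, 0, 23):
        return False
    if when.month not in parse_field(month, 1, 12):
        return False
    dom_ok = when.day in parse_field(dom, 1, 31)
    # Python's weekday() has Monday=0; the schedule uses Sunday=0.
    dow_ok = (when.weekday() + 1) % 7 in parse_field(dow, 0, 6)
    if dom != "*" and dow != "*":
        # Both day fields restricted: either one is enough.
        return dom_ok or dow_ok
    return dom_ok and dow_ok


if __name__ == "__main__":
    # 15 April 2012 was a Sunday, 23 April 2012 a Monday.
    print(matches("0 0 1,15 * 1", datetime(2012, 4, 15, 0, 0)))
    print(matches("0 0 1,15 * 1", datetime(2012, 4, 23, 0, 0)))
    print(matches("0 0 * * 1", datetime(2012, 4, 15, 0, 0)))
```

Treating the two day fields as alternatives only when neither is an asterisk is what makes "first and fifteenth, as well as every Monday" and "only on Mondays" both expressible.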
73. <msgt:accession_str>RL19_YEAST</msgt:accession_str>
<msgt:taxonomy>
  <msgt:db>SwissProt</msgt:db>
  <msgt:taxonomy_id>4932</msgt:taxonomy_id>
  <msgt:scientific_name>Saccharomyces cerevisiae</msgt:scientific_name>
  <msgt:translation_table_id>1</msgt:translation_table_id>
  <msgt:common_names>
    <msgt:synonym>Candida robusta</msgt:synonym>
    <msgt:synonym>Saccaromyces cerevisiae</msgt:synonym>
    <msgt:synonym>Saccharomyces capensis</msgt:synonym>
    <msgt:synonym>Saccharomyces italicus</msgt:synonym>
    <msgt:synonym>Saccharomyces oviformis</msgt:synonym>
    <msgt:synonym>Saccharomyces uvarum var. melibiosus</msgt:synonym>
    <msgt:synonym>Saccharomyes cerevisiae</msgt:synonym>
    <msgt:synonym>Sccharomyces cerevisiae</msgt:synonym>
    <msgt:synonym>YEAST</msgt:synonym>
    <msgt:synonym>baker&apos;s yeast</msgt:synonym>
    <msgt:synonym>brewer&apos;s yeast</msgt:synonym>
    <msgt:synonym>lager beer yeast</msgt:synonym>
    <msgt:synonym>yeast</msgt:synonym>
  </msgt:common_names>
  <msgt:tree>
    <msgt:node level="12">Saccharomyces cerevisiae</msgt:node>
    <msgt:node level="11">Saccharomyces</msgt:node>
    <msgt:node level="10">Saccharomycetaceae</msgt:no
74. 0 residues, position 0 would indicate the amino terminus and position 11 would indicate the carboxy terminus. If there is no location information, the range is output as 0 256.

hn_qm_terms shows the residues that bracket the peptide in the protein. If the peptide forms the terminus of the protein, then a hyphen is used instead.

hn_qm_subst is output when the matched peptide contained an ambiguous residue (B, X, or Z). The argument is one or more triplets of comma separated values. For each triplet, the first value is the residue position, the second is the ambiguous residue, and the third is the residue that has been substituted to obtain the reported match.

For a large MS/MS search, num_hits is set to zero, and the summary block only contains entries for qmassn, qexpn, qmatchn, and qplugholen. The threshold for switching to this mode is specified using two parameters in the Options section of mascot.dat. SplitDataFileSize is the size of the search process in bytes (default 10000000), and SplitNumberOfQueries is the size of the search in queries (default 1000).

If this is a two pass search, either an automatic decoy database search or an automatic error tolerant search, a second summary block appears, containing the second set of results. The section name is either et_summary or decoy_summary. The syntax of the contents is identical.

Mixture

    --gc0p4Jq0M2Yt084jU534c0p
    Content-Type: application/x-Mascot; name="mixture"

num_hits  number of
75. 1 is contained in the HTML help pages. Choose Help from the Mascot main menu bar and then choose Quantitation.

Database Manager

Database Manager is mainly described in the HTML help pages. Choose Help from the Mascot main menu bar and then choose Sequence Database Setup, Database Manager.

Configuration Options

This is a simple interface to the Options section of mascot.dat, which contains a variety of global settings. Reference material can be found below.

mascot.dat

Two sections of mascot.dat, Processors and Cluster, have no interface in either Database Manager or Configuration Options, and the only way to make changes is to edit mascot.dat.

Windows users should note that the path delimiters used in mascot.dat must always be forward slashes, never the backward slashes used at the command prompt. If sequence database files are not on a local disk drive, the remote drive must be mapped to a local drive letter. UNC path specifications cannot be used. Finally, spaces are not allowed in file or directory names. Hence:

    C:/InetPub/mascot/config/mascot.dat                correct
    C:\InetPub\mascot\config\mascot.dat                wrong
    //matrix_nt_01/InetPub/mascot/config/mascot.dat    wrong
    \\matrix_nt_01\InetPub\mascot\config\mascot.dat    wrong

General

mascot.dat is divided into sections. Each section starts with a unique keyword and ends with the keyword "end".

Comments and blank lines can be used
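The three path rules stated in this chunk for mascot.dat (forward slashes only, no UNC paths, no spaces) are easy to check mechanically. The following is a minimal sketch in Python; check_mascot_path is a hypothetical helper written for illustration and is not part of Mascot.

```python
def check_mascot_path(path):
    """Return a list of problems with a path intended for mascot.dat.

    An empty list means the path satisfies the three rules in the text.
    """
    problems = []
    if "\\" in path:
        problems.append("uses backslashes; use forward slashes")
    if path.startswith("//") or path.startswith("\\\\"):
        problems.append("UNC path; map the share to a local drive letter")
    if " " in path:
        problems.append("contains a space")
    return problems


if __name__ == "__main__":
    for candidate in (
        "C:/InetPub/mascot/config/mascot.dat",
        "C:\\InetPub\\mascot\\config\\mascot.dat",
        "//matrix_nt_01/InetPub/mascot/config/mascot.dat",
    ):
        print(candidate, "->", check_mascot_path(candidate) or "ok")
```

Returning a list rather than a single boolean lets one path report several violations at once, for example a UNC path written with backslashes.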
76. 1. Identifier: An identifier constructed from the name of the database, an underscore character, and either the keyword SEQ or REP. Thus, Trembl_SEQ is the source for the sequence string of an entry in the database called Trembl.

2. Parse rule: The index of a rule in the PARSE section that can be used to extract the information required. Note that the rule for parsing a sequence string from ms-getseq.exe is the same for all databases.

3. Host: The information source. For ms-getseq.exe or a similar local executable, this column should contain localhost. For a remote source, or a local source that will be queried as a CGI application, enter the hostname. (NB: the word localhost is used to determine whether the application is a command line executable or a CGI application. If you want to specify a CGI application on the local server, just specify the hostname in some other way, for example 127.0.0.1.)

4. Port: The port number. This should be left at 80 unless another value is required to access a web server operating on a non-default port.

5. Path: A string containing the path to the executable and parameters, some of which are variables.

In the case of a command line executable, the parameters will generally be delimited by spaces. In the case of a CGI application, the parameters may be delimited from the executable by a question mark, and there must be no spaces within the parameter string. In general, spaces in URLs must be replaced by plus sy
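The five columns described in this chunk can be read with a few lines of Python. This is a sketch under stated assumptions rather than Mascot's own parser: it assumes that columns 2 to 5 are enclosed in double quotes and that the accession placeholder is written #ACCESSION#, neither of which is visible in this extraction, and the names WwwEntry, parse_www_line and is_command_line are hypothetical.

```python
import shlex
from typing import NamedTuple


class WwwEntry(NamedTuple):
    database: str    # database name, e.g. "Trembl"
    kind: str        # "SEQ" (sequence string) or "REP" (full text report)
    parse_rule: int  # index of a rule in the PARSE section
    host: str
    port: int
    path: str        # executable or URL path plus parameters


def parse_www_line(line):
    """Split one WWW-section line into its five columns (assumed quoted)."""
    identifier, rule, host, port, path = shlex.split(line)
    database, _, kind = identifier.rpartition("_")
    if kind not in ("SEQ", "REP") or not database:
        raise ValueError(f"identifier must end in _SEQ or _REP: {identifier!r}")
    return WwwEntry(database, kind, int(rule), host, int(port), path)


def is_command_line(entry):
    """Per the text, the literal word 'localhost' marks a command line executable."""
    return entry.host == "localhost"


if __name__ == "__main__":
    seq = parse_www_line(
        'Trembl_SEQ "8" "localhost" "80" '
        '"c:/inetpub/mascot/x-cgi/ms-getseq.exe Trembl #ACCESSION# seq"'
    )
    print(seq.database, seq.kind, seq.parse_rule, is_command_line(seq))
```

Using rpartition on the last underscore keeps database names that themselves contain underscores intact, and the host test mirrors the rule that 127.0.0.1 is treated as a CGI source even though it is the local machine.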
77. 100  longest Longest sequence matched ions  reported separately    for each ion series as with    fracIonsMatched     backbone only      142 Mascot  Installation and Setup    fracIonsMatched Fraction of calculated ions matched  reported sepa   rately for each ion series  with NLs lumped to   gether  e g  fracIonsMatchedB1   fracIonsMatchedBlderiv  fracIonsMatchedB2   fracIonsMatchedB2deriv   matchedIntensity Matched ion intensity  reported separately for each  ion series  as with fraclIonsMatched  numUnigPeps   The excess of the number of unique peptide matches   unique primary sequence  over the number of  matches expected by chance  qmatch The number of peptide matches for which an ms ms  match was attempted  peptide The peptide string that was matched  proteins A tab separated list of accessions of proteins that  contain this peptide  Must be last        indicates that this feature is not implemented    Error codes  Return Description    1 Invalid parameters  Use    help for help  Missing or invalid mascot dat  Error   No MS MS spectra in results file    Automatic decoy search not enabled    2  3  4   5 Insufficient number of queries   6 Insufficient number of sequences searched   7 Cannot read the results file  Error    8 Failed to create output file     9 Invalid feature in mascot dat options      10 Invalid feature for  a option      11 Invalid feature for  r option   Miscellaneous Utilities    Service    Supplied for Windows only  This application shows the status of t
78. 17:30:33 2012 - Creating compress file /usr/local/mascot/sequence/SwissProt/current/SwissProt_2012_03.a00
Fri Apr 20 17:30:33 2012 - Creating compress file /usr/local/mascot/sequence/SwissProt/current/SwissProt_2012_03.t00
Fri Apr 20 17:30:33 2012 - Creating compress file /usr/local/mascot/sequence/SwissProt/current/SwissProt_2012_03.stats
Fri Apr 20 17:32:26 2012 - SwissProt0 Creating compressed files to Finished compressing files
Fri Apr 20 17:32:26 2012 - SwissProt0 Finished compressing files to Running 1st test
Fri Apr 20 17:32:33 2012 - SwissProt0 Running 1st test to First test just run OK
Fri Apr 20 17:32:33 2012 - SwissProt0 First test just run OK to Waiting for other DB to end
Fri Apr 20 17:32:33 2012 - SwissProt0 Waiting for other DB to end to Trying to memory map files
Fri Apr 20 17:32:33 2012 - SwissProt0 Trying to memory map files to Just enabled memory mapping
Fri Apr 20 17:32:33 2012 - SwissProt0 Just enabled memory mapping to In use

IPC Log

In cluster mode only, an interprocess communication log can be enabled by setting IPCLogging, in the cluster section of mascot.dat, to 1 or 2. This log can be used to investigate communications errors at the socket level.

Program Reference

Mascot implements a client/server architecture using the HTTP protocol (web server / web browser). In this mode, the search
79. [Printed peptide summary table from the validation search report (pages 2 and 3 of 3, printed 21/04/2012); the numeric columns are not recoverable from this extraction.]

Search Parameters

    Type of search            : MS/MS Ion Search
    Enzyme                    : Trypsin/P
    Fixed modifications       : Carbamidomethyl (C)
    Variable modifications    : Oxidation (M)
    Mass values               : Monoisotopic
    Protein Mass              : Unrestricted
    Peptide Mass Tolerance    : ± 100 ppm
    Fragment Mass Tolerance   : ± 0.1 Da
    Max Missed Cleavages      : 4
    Instrument type           : ESI-QUAD-TOF
    Number of queries         : 67

Mascot: http://www.matrixscience.com/

Sequence Database Setup

Sequence database URLs and formats change constantly. Provided your Mascot Server can connect to the Internet, Mascot Database Manager will keep database definitions up to date automatically for many popular public databases. For each database, you can confi
80. [Screenshot: browser window showing the Mascot search status page (ms-status.exe) with the message: "Fatal error: failed to initialise the memory map. Please try the following. Check that the Mascot service is running: from the start menu, choose Programs, Mascot, config, Show Mascot Service Status. If the screen says that the Mascot service is NOT running, then start the service: from the start menu, choose Programs, Mascot, config, Start Mascot Service. Try refreshing this screen."]

There are several possible causes.

Service not started

Since one of the first things that the Monitor service does is to create the memory mapped file, this could indicate that the service has not started. You can tell whether the service has started by choosing Start, Programs, mascot, config, Show Mascot ms-monitor service status.

If the service is not running, check the monitor.log and errorlog.txt files in the logs directory. If there is nothing in those files, then it may be necessary to try to run ms-monitor.exe as a command line executable. You should only do this if the Mascot service is not running. To do this, open a command prompt window, and change directory to the mascot bin directory. If your installation path was the default, you will need to type:

    cd \Inetpub\mascot\bin

next start the m
81. 76190 beta actin Trichosurus vulpecula

Other parameters are possible for ms-gettaxonomy.exe; see the reference section "ms-gettaxonomy" in Chapter 7.

Common Questions

Why do I sometimes get results for a species I didn't specify?

Sometimes, when specifying (for example) "Human" species, the results may appear at first sight to be from, say, a Mouse sample. The most common reason for this is that, for a non-redundant database, exactly the same sequence has been found in many species. To check this, look at the protein view, where you should see at least one entry for the species you selected.

What are the "unclassified" and "other" species?

The NCBI cannot always classify every sequence, either because no species information was supplied with the data or because it doesn't fit into any current classification. There are about 1500 such sequences in the NCBI nr database. "Other" species include plasmids and artificial sequences.

How do I see which sequences Mascot couldn't assign a taxonomy ID?

In the status screen, click on the "Unidentified taxonomy" link. This will show sequences where one or more of the species names were not identified by Mascot.

Why do I get the message "Taxonomy 'xxx' ignored. No taxonomy indexes for this database"?

Check the following:

In the mascot.dat file, is parameter 14 for the problem database a valid number and
82. [Screenshot: node settings dialog, showing Port number 5001 and Number of processors to use on this node 2.]

Use the Browse button to ensure that the UNC path to the node is correct. If the machines are in a Windows domain, and the remote drive is not explicitly shared, you can enter C$ for drive C, etc., to use the administrative shares. If the base directory does not exist, create it using the "Make New Folder" button. The recommended base folder name is MascotNode.

Ensure that the local path to the MascotNode directory matches the UNC path. This must be a local or mapped drive on the node so that the path can be specified using a drive letter. The dialog will try to guess the local path from the UNC path, but it may get it wrong. Ensure that this path is correct before pressing OK.

It is not necessary to fill in the Host name and IP Address fields unless the node is a multi-homed system and it is necessary to define which network interface will be used for communication with the Mascot master.

The default port number for cluster communication is 5001. If there are conflicts, this can be changed.

The number of processors must be specified. The total number of processors specified for all nodes can be greater than the number of processors in the Mascot licence. The surplus processors will then behave as "hot spares", to be swapped into the cluster as required if there is a hardware problem on another node.

NOTE: If the mas
83. Error Messages

A complete listing of Mascot error codes, messages, and explanations can be found at the URL mascot/cgi/ms-geterror.exe?ALL

[Screenshot: browser window showing the Mascot Errors page, with the entry for M00027 as an example. Message: "Sorry, the database (database name) is not currently available for searching." Further help: only databases listed in mascot.dat can be searched. It is possible that there was an error when the database was coming on-line. To check this and to see the status of all databases, look at the status screen (there is a link to it from the home page on the intranet versions). To add a new database, see the "Sequence Database Setup" chapter. Actions: show message to end user; terminate search; message put into errorlog.txt file; message not put into monitor log file; message not emailed to the Mascot administrator; message not put into the search results file.]

The same text can also be found in the file errors.html in the root directory of the Mascot CD-ROM.

System Limits

    Number of different modifications in unimod.xml    unlimited
    Number of enzymatic peptides per sequence          user definable, MaxNumPep
84. ARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

END OF TERMS AND CONDITIONS

bzip2

This program, "bzip2", the associated library "libbzip2", and all documentation, are copyright (C) 1996-2005 Julian R Seward. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this sof
85. BR gt    Siete ioe 60  complete lt BR gt    Lisa is 70  complete lt BR gt     B ee Gs 90  complete lt BR gt    271397 sequences and 86500527 residues checked  lt BR gt    lt SCRIPT LANGUAGE  JavaScript    gt      lt     Begin hiding Javascript from old browsers    if  window navigator userAgent indexOf     MSIE         1     window  location replace       cgi master_results pl file         amp     data 20090312 F002642 dat           else if  window location replace    null       window  location assign     cgi master_results pl file       amp     data 20090312 F002642 dat            else    window  location replace       cgi master_results pl file       amp     data 20090312 F002642 dat              End hiding Javascript from old browsers     gt    lt  SCRIPT gt    lt NOSCRIPT gt      lt A HREF     cgi master_results pl file    data 20090312 F002642 dat  gt       amp  Click here to see Search Report lt  A gt     lt  NOSCRIPT gt     lt  BODY gt  lt  HTML gt     The executable called nph mascot1 exe is for Mascot TD     BIG    Mas   cot   where the precursor mass limit of 16 kDa has been removed  It will  only be used for searches if enabled in the licence     112 Mascot  Installation and Setup    Monitor    The primary function of Mascot Monitor  bin ms monitor  exe  is to  manage the sequence databases  Monitor must be running in order for  the search engine to execute  Under Linux this runs as a daemon  and  under Windows this runs as a service     Monitor does the following
86. COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

C Clustering Library

The C clustering library.
Copyright (C) 2002 Michiel Jan Laurens de Hoon.

This library was written at the Laboratory of DNA Information Analysis, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108-8639, Japan.

Contact: mdehoon 'AT' gsc.riken.jp

Permission to use, copy, modify, and distribute this software and its documentation with or without modifications and for any purpose and without fee is hereby granted, provided that any copyright notices appear in all copies and that both those copyright notices and this permission notice appear in supporting documentation, and that the names of
87. Default Port    Sets the default port number to be used when this parameter is missing  from nodelist txt  Recommended default is 5001    UseCompleteDatabase    Not used  If specified  must be set to 1     206 Mascot  Installation and Setup                                             nodelist txt    This file is used to define the nodes that belong to the cluster  For a very  large cluster  it is advisable to define a few percent of additional nodes as     spares     For example  if 51 nodes with 102 processors were available  and  Mascot was configured to use 2 sub clusters  each of 50 processors  the  node with the 2 spare processors could be used to replace a failed node  automatically     Cluster node definitions    Each line begins with the word Node  followed by a space and  then a comma delimited list of configuration parameters   ip address port  computer  host  name  maximum number of node CPU   s to be used  operating system  local path to home directory  home directory as seen from master  specify for NT master only     Node 10 0 0 1 5001  searchOl  2  Windows NT  c  MascotNode          search01 c  MascotNode  Node 10 0 0 2 5001  search02  2  Windows NT  c  MascotNode    amp    search02 c  MascotNode  Node 10 0 0 3 5001  search03  2  Windows NT  c  MascotNode       search03 c  MascotNode  Node 10 0 0 4 5001  search04  2  Windows NT  c  MascotNode       searchod4 c  MascotNode  Node 10 0 0 5 5001  search0O5  2  Windows NT  c  MascotNode       search05 c  MascotNode 
88. ENHANCEMENTS, OR MODIFICATIONS.

Linux glibc: section 6b applies.

Appendix G: GNU Lesser General Public License

Version 2.1, February 1999

Copyright (C) 1991, 1999 Free Software Foundation, Inc.
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

[This is the first released version of the Lesser GPL. It also counts as the successor of the GNU Library Public License, version 2, hence the version number 2.1.]

G.0.1 Preamble

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public Licenses are intended to guarantee your freedom to share and change free software, to make sure the software is free for all its users.

This license, the Lesser General Public License, applies to some specially designated software (typically libraries) of the Free Software Foundation and other authors who decide to use it. You can use it too, but we suggest you first think carefully about whether this license or the ordinary General Public License is the better strategy to use in any particular case, based on the explanations below.

When we speak of free software, we are referring to freedom of use, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and ch
89. [Screenshot residue: table of peptide matches from a Mascot search report, listing query number, observed and calculated masses, score, expect value and peptide sequence, with several matches carrying Oxidation (M). The individual rows are not recoverable from the extracted text.]
90. H3 if b significant and fragment includes RKNQ
10 = b - H2O if b significant and fragment includes STED
13 = y series
14 = y - NH3 if y significant and fragment includes RKNQ
15 = y - H2O if y significant and fragment includes STED
17 = internal yb < 700 Da
18 = internal ya < 700 Da

minInternalMass 200
maxInternalMass 1000
*

Quantitation

[Screenshot: Mascot Configuration, Quantitation Methods page]

Quantitation Methods

    Name                               Protocol
    None                               null
    iTRAQ 4plex                        reporter
    ICAT ABI Cleavable [MD]            precursor
    ICPL duplex pre-digest [MD]        precursor
    ICPL duplex post-digest [MD]       precursor
    SILAC K+6 R+10 [MD]                precursor
    18O corrected [MD]                 precursor
    18O corrected multiplex            multiplex
    SILAC K+6 R+6 multiplex            multiplex
    15N Metabolic [MD]                 precursor

    New quantitation method

A detailed description of quantitation methods, the relevant Configuration Editor pages, and the underlying file, quantitation.xm
91. Homo sapiens>Homo>Hominidae>Catarrhini>Primates>Eutheria>Theria>Mammalia>Amniota>Tetrapoda>Sarcopterygii>Euteleostomi>Teleostomi>Gnathostomata>Vertebrata>Craniata>Chordata>Deuterostomia>Coelomata>Bilateria>Eumetazoa>Metazoa>Fungi/Metazoa group>eukaryote crown group>Eukaryota>cellular organisms>root

AAA35732 Homo sapiens (human, man)
Homo sapiens>Homo>Hominidae>Catarrhini>Primates>Eutheria>Theria>Mammalia>Amniota>Tetrapoda>Sarcopterygii>Euteleostomi>Teleostomi>Gnathostomata>Vertebrata>Craniata>Chordata>Deuterostomia>Coelomata>Bilateria>Eumetazoa>Metazoa>Fungi/Metazoa group>eukaryote crown group>Eukaryota>cellular organisms>root

Description lines

>CCHU cytochrome c - human

Parameters                  Returns

1  database, accession      Space separated list of accession string, tax_ID number, and scientific species name. Where a database entry represents multiple accessions, this information is repeated for each accession. Plain formatted.

2  database, accession
3  database, accession
4  database, accession
5  database, tax_ID
6  database, tax_ID
7  database, tax_ID
8  database, species
9  database, accession

Batch mode

Request format

Space separated pair of accession string and scientific species name. Where
92. ITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Except as contained in this notice, the name of a copyright holder shall not be used in advertising or otherwise to promote the sale, use or other dealings in this Software without prior written authorization of the copyright holder.

gzip, ht://Dig, cksum, touch, libstdc++

GNU GENERAL PUBLIC LICENSE
Version 2, June 1991

Copyright (C) 1989, 1991 Free Software Foundation, Inc.
675 Mass Ave, Cambridge, MA 02139, USA
Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The licenses for most software are designed to take away your freedom to share and change it. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change free software--to make sure the software is free for all its users. This General Public License applies to most of the Free Software Foundation's software and to any other program whose authors commit to using it. (Some other Free Software Foundation software is covered by the GNU Library General Public License instead.) You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to d
93. MATRIX SCIENCE

Mascot

Installation and Setup

© 2012 Matrix Science Ltd. All rights reserved.

The information contained in this publication is for reference purposes only and is subject to change at any time. Every effort has been made to supply complete and accurate information. However, Matrix Science Ltd. assumes no responsibility and will not be liable for any consequential loss or damages that might result from the use of this manual or from any errors or omissions in the information contained herein.

No part of this document may be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without the express written permission of Matrix Science Ltd.

Mascot is a trade mark of Matrix Science Ltd. All third party trade marks and service marks referred to in this publication are hereby acknowledged.

Matrix Science Limited
64 Baker Street
London W1U 7GB
UK
Phone: +44 (0)20 7486 1050
Fax: +44 (0)20 7224 1344
Email: info@matrixscience.com
WWW: http://www.matrixscience.com

April 2012
Revision 2.4.0

End User Licence Agreements

MASCOT PROTEIN IDENTIFICATION SYSTEM

END USER LICENCE AGREEMENT

IMPORTANT - PLEASE READ CAREFULLY: This End User Licence Agreement is a legally binding contract between you (either an individual or a single corporate entity) and Matrix Science Limited for the product identified above, which includes computer software, electronic documentation, printed d
94. May need to be increased for very large searches.

6. Stack space. Not normally an issue for executables or any of the perl scripts.

7. Thread stack space. Not normally an issue for executables. The perl scripts are not threaded.

File size limits

This is normally unlimited, but a limit may have been configured (e.g. /etc/security/limits.conf).

You should manually verify that your system can successfully FTP a file larger than 2 GB, as FTP doesn't necessarily report an error when it fails.

How the errors are reported

If the Mascot executables report a memory error, the error can be found in the errorlog.txt file, including the error code returned by the operating system. For a Perl script running in CGI mode, the web server may just kill the job, and no error will be logged.

Determining what the limits are

Most systems have two sets of limits: the current limits and the hard limits.

There is no standard Unix command across all platforms, although "limit" or "limits" will work on most systems from the "C" shell.

ulimit -a

    cputime         unlimited
    filesize        unlimited
    datasize        1048576 kbytes
    stacksize       65536 kbytes
    coredumpsize    unlimited
    memoryuse       250004 kbytes
    descriptors     200
    vmemory         1048576 kbytes
    threads         1024

ulimit -aH

    cputime         unlimited
    filesize        unlimited
    datasize        1048576 kbytes
    stacksize       524288 kbytes
    coredumpsize    unlimited
    memoryuse       524288 kbytes
    descriptors
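Where the shell commands differ between platforms, the current and hard limits can also be read programmatically. The following is a minimal sketch using the standard Python resource module (Unix only). It is not part of Mascot; the function names and the selection of limits are illustrative assumptions. Note that Python reports sizes in bytes rather than kbytes.

```python
import resource

# Limits to report, labelled like the shell listings.
# The selection is an illustrative assumption; extend it as required.
LIMITS = {
    "cputime": resource.RLIMIT_CPU,
    "filesize": resource.RLIMIT_FSIZE,
    "datasize": resource.RLIMIT_DATA,
    "stacksize": resource.RLIMIT_STACK,
    "coredumpsize": resource.RLIMIT_CORE,
    "descriptors": resource.RLIMIT_NOFILE,
}


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_limits():
    """Return {label: (current, hard)} for the calling process."""
    return {name: resource.getrlimit(res) for name, res in LIMITS.items()}


if __name__ == "__main__":
    for name, (current, hard) in read_limits().items():
        print(f"{name:14s} current={fmt(current):>12s} hard={fmt(hard):>12s}")
```

Running the script as the user that owns the web server or the Mascot executables shows the limits that actually apply to those processes, which may differ from the limits of an interactive login.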
95. NECESSARY SERVICING, REPAIR OR CORRECTION.

16. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE LIBRARY AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE LIBRARY (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE LIBRARY TO OPERATE WITH ANY OTHER SOFTWARE), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

G.0.2 How to Apply These Terms to Your New Libraries

If you develop a new library, and you want it to be of the greatest possible use to the public, we recommend making it free software that everyone can redistribute and change. You can do so by permitting redistribution under these terms (or, alternatively, under the terms of the ordinary General Public License).

To apply these terms, attach the following notices to the library. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.

<one line to give the library's name and an idea of what it does.>
Copyright (C) <year> <name of author>

This library is free software; you can redistribute
96. [Screenshot residue: protein view report with the peptide list sorted by residue number, showing Start - End, Mr, Miss and Sequence columns]

The default behaviour is for this to link to the NCBI taxonomy browser. For non-redundant databases with more than one species source per sequence, there will be a list of the species, each with a link. For the NCBInr database, a separate "gi" number will be shown for each database entry, with a link to Entrez and the NCBI Taxonomy browser for each entry.

If security and confidentiality protocols make this unacceptable for your installation, then change the entry in the Options section of the mascot.dat file from:

TaxBrowserUrl http://www.ncbi.nlm.nih.gov/htbin-post/Taxonomy/wgetorg?lvl=0&lin=f&id=#TAXID#

to:

TaxBrowserUrl ../x-cgi/ms-gettaxonomy.exe?4+#DATABASE#+#ACCESSION#

In this case, the link will display the information in the following format:

Taxonomy for gi|4501885

gi|4501885 Unknown species

gi|113270 Homo sapiens (human, man, HUMAN, 103D, 10GS, 11GS, 121P, 12CA, 12GS, 133L)
Homo sapiens>Homo>Hominidae>Catarrhini>Primates>Eutheria>Theria>Mammalia>Amniota>Tetrapoda>lobe-finned fish and tetrapod clade>bony vertebrates>Gnathostomata>Vertebrata>Craniat
97. SEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

This software consists of voluntary contributions made by many individuals on behalf of the Apache Software Foundation and was originally based on software copyright (c) 1999, International Business Machines, Inc., http://www.ibm.com. For more information on the Apache Software Foundation, please see <http://www.apache.org/>.

Curl

COPYRIGHT AND PERMISSION NOTICE

Copyright (c) 1996 - 2003, Daniel Stenberg, <daniel@haxx.se>.

All rights reserved.

Permission to use, copy, modify, and distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT OF THIRD PARTY RIGHTS. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABIL
98. SearchControlSaveE 0

Obsolete.

SearchLogFile see ErrorLogFile
SendmailPath see EmailErrorsEnabled

SelectSwitch 1000

If the number of queries in an MS/MS search is less than or equal to this number, the default report is the Peptide Summary. If it is greater than this number, the default report is the Select Summary.

SeparateLockMem 0

Only required for 32-bit versions if the total amount of memory to be locked is greater than 2 Gb (or lower if some system limit is set). Setting this value to 1 indicates that ms-monitor.exe will run a separate program, ms-lockmem.exe, that will lock the memory blocks. A value greater than 1 specifies the block size in Mb. For example, if there is a 1.5 Gb *.s00 file, and this parameter is set to 750, then two instances of ms-lockmem.exe will be run.

ShowAllFromErrorTolerant 0

ShowSubSets 0

Standard behaviour for the result report of a manual error tolerant search is to show only those matches that satisfy two criteria: (i) the score must be at least as high as the match for the same query in the original "parent" search; (ii) the score equals or exceeds the identity threshold for the same query in the original "parent" search. Setting ShowAllFromErrorTolerant to 1 causes all matches to be displayed. This global default can be overridden on an individual report URL by appending &_showallfromerrortolerant=X, where X is 0 or 1.

If this is set to 1, und
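The SelectSwitch rule described in this section reduces to a single comparison. The sketch below restates it in Python for clarity; the function name default_report is hypothetical, and this is not how Mascot itself is implemented.

```python
def default_report(num_queries, select_switch=1000):
    """Return the default MS/MS report name under the SelectSwitch rule.

    Peptide Summary when the query count is less than or equal to the
    threshold, Select Summary when it is greater.
    """
    if num_queries <= select_switch:
        return "Peptide Summary"
    return "Select Summary"


if __name__ == "__main__":
    for n in (250, 1000, 1001):
        print(n, default_report(n))
```

The boundary case matters: a search with exactly SelectSwitch queries still opens in the Peptide Summary.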
99. TRegex           A Z0 9       ABEV      0 9
End

There is just a single species per sequence, so the DescriptionLineSep is set to 0.

The SpeciesFiles are from NCBI and SwissProt, and the NodesFiles are taken from NCBI as before.

There is only one database source, so the DefaultRule can be used. This takes everything after the first underscore to the next space. For example:

>104K_THEPA (P15711) 104 KD MICRONEME-RHOPTRY ANTIGEN

would find the text THEPA, which it would look up in speclist.txt.

NCBI EST

Definition 9 for EST_others is very similar to that for NCBInr.

# TAXONOMY FOR dbEST using GI2TAXID
Taxonomy_9
Enabled 1
FromRefFile 0
ErrorLevel 0
DescriptionLineSep 1   # ctrl-a (hex code 1). For multiple descriptions per entry. 0 to disable it
SpeciesFiles GI2TAXID:gi_taxid_nucl.dmp, NCBI:names.dmp
NodesFiles NCBI:nodes.dmp, NCBI:merged.dmp
DefaultRule GI2TAXID  CHOP     gi    0 9         The gi  number
Identifier NCBI nucleotide FASTA using GI2TAXID
GencodeFiles NCBI:gencode.dmp
AccFromSpeciesLine         gi  0 9
MitochondrialTranslation 0
End

A different species file, gi_taxid_nucl.dmp, is used for nucleic acid sequences.

The file containing genetic code data is specified as an argument to GencodeFiles, while MitochondrialTranslation is set to 0, specifying that all entries should be translated using the genetic code for nuclear
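The SwissProt DefaultRule described earlier in this section (everything after the first underscore up to the next space) can be tried out on sample title lines. The Python pattern below is an assumed equivalent written for illustration; it is not copied from the Mascot taxonomy file, and the function name species_code is hypothetical.

```python
import re

# Assumed equivalent of the rule: skip to the first underscore, then
# capture every character up to (not including) the next space.
SPECIES_CODE = re.compile(r"^[^_]*_([^ ]*)")


def species_code(title_line):
    """Return the species code from a FASTA title line, or None."""
    match = SPECIES_CODE.search(title_line)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(species_code(">104K_THEPA (P15711) 104 KD MICRONEME-RHOPTRY ANTIGEN"))
```

The extracted code would then be looked up in speclist.txt; that lookup is not shown here.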
100. UsersEnabled is set to 1, search results will be emailed to a user if their web browser does not respond within the number of seconds specified in EmailTimeOutPeriod following the completion of a search.

Email messages can be sent in batches at intervals specified by MonitorEmailCheckFreq (in seconds). MailTempFile is the name of the temporary file used to store email messages until they can be sent.

If EmailErrorsEnabled is set to 1, serious error messages will be emailed to ErrMessageEmailTo.

MAPI Configuration (Windows Only)

Set MailTransport to 1.

Set the EmailPassword to the password (if any) that is required to log onto MAPI.

Set the EmailProfile to the profile name used by MAPI. This can be found by opening the Windows Control Panel and clicking on Mail. (Depending on whether you have an "internet mail only" or a "corporate or workgroup" installation of MS Outlook, you will have a list of either account names or profile names to choose from.)

Sendmail Configuration (Linux Only)

Set MailTransport to 2.

Set the EmailFromUser parameter to the name that is required in the "From" field of the email messages.

Set EmailFromTextName as the name of the server that is running Mascot. For example, setting EmailFromUser to www and EmailFromTextName to Mascot Server will result in emails from www (Mascot Server). The From field of the email will be www@www.your_domain.com.

Set SendmailPath as the path for the sendmail program, e
101. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and give any other recipients of the Program a copy of this License along with the Program.

You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.

2. You may modify your copy or copies of the Program or any portion of it, thus forming a work based on the Program, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:

a) You must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.

b) You must cause any work that you distribute or publish, that in whole or in part contains or is derived from the Program or any part thereof, to be licensed as a whole at no charge to all third parties under the terms of this License.

c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the most ordinary way, to print or display an announcement including an appropriate copyright notice and
102. _2.pl, ms-createpip.exe, MSAnatomiser.class, mi_getpeaklist.pl, msms_gif.pl, nph-mascot.exe, ms-searchcontrol.exe

ResultsCache master_results.pl, master_results_2.pl, protein_view.pl, export_dat.pl, export_dat_2.pl, ms-createpip.exe, MSAnatomiser.class, mi_getpeaklist.pl, nph-mascot.exe, ms-searchcontrol.exe

Comma, space or tab delimited string of scripts and applications that will use cache files to speed up access to the results files. To prevent the use of the cache for a particular script, remove it from this list. There are two sets of cache files: one for the results file, independent of any particular report format, controlled by ResfileCache, and one for each combination of summary report format settings, controlled by ResultsCache. See also CacheDirectory.

ResultsFileFormatVersion

If present, and the argument is 2.1, the result file format will be "2.1 compatible". That is, no xml sections. No other arguments are supported at this time.

ResultsFullURL see NoResultsScript
ResultsFullURL_2 see NoResultsScript
ResultsPerlScript see NoResultsScript
ResultsPerlScript_2 see NoResultsScript

ReviewColWidths 7,8,8,27,30,100,32,25,6,13,2,4,6,16,7

This sets the widths of the columns in ms-review.exe.

SaveEveryLastQueryAsc see LastQueryAscFile
SaveLastQueryAsc see LastQueryAscFile

ScoreThresholdForAuto

Deprecated, use SigThreshold.

SearchControlLifetime 7200
103. [Screenshot: Mascot Security Administration, Add user page, with fields for user name, password, password expiry, full name, email address, user type and account enabled, and a group list (Guests, Administrators, PowerUsers, Daemons, MascotIntegraSystem)]

The new user must be given a name, password, full name and email address. There is a description of the different types of user earlier in this chapter. You should also select one or more groups that the user should belong to before pressing the "Add user" button.

New groups may be added or edited.

[Screenshot: Mascot Security Administration Utility]
104. a>Chordata>Deuterostomia>Coelomata>Bilateria>Eumetazoa>Metazoa>Fungi/Metazoa group>eukaryote crown group>Eukaryota>root

gi|3320892 Trichosurus vulpecula (Tricosurus vulpecular, Trichosurus vulpecular, common brush-tailed possum, TRIVU)
Trichosurus vulpecula>Trichosurus>Phalangeridae>Diprotodontia>Metatheria>Theria>Mammalia>Amniota>Tetrapoda>lobe-finned fish and tetrapod clade>bony vertebrates>Gnathostomata>Vertebrata>Craniata>Chordata>Deuterostomia>Coelomata>Bilateria>Eumetazoa>Metazoa>Fungi/Metazoa group>eukaryote crown group>Eukaryota>root

Description lines

>gi|4501885|ref|NP_001092.1|pACTB| beta actin
gi|113270|sp|P02570|ACTB_HUMAN ACTIN, CYTOPLASMIC 1 (BETA-ACTIN)
gi|71618|pir||ATHUB actin beta - human
gi|71619|pir||ATMSB actin beta - mouse
gi|279669|pir||ATCHB actin beta - chicken
gi|28252|emb|CAA25099| (X00351) beta-actin [Homo sapiens]
gi|49866|emb|CAA27307| (X03672) beta-actin (aa 1-375) [Mus musculus]
gi|55575|emb|CAA24528| (V01217) beta-actin [Rattus norvegicus]
gi|177968 (M10277) cytoplasmic beta actin [Homo sapiens]
gi|211237 (L08165) beta-actin [Gallus gallus]
gi|2116655|dbj|BAA20266| (AB004047) beta-actin [Homo sapiens]
gi|2182269 (U39357) beta actin [Ovis aries]
gi|2661136 (AF035774) beta actin [Equus caballus]
gi|3320892 (AF0
105. a database entry represents multiple accessions, this information is repeated for each accession. Followed by the FASTA title line for the accession supplied as an argument. Pretty formatted.

Same as mode 2, plus a list of common species names in parentheses.

Same as mode 3, plus complete taxonomy tree.

The scientific species name as a string. Pretty formatted.

Same as mode 5, plus a list of common species names in parentheses.

Same as mode 6, plus complete taxonomy tree.

verbose tax_ID information

genetic code number

GET request always means single entry mode. POST request automatically means batch mode. A batch mode request should use UTF-8 encoding and be of "multipart/form-data" enctype, for example:

--41184676334
Content-Disposition: form-data; name="db"

SwissProt
--41184676334
Content-Disposition: form-data; name="accession"

RL19_YEAST
--41184676334
Content-Disposition: form-data; name="taxID"

1061
--41184676334
Content-Disposition: form-data; name="showtitle"

on
--41184676334
Content-Disposition: form-data; name="showSynonyms"

on
--41184676334
Content-Disposition: form-data; name="showTaxTree"

on
--41184676334
Content-Disposition: form-data; name="SessionID"

123456
--41184676334--

The batch format aggregates both "find taxonomy from accession" and "find taxonomy from id" requests.

Maximum number of a
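A batch request body of the kind shown in this section can be assembled with a few lines of code. The Python sketch below is an illustration under stated assumptions, not a Mascot client: the function name build_batch_body is hypothetical, the field names are taken from the example request, and the script URL to which the body would be posted is not given here, so no network call is made.

```python
import uuid


def build_batch_body(fields, boundary=None):
    """Build a multipart/form-data body, UTF-8 encoded.

    fields is a sequence of (name, value) pairs. Returns the value for
    the Content-Type header and the encoded request body.
    """
    boundary = boundary or uuid.uuid4().hex
    lines = []
    for name, value in fields:
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(str(value))
    lines.append(f"--{boundary}--")  # closing delimiter
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", body


if __name__ == "__main__":
    content_type, body = build_batch_body(
        [
            ("db", "SwissProt"),
            ("accession", "RL19_YEAST"),
            ("taxID", 1061),
            ("showtitle", "on"),
            ("showSynonyms", "on"),
            ("showTaxTree", "on"),
            ("SessionID", "123456"),
        ],
        boundary="41184676334",
    )
    print(content_type)
    print(body.decode("utf-8"))
```

The returned content type and body could then be passed to any HTTP client as a POST request, which the server treats as batch mode.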
106. accession     accession string

frame         frame number, 0 if not supplied in the input or missing if AA database

Status

The Database Status utility (x-cgi/ms-status.exe) provides an overview of the active and recent searches on all of the configured databases. The top level display will resemble this:

[Screenshot: Mascot search status page, showing version and licence information, processor usage, links to the search log, monitor log and error log, and a status block for each database (for example contaminants and cRAP) giving name, family, filename, pathname, state time, number of searches, memory mapping and locking state, and number of threads]
107. [Screenshot residue: taxonomy drop-down list on the search form, showing the indented species hierarchy from Archaea and Eukaryota down to Homo sapiens, Other primates, Rodentia and Mus musculus]

This file can be edited using any text editor. (Under Windows, from the start menu, choose Programs; Mascot; Config; Mascot taxonomy file.)

The following is an extract from the supplied file:

Title:All entries
Include: 1
Exclude: 0
*
Title:. . Archaea (Archaeobacteria)
Include: 2157
Exclude:
*
Title:. . Eukaryota (eucaryotes)
Include: 2759
Exclude:
*
Title:. . . . Alveolata (alveolates)
Include: 33630
Exclude:
*
Title:. . . . . . . . . . . . . . Primates
Include: 9443
Exclude:
*
Title:. . . . . . . . . . . . . . . . Homo sapiens (human)
Include: 9606
Exclude:
*
Title:. . . . . . . . . . . . . . . . Other primates
Include: 9443
Exclude: 9606
*

The first line of each block must 
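The block structure of the taxonomy file extract (a Title line, an Include line and an Exclude line, closed by an asterisk) is simple to read with a script, for example to check an edited file before use. The Python sketch below is illustrative and not part of Mascot: the function name parse_taxonomy_blocks is hypothetical, and the sketch assumes that each keyword is followed by a colon and that tax IDs are comma separated.

```python
def parse_taxonomy_blocks(text):
    """Parse Title/Include/Exclude blocks terminated by a '*' line.

    Assumptions: 'Keyword:value' syntax and comma separated tax IDs.
    Returns a list of dicts with 'title', 'include' and 'exclude' keys.
    """
    blocks, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "*":  # end of block
            if current:
                blocks.append(current)
            current = {}
            continue
        key, _, value = line.partition(":")
        key = key.strip().lower()
        if key == "title":
            current["title"] = value.strip()
        elif key in ("include", "exclude"):
            current[key] = [int(v) for v in value.split(",") if v.strip()]
    if current:  # tolerate a final block with no closing '*'
        blocks.append(current)
    return blocks


SAMPLE = """\
Title:All entries
Include: 1
Exclude: 0
*
Title:Homo sapiens (human)
Include: 9606
Exclude:
*
Title:Other primates
Include: 9443
Exclude: 9606
*
"""

if __name__ == "__main__":
    for block in parse_taxonomy_blocks(SAMPLE):
        print(block)
```

The "Other primates" block shows why both lists are needed: it includes the Primates node (9443) and excludes Homo sapiens (9606).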
108. ah and the Regents of the University of California. All Rights Reserved.

Permission is hereby granted, without written agreement and without license or royalty fees, to use, copy, modify, and distribute this software and its documentation for any purpose, provided that (1) The above copyright notice and the following two paragraphs appear in all copies of the source code and (2) redistributions including binaries reproduces these notices in the supporting documentation. Substantial modifications to this software may be copyrighted by their authors and need not follow the licensing terms described here, provided that the new terms are clearly indicated in all files where they apply.

IN NO EVENT SHALL THE AUTHOR, THE UNIVERSITY OF CALIFORNIA, THE UNIVERSITY OF UTAH OR DISTRIBUTORS OF THIS SOFTWARE BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF THE AUTHORS OR ANY OF THE ABOVE PARTIES HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

THE AUTHOR, THE UNIVERSITY OF CALIFORNIA, AND THE UNIVERSITY OF UTAH SPECIFICALLY DISCLAIM ANY WARRANTIES INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND THE AUTHORS AND DISTRIBUTORS HAVE NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, 
109. ame
    maximum number of node CPUs to be used
    operating system
    local path to home directory
    home directory as seen from master (specify for NT master only)

Node 10.0.0.1:5001, search01, 2, Linux, /usr/local/mascotnode
Node 10.0.0.2:5001, search02, 2, Linux, /usr/local/mascotnode
Node 10.0.0.3:5001, search03, 2, Linux, /usr/local/mascotnode
Node 10.0.0.4:5001, search04, 2, Linux, /usr/local/mascotnode
Node 10.0.0.5:5001, search05, 2, Linux, /usr/local/mascotnode

4. Re-start ms-monitor.exe. Note that you must change directory to mascot/bin and have super-user privileges to execute ms-monitor.exe.

Note (Linux only): Under Redhat Linux 8.0, if ms-monitor.exe terminates immediately after launch, without any error messages, the problem may relate to a bug in gethostbyname_r(). In the cluster section of mascot.dat, try using the IP address for the master node, rather than the hostname, as the argument to MasterComputerName.

5. In a web browser, navigate to ms-status.exe and verify that the system starts up correctly.

Reference

The Cluster section in mascot.dat

Cluster
# Enable (1) or disable (0) cluster mode
Enabled 1

# MasterComputerName must be the hostname
MasterComputerName mascot-master

# Node defaults
DefaultNodeOS Windows NT
DefaultNodeHomeDir c:\mascotnode

# Following line must be commented out UNLESS this is a
DefaultNodeHomeDirFromMaster \\<host_name>
110. [Screenshot: Windows Configure Your Server Wizard, Server Role page, listing the available roles with Application server (IIS, ASP.NET) selected and not yet configured]

Proceed through the wizard, accepting all the defaults, to install IIS.

IIS 6 does not serve files with unknown MIME types, and its default list of MIME types does not include XML schema documents. See Microsoft Knowledge Base article Q326965 for the procedure to add .XSD to the IIS 6 list of MIME types:

http://support.microsoft.com/default.aspx?scid=kb;en-us;326965

Vista

Mascot will run under all Windows Vista editions except for Starter and Home Basic.

It is advisable to ensure that the latest service pack has been installed. Check the following URL for current information:

http://windows.micro
111. and cleavage after M at the other. When Independent is omitted or given a value of 0, the specificities are combined, as if the reagents had been applied simultaneously or serially to a single sample aliquot. The keyword Independent does not take an index.

Title:semiTrypsin
Cleavage[0]:KR
Restrict[0]:P
Cterm[0]
SemiSpecific:1
*

If the keyword SemiSpecific appears and is given a value of 1, this means that any given peptide need only conform to the cleavage specificity at one end. The other end can result from non-specific cleavage. When SemiSpecific is omitted or given a value of 0, peptides are required to conform to the cleavage specificity at both ends. The keyword SemiSpecific does not take an index.

Instruments

[Screenshot: Mascot Configuration, Instruments page. A table of ion series (1+, 2+, 2+ (precursor > 3+), immonium, a, a*, a0, b, b*, ...) against instrument types: Default, ESI-QUAD-TOF, MALDI-TOF-PSD, ESI-TRAP, ESI-QUAD, ESI-FTICR, MALDI-TOF-TOF, ESI-4SECTOR, FTMS-ECD, ETD-TRAP, MALDI-QUAD-TOF.]
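The SemiSpecific rule described in the enzyme excerpt above can be illustrated with a short Python sketch. It is not Mascot's implementation: the function names are hypothetical, protein termini are assumed to count as valid peptide ends, and missed-cleavage limits are ignored. It models only a C-terminal cutter with a Cleavage set (KR) and a Restrict set (P), as in the semiTrypsin definition.

```python
def cleavage_sites(protein, cleave="KR", restrict="P"):
    """Return the cut positions for a C-terminal cutter.

    A cut is placed after residue i when protein[i] is in `cleave`
    and the following residue is not in `restrict`. The protein
    termini (0 and len(protein)) are always included.
    """
    sites = [0]
    for i in range(len(protein) - 1):
        if protein[i] in cleave and protein[i + 1] not in restrict:
            sites.append(i + 1)
    sites.append(len(protein))
    return sites


def is_allowed(protein, start, end, semi_specific=False):
    """Check whether the peptide protein[start:end] satisfies the enzyme.

    With semi_specific=False both ends must be cleavage sites;
    with semi_specific=True one conforming end is enough.
    """
    sites = set(cleavage_sites(protein))
    n_term_ok = start in sites
    c_term_ok = end in sites
    if semi_specific:
        return n_term_ok or c_term_ok
    return n_term_ok and c_term_ok


if __name__ == "__main__":
    protein = "AAKPGGRCCKDD"  # made-up sequence for illustration
    print(cleavage_sites(protein))
    print(is_allowed(protein, 8, 10), is_allowed(protein, 8, 10, semi_specific=True))
```

In the made-up sequence, the K followed by P is not a cleavage site, so the peptide "CK" (positions 8 to 10) conforms only at its C-terminal end and is accepted only when semi_specific is set.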
112. ant that for a period of ninety days (90) from the date of delivery (the "Warranty Period"):

5.1 the medium on which the Software is recorded will be free from defects in materials and workmanship under normal use. If the medium fails to conform to this warranty, you may, as your sole and exclusive remedy, obtain, at your option, either a replacement free of charge or a full refund if you return the defective medium to us or to your supplier during the Warranty Period; and

5.2 the copy of the Software in this package will materially conform to the documentation that accompanies the Software. If the Software fails to operate in accordance with this warranty, you may, as your sole and exclusive remedy, return all of the Software and the documentation to us or to your supplier during the Warranty Period, specifying the problem, and we will provide you either with a new version of the Software or a full refund, at your option.

6 Disclaimer

We do not warrant that this Software will meet your requirements or that its operation will be uninterrupted or error free. We exclude and hereby expressly disclaim all express and implied warranties or conditions not stated herein, so far as such exclusion or disclaimer is permitted under the applicable law. THIS AGREEMENT DOES NOT AFFECT YOUR STATUTORY RIGHTS.

7 Liability

7.1 Our liability to you for any losses shall not exceed the amount you originally paid f
113. arge for this service if you wish); that you receive source code or can get it if you want it; that you can change the software and use pieces of it in new free programs; and that you are informed that you can do these things.

To protect your rights, we need to make restrictions that forbid distributors to deny you these rights or to ask you to surrender these rights. These restrictions translate to certain responsibilities for you if you distribute copies of the library or if you modify it.

For example, if you distribute copies of the library, whether gratis or for a fee, you must give the recipients all the rights that we gave you. You must make sure that they, too, receive or can get the source code. If you link other code with the library, you must provide complete object files to the recipients, so that they can relink them with the library after making changes to the library and recompiling it. And you must show them these terms so they know their rights.

We protect your rights with a two-step method: (1) we copyright the library, and (2) we offer you this license, which gives you legal permission to copy, distribute and/or modify the library.

To protect each distributor, we want to make it very clear that there is no warranty for the free library. Also, if the library is modified by someone else and passed on, the recipients should know that what they have is not the original version, so that the original author's reputation will not be affect
114. ary specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Library does not specify a license version number, you may choose any version ever published by the Free Software Foundation.

If you wish to incorporate parts of the Library into other free programs whose distribution conditions are incompatible with these, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally.

BECAUSE THE LIBRARY IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE LIBRARY, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE LIBRARY "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE LIBRARY IS WITH YOU. SHOULD THE LIBRARY PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL 
115. ases without disrupting ongoing searches is handled by Mascot Monitor. The new database is compressed and tested by running a standard search. If errors are detected in the new database, the database exchange process is abandoned, and searches continue to use the earlier database.

Assuming the test is successful, all new searches are performed against the new database, while searches that were in progress against the old database are allowed to continue. Once the final search against the old database is complete, the compressed files are deleted and the FASTA file is moved to an archive directory. If the database being exchanged is memory mapped, the mapping and unmapping are also handled automatically.

Status

The Mascot package includes a CGI application that provides a live status display via a web browser. For each database, the Mascot job queue, the executing jobs, and the completed jobs are listed. The status lines for completed jobs contain hyperlinks to individual results reports.

Review

Review is a CGI application that provides easy access to the flat file database of search result files. Key search parameters, such as time and date, job number, user name, search type, etc., are displayed in a spreadsheet-like table. Columns can be hidden, sorted and filtered to facilitate locating a specific file or group of files. Each row includes hyperlinks, either to generate a Mascot results report or to disp
116. ass, neutral loss mass, ...
q1_p1_et_mods_slave=neutral loss mass, neutral loss mass, ...
q1_p1_primary_nl=neutral loss string
q1_p1_na_diff=original NA sequence, modified NA sequence
q1_p1_tag=tagNum:startPos:endPos:seriesID, ...
q1_p1_drange=startPos:endPos
q1_p1_terms=residue,residue:residue,residue ...
q1_p1_subst=pos1,ambig1,matched1, ... posn,ambign,matchedn
q1_p1_comp=quantitation component name
q1_p1_summed_mods=variable modifications string
q1_p2=...
...
qn_pm=...
--gc0p4Jq0M2Yt08jU534c0p

Each line contains the data for a peptide match followed by data for at least one protein in which the peptide was found.

If there are multiple entries in the database containing the matched peptide, there will be a corresponding number of pairs of bracketing residues listed in qn_pm_terms.

Otherwise, individual field descriptions are identical to those for the Summary section.

If this is a two-pass search, either an automatic decoy database search or an automatic error tolerant search, a second peptides block appears, containing the second set of results. The section name is either et_peptides or decoy_peptides. The syntax of the contents is identical.

Proteins

--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="proteins"

"accession string"=protein mass,"title text"
"accession string"=protein mass,"title text"
--gc0p4Jq0M2Yt08jU534c0p

This 
117. [Screenshot: Mascot Configuration, Modifications page (page 20 of 25). Each row lists a modification name, its monoisotopic and average delta masses, its composition, and Copy / Delete / Print links. Rows shown include Phospho, Phosphoadenosine, Phosphoguanosine, PhosphoHex, PhosphoHexNAc, Phosphopantetheine, PhosphoribosyldephosphoCoA, Phosphouridine, Phycocyanobilin and Phycoerythrobilin.]

For modifications, more detailed help can be found in the Unimod help pages. This is also the place to find details of the file format, which is fully defined by a schema called unimod_1.xsd.

Enzymes

[Screenshot: Mascot configuration page in a web browser.]
118. ation="http://www.unimod.org/xmlns/schema/unimod_2 unimod_2.xsd">
<umod:elements>
<umod:elem avge_mass="1.00794" full_name="Hydrogen" mono_mass="1.007825035" title="H"/>
<umod:elem avge_mass="2.014101779" full_name="Deuterium" mono_mass="2.014101779" title="2H"/>
<umod:elem avge_mass="6.941" full_name="Lithium" mono_mass="7.016003" title="Li"/>
<umod:elem avge_mass="12.0107" full_name="Carbon" mono_mass="12" title="C"/>
...

This section is an extract from unimod.xml containing data for the elements, amino_acids, and any modifications specified in the search form. For more details and a link to the schema, refer to the help pages at www.unimod.org

Enzyme

--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="enzyme"

Title:Trypsin
Cleavage:KR
Restrict:P
Cterm

This section is simply an extract from the enzyme file. Syntax details can be found in Chapter 6.

Taxonomy

--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="taxonomy"

Title:. . . . . . . . . . . . Homo sapiens (human)
Include: 9606
Exclude:

This section is simply an extract from the taxonomy file. Syntax details can be found in Chapter 9.

Header

--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="header"

sequences=number of sequences in DB
sequences_after_tax=number of sequences after taxonomy filter
residu
119. ation File

Please Note: As part of the product registration process, the following information will be transmitted to Matrix Science:

- Details of any existing licence
- Machine identifiers for node locking purposes (e.g. MAC address)

A product key is required and must be registered online. The licence file will be returned by email and must be saved to the specified location on the Mascot server. If the Mascot server cannot connect to the Internet, a file containing registration information can be saved and copied to a system with Internet access for submission.

The registration form allows a second email address to be specified, in case the person installing Mascot is not the end user. Ensure that the end user email address is entered into the upper part of the form and the email address to which the licence file should be sent is entered into the CC email field in the lower part of the form.

The licence file must be saved to the config/licdb directory as a file with the extension .lic.

Verify System Operation

A copy of the SwissProt database is included in the files copied from the DVD. It is recommended that the operation of Mascot is verified and tested using this database before adding further databases or making configuration changes.

Mascot Monitor (ms-monitor.exe) is used to manage the swapping and memory mapping of the sequence databases used by Mascot. For Mascot to operate, ms-m
120. axon N=Official name
Node C=Common name
S=Synonym

AAV2 V 010804: N=Adeno-associated virus 2
C=AAV2
ABDS2 B 056673: N=Antarctic bacterium DS2-3R
ABIAL E 045372: N=Abies alba
C=Edeltanne

This file is available at:
ftp://ftp.ebi.ac.uk/pub/databases/uniprot/knowledgebase/docs/speclist.txt

If you wish to add more entries, a new file should be made with just the new entries. Mascot will load multiple files as specified below. Most Mascot updates will contain the updated speclist.txt file.

Genetic code selection

During a search of a nucleic acid database, Mascot also uses the taxonomy of each entry to choose the correct genetic code for translation. The genetic codes are defined in the NCBI file gencode.dmp, which is included in the archive taxdump.tar, mentioned above.

nodes.dmp is used as a lookup table to obtain a genetic code number from a TaxID.

For many species, the genetic code is different for mitochondrial and nuclear proteins. Although Mascot could try to determine whether a database entry is mitochondrial by performing a keyword search of the FASTA description, this is unreliable. In any case, mitochondrial proteins will usually represent only a very small fraction of the entries in any comprehensive database. The most important requirement is to use the correct code for a database that consists specifically of mitochondrial proteins. The solution is to include a flag in each mascot 
121. be supplied, and an optional nice value. If a valid new nice value is supplied, this will return the text:

job_niced

If a nice value is not supplied, the program will return the current nice value:

nice xxx

If there is an error, one of the following will be returned:

unknown_id
job_not_running
searchcontrol error nnn (with values of "nnn" as for --status)

The "nice" is implemented by setting a flag in the mascot control memory mapped file. The nph-mascot.exe task is responsible for "resuming" itself. Nice values range from -20 to +20. A value of +20 will set the task to a very low priority. The Mascot status screen shows the "nice" value as a priority, which is simply -1 × the nice value. Microsoft Windows does not allow such fine-grained control of priorities, so, for example, 20 and 19 will map to the same priority.

ms-searchcontrol.exe --set_to_queued --taskID <number> [--sessionID <string>]

If the task is successful, this will return the text:

queued

If there is an error, one of the following will be returned:

unknown_id
already running
already complete
searchcontrol error nnn (with values of "nnn" as for --status)

A batch processing client can make queued jobs visible to the Mascot system by getting a taskID and using this call to set the status to "queued". When the search is eventually submitted, nph-mascot.exe will set the status 
122. because searches may run slower than normal. If you are trying to search an assembled genome, you might want to consider searching shorter sequences instead, such as a database of the contigs.

MaxVarMods 9

The maximum number of variable mods allowed for an MIS search (global default, can be over-ridden for a group in security). Value is an integer in the range 0 to 32.

MinPeaksForHomology 6

For an MS/MS search, a homology threshold will not be reported if the number of peaks in a spectrum is less than this value.

MinPepLenInPepSummary 7

In a Peptide Summary report, two proteins are reported as distinct matches if the peptide matches to one protein are not identical to or a sub-set of the peptide matches to the other protein. Since matches to very short peptides are usually random, peptides shorter than MinPepLenInPepSummary are not considered in this comparison.

MinPepLenInSearch 7

Peptides shorter than MinPepLenInSearch are rejected during the search. Matches to very short peptides are meaningless because a 2-mer or 3-mer can occur in almost every entry in a database. If such matches are allowed in the peptides section, it can cause serious bloating of the result file.

MonitorEmailCheckFreq see EmailErrorsEnabled

MonitorLogFile see ErrorLogFile

MonitorPidFile monitor.pid

The name for the file that holds the process ID number for ms-monitor.exe. Default is monitor.pid.

Monito
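The MinPepLenInPepSummary comparison described in the excerpt above can be sketched in Python as a set operation. This is only an illustration of the stated rule, not Mascot's reporting code; the function names and the example peptides are hypothetical.

```python
def significant_peptides(peptides, min_len=7):
    """Drop peptides shorter than min_len (the MinPepLenInPepSummary idea)."""
    return {p for p in peptides if len(p) >= min_len}


def are_distinct(peptides_a, peptides_b, min_len=7):
    """Two proteins are distinct unless one protein's set of peptide
    matches is identical to, or a sub-set of, the other's.
    Peptides shorter than min_len are ignored in the comparison."""
    a = significant_peptides(peptides_a, min_len)
    b = significant_peptides(peptides_b, min_len)
    return not (a <= b or b <= a)


if __name__ == "__main__":
    # Hypothetical peptide matches for two protein hits.
    prot_a = {"LVNELTEFAK", "GK"}
    prot_b = {"LVNELTEFAK", "AK"}
    print(are_distinct(prot_a, prot_b))             # short peptides ignored
    print(are_distinct(prot_a, prot_b, min_len=1))  # short peptides counted
```

With the default threshold, the two hits share the same long peptide and differ only by 2-mers, so they are not treated as distinct; counting the 2-mers would split them on matches that are probably random.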
123. block contains reference data for the proteins listed in the peptides block.

Input data for query n

--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="queryn"

title=query title
index=query index
seq1=sequence qualifier (e.g. N-ABCDEF)
seq2=...
seqn=...
comp1=composition qualifier (e.g. 0[P]2[W])
comp2=...
compn=...
PepTol=peptide tolerance qualifier (e.g. 2.000000,Da)
IT_MODS=Mod 1,Mod 2,...
INSTRUMENT=instrument identifier (e.g. ESI-TRAP)
RULES=1,2,5,6,8,9,13,14
INTERNALS=min mass,max mass
CHARGE=charge state (e.g. 2+)
RTINSECONDS=a[-b][,c[-d]]
SCANS=a[-b][,c[-d]]
tag1=sequence tag (e.g. t,889.4,[QK]S,1104.54)
tagn=...
mass_min=lowest mass
mass_max=highest mass
int_min=lowest intensity
int_max=highest intensity
num_vals=number of mass values
num_used1=-1 (obsolete)
ions1=1344.65:34.3,1365.41:13.2
ions2=y-1344.65:34.3,1365.41:13.2
ions3=b-1344.65:34.3,1365.41:13.2
--gc0p4Jq0M2Yt08jU534c0p

Value "queryn" runs from "query1" (no leading zeros). ionsn values are sorted in the order that they were selected for scoring.

Most searches will only require a few of these fields. For example, a peptide mass fingerprint would only include the charge field.

The index is a 0-based record of the original query order before sorting by Mr.

ions2 and ions3 are only required when fragment ions are specified in a s
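As a hedged illustration of the ionsn fields in the query block above, the following Python sketch parses a peak list. It assumes that each peak is written as mass:intensity and that peaks are separated by commas; the extracted text has lost its punctuation, so this separator layout is an assumption and should be checked against a real result file. The function name is hypothetical.

```python
def parse_ions(value):
    """Parse a peak list such as '1344.65:34.3,1365.41:13.2'.

    Returns a list of (mass, intensity) tuples in file order,
    which the manual says is the order the peaks were selected
    for scoring, so no re-sorting is done here.
    """
    peaks = []
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma
        mass, intensity = pair.split(":")
        peaks.append((float(mass), float(intensity)))
    return peaks


if __name__ == "__main__":
    print(parse_ions("1344.65:34.3,1365.41:13.2"))
```

The sketch deliberately leaves the series prefix used by ions2 and ions3 (for example a leading y or b) unhandled, because its exact syntax cannot be recovered from the excerpt.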
124. c:/mascotnode

# Following line must be commented out WHEN this is a
#MascotNodeScript ###ROOT###/bin/load_node.pl

# Sub-cluster definition
# Syntax is SubClusterSet X Y where X is the sub-cluster number
# and Y is the maximum number of processors in the sub-cluster
SubClusterSet 0 10

# Time outs, log files
IPCTimeout 5                    # seconds with no
IPCLogging 0                    # no logging = 0
IPCLogfile ../logs/ipc.log      # relative path
CheckNodesAliveFreq 30          # seconds between node
SecsToWaitForNodeAtStartup 20   # seconds to wait for
end

Enabled

1 to enable cluster mode, 0 to enable single server mode.

MasterComputerName

Enter the host name for the master computer and, optionally, the IP address separated by a comma. The IP address may need to be specified for a multi-homed master where it is necessary to define which network card is on the LAN and which is the gateway to the outside world.

DefaultNodeOS

If no OS is defined for a particular node, then this OS is assumed. Must be one of:

- Windows_NT
- Linux

Note that these names are case sensitive.

DefaultNodeHomeDir

If no specific home directory is specified for a particular node, then this default is used.

On a Linux system, this will typically be /usr/local/mascotnode. It is best not to use /usr/local/mascot as this is the directory mostly used for the master.

On a Windows system, this will typically be C:\MascotNode or D:\MascotNode
125. can be re-started (Programs; Mascot; config; Start Mascot service).

Percolator 0
PercolatorFeatures mScore, lgDScore, mrCalc, charge, dM, dMppm, absDM, absDMppm, isoDM, isoDMppm, mc, varmods, totInt, intMatchedTot, relIntMatchedTot
PercolatorMinQueries 100
PercolatorMinSequences 100
PercolatorUseProteins 0
PercolatorUseRT 0
PercolatorExeFlags -i 10 -D 14 -v 0

Set Percolator to 1 if percolated results should be opened by default, 0 otherwise.

PercolatorFeatures specifies the list of features used by Percolator. To see the list of available features, run ms-createpip.exe --help.

Percolator will only be run if the number of queries in the search is at least PercolatorMinQueries and the number of entries in the sequence database is at least PercolatorMinSequences.

Percolator will use the assignment of proteins to peptides as a feature if PercolatorUseProteins is set to 1. This can have undesirable results and should be used with great care. This flag is not supported in the current release.

Percolator will use the retention times of peptides as a feature if PercolatorUseRT is set to 1.

PercolatorExeFlags is used to specify the Percolator command line arguments with the exception of the file path arguments (-j, -B, -r). If the string includes the argument -D num, this will be removed unless PercolatorUseRT is set to 1.

PrecursorCutOut -1,-1

The precursor peak can often have very high intensity relative to
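The two Percolator rules in the excerpt above (the minimum-size gate, and the removal of -D num when retention times are not used) can be sketched in Python as follows. This is an illustration of the documented behaviour, not Mascot source code; the function names are hypothetical, and the single-dash flag spelling is an assumption because the dashes were lost in the extracted text.

```python
import shlex


def percolator_will_run(num_queries, num_sequences,
                        min_queries=100, min_sequences=100):
    """Percolator runs only if both counts reach their minimums
    (PercolatorMinQueries and PercolatorMinSequences)."""
    return num_queries >= min_queries and num_sequences >= min_sequences


def effective_exe_flags(exe_flags, use_rt):
    """Return the flag tokens that would be passed on.

    If use_rt is false (PercolatorUseRT 0), the '-D num' pair is removed;
    otherwise the flags are passed through unchanged.
    """
    tokens = shlex.split(exe_flags)
    if use_rt:
        return tokens
    kept = []
    skip_next = False
    for tok in tokens:
        if skip_next:          # this token is the value that followed -D
            skip_next = False
            continue
        if tok == "-D":
            skip_next = True
            continue
        kept.append(tok)
    return kept


if __name__ == "__main__":
    print(percolator_will_run(250, 50))
    print(effective_exe_flags("-i 10 -D 14 -v 0", use_rt=False))
```

With the default values shown in the excerpt, a search of 250 queries against a 50-entry database would not be percolated, and the -D 14 pair would be dropped from the flags while PercolatorUseRT is 0.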
126. cation, the name of the results file can be parsed directly from this string. The output to STDOUT from a successful search will closely resemble the following:

(null) 200 OK
Server: (null)
Content-type: text/html
Pragma: no-cache

<HTML>
<HEAD><TITLE>Mascot searching</TITLE>
<META HTTP-EQUIV="Expires" CONTENT="0">
<META HTTP-EQUIV="Pragma" CONTENT="no-cache">
</HEAD><BODY BGCOLOR="#FFFFFF">
<!-- comment here -->
<!-- comment here -->
<!-- comment here -->
<!-- comment here -->
<!-- comment here -->
<!-- comment here -->
<!-- comment here -->
<!-- comment here -->
<!-- comment here -->
<!-- comment here -->
<H1><IMG SRC="../images/88x31_logo_white.gif" WIDTH="88" HEIGHT="31" ALIGN="TOP" BORDER="0" NATURALSIZEFLAG="3"> Mascot Search</H1>
Licensed to: Matrix Science In-house test system<BR>Not a real form
Finished uploading search details<BR>
<B>IMPORTANT:</B> If you get disconnected or choose not to wait for your search results<BR>DO NOT RESUBMIT THE SEARCH. Your results will be sent by email when the search is complete<BR>
Searching<BR>
10% complete<BR>
20% complete<BR>
30% complete<BR>
50% complete<
127. cations that expect to find these files.

Files in config/dbmanager are configuration files used by Database Manager. For descriptions, see the Database Manager HTML help page.

A browser-based Configuration Editor is provided to view and edit these files. These files are all text files, so can also be edited in any text editor. If you choose to edit the files, exercise care and always make a backup first, because seemingly small errors can render Mascot unusable.

Configuration Editor

The local Mascot home page contains a link to the Configuration Editor. The top level page is a menu. If Mascot security is enabled, there will be an additional menu item for Mascot security administration.

[Screenshot: Mascot Configuration menu, with the following items:
Elements: Element masses
Amino Acids: Amino Acid Data
Modifications: Modification definitions
Symbols: Symbols used in chemical formulae
Enzymes: Enzyme definitions
Instruments: Fragmentation Rules
Quantitation: Quantitation Methods
Configuration Options: Global Options in mascot.dat
Database Manager: Sequence databases, Parse Rules and automated downloads]

unimod.xml

The first four menu items (Elements, Amino acids, Modifications, and Symbols) are interfaces to different aspects of unimod.xml. It should only be necessary to make changes to unimod.xml in exceptional circumstances. An example might be if you wanted
128. ccessions / taxIDs submitted at once must not exceed 100000 and the total size of the request should be no more than 10 MB.

All request parameter names are case insensitive. Any parameter value can be in quotes.

DB: mandatory parameter and can only appear once. If several databases are searched, then ms-getseq must be called separately for each database.

ACCESSION: can appear any number of times. Quotes are mandatory. Can have a list of accessions delimited by commas, spaces, tabs or new line characters. All ACCESSION fields are merged into one list of accession strings internally.

TAXID: can appear any number of times and contains a list of taxonomy ids delimited by commas, spaces or new line characters. All such fields are merged into one list internally.

SHOWTITLE: can appear only once and, if set to TRUE, a description for each db entry is output.

SHOWSYNONYMS: can appear only once and, if set to TRUE, a list of common names is output for each taxonomy.

SHOWTAXTREE: can appear only once and, if set to TRUE, the taxonomy tree is output for each taxonomy.

SESSIONID: an optional parameter and can appear at most once. If no session ID is supplied, then ms-gettaxonomy can either process the request when security is disabled or try to retrieve the ID from cookies.

Boolean values can be coded in different ways:

true / TRUE / True / on / any number except 0 / any string except 
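The merging of ACCESSION fields described above can be sketched in Python. This is a client-side illustration under stated assumptions, not the server's parser: the function names are hypothetical, surrounding double quotes are simply stripped, and the 100000 limit is taken from the text above.

```python
import re

MAX_IDS = 100000  # limit on accessions / taxIDs per request, from the text above


def merge_accessions(fields):
    """Merge several quoted ACCESSION values into one flat list.

    Each field may hold accessions delimited by commas, spaces,
    tabs or new line characters.
    """
    merged = []
    for field in fields:
        unquoted = field.strip().strip('"')
        merged.extend(tok for tok in re.split(r"[,\s]+", unquoted) if tok)
    return merged


def check_request_size(accessions):
    """Raise ValueError if the merged list exceeds the documented limit."""
    if len(accessions) > MAX_IDS:
        raise ValueError("too many accessions in one request")
    return accessions


if __name__ == "__main__":
    # Made-up accession strings for illustration.
    fields = ['"P12345, Q67890"', '"A1B2C3\tD4E5F6\nG7H8I9"']
    print(check_request_size(merge_accessions(fields)))
```

A client that builds requests this way can split an oversized list into several calls before submission, rather than relying on the server to reject it.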
129. ce

5. (Optionally, if Mascot security is enabled) sessionID followed by a space and then the security session identifier.

If the keyword seq is supplied, the output from GetSeq has the following format:

Content-type: text/plain

*MMSARGDFLNYALSLMRSHNDEHSDVLPRLY.....PLYSSKOQTLKOKLLLAIKTKNFGFV
>100K_RAT 100 KD PROTEIN (EC 6.3.2.-). - RATTUS NORVEGICUS (RAT).

The keyword all is only applicable if a local, full text database is available and configured in mascot.dat. In which case, the returned text has a format similar to the following:

Content-type: text/plain

*MMSARGDFLNYALSLMRSHNDEHSDVLPRLY.....PLYSSKOTLKOKLLLAIKTKNFGFV
>100K_RAT 100 KD PROTEIN (EC 6.3.2.-). - RATTUS NORVEGICUS (RAT).
>P1;100K_RAT
100 KD PROTEIN (EC 6.3.2.-). - RATTUS NORVEGICUS (RAT).
C;DOMAIN 827-847 PRO-RICH
C;BINDING 858-858 UBIQUITIN (BY SIMILARITY)
C;Keywords: UBIQUITIN CONJUGATION; LIGASE.

In all cases, the first line is a content type specifier, followed by a blank line.

For seq and all there is then an asterisk followed by the unformatted sequence in one letter code. The next line is identical to the FASTA title line, beginning with a right angle bracket.

In the case of a full text report, this is followed by the raw text entry, as retrieved from the sequence database full text file.

If the keyword len is supplied, then the length of the sequence is returned as ASCII text. If the database is a 
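The GetSeq response layout described above (content type line, blank line, an asterisk followed by the sequence, then the FASTA title line) can be read with a small Python sketch. It assumes the sequence occupies a single line, as "unformatted" suggests; the function name and the sample response are hypothetical.

```python
def parse_getseq(text):
    """Split a GetSeq 'seq' or 'all' response into its parts.

    Returns a dict with the sequence (without the leading asterisk),
    the FASTA title (without the leading '>'), and any remaining
    full-text lines.
    """
    lines = text.splitlines()
    if not lines or not lines[0].lower().startswith("content-type"):
        raise ValueError("missing content type line")
    body = [ln for ln in lines[1:] if ln.strip()]
    if len(body) < 2 or not body[0].startswith("*") or not body[1].startswith(">"):
        raise ValueError("unexpected GetSeq layout")
    return {
        "sequence": body[0][1:].strip(),
        "title": body[1][1:].strip(),
        "full_text": body[2:],
    }


if __name__ == "__main__":
    # Shortened, made-up response for illustration only.
    sample = "Content-type: text/plain\n\n*MMSARGDFLNYAL\n>100K_RAT 100 KD PROTEIN\n"
    result = parse_getseq(sample)
    print(result["title"], len(result["sequence"]))
```

For a seq request the full_text list is empty; for an all request it holds the raw entry lines that follow the title.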
130. ception, the source code distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.

If distribution of executable or object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place counts as distribution of the source code, even though third parties are not compelled to copy the source along with the object code.

4. You may not copy, modify, sublicense, or distribute the Program except as expressly provided under this License. Any attempt otherwise to copy, modify, sublicense or distribute the Program is void, and will automatically terminate your rights under this License. However, parties who have received copies, or rights, from you under this License will not have their licenses terminated so long as such parties remain in full compliance.

5. You are not required to accept this License, since you have not signed it. However, nothing else grants you permission to modify or distribute the Program or its derivative works. These actions are prohibited by law if you do not accept this License. Therefore, by modifying or distributing the Program (or any work based on the Program), you indicate your acceptance of this License to do so, and all its terms and cond
131. cked. If you plan to use Remote Desktop from the master, you might want to check this at the same time.

[Screenshot: Windows Firewall dialog, Exceptions tab, listing File and Printer Sharing, Remote Assistance, Remote Desktop and UPnP Framework, with Add Program, Add Port and Delete buttons.]

Choose Add Port and enter the following data. Don't press OK yet.

[Screenshot: Add a Port dialog. Name: MascotNodePort5001; Port number: 5001; protocol: TCP.]

Choose Change scope and select the second option, My network (subnet) only.

[Screenshot: Change Scope dialog, with the options Any computer (including those on the Internet), My network (subnet) only, and Custom list.]
132. conds (= 1 day).

To change the timeout value at the cgi node:

cscript adsutil.vbs set /w3svc/1/root/mascot/cgi/cgitimeout x

To set a new timeout value at the mascot node, or change the existing value:

cscript adsutil.vbs set /w3svc/1/root/mascot/cgitimeout x

where x is the value in seconds that you want to set. You will then need to re-start IIS from the IIS Management console.

IIS 7.x (Vista, Server 2008, 7)

In the Windows Start menu, go to Control panel, Administrative Tools, Internet Information Services (IIS) Manager. On the connections tree, expand Sites and Default web site and select mascot. In the central pane, double click the CGI properties icon. The CGI time-out will be displayed and can be edited. If you make changes, choose Apply in the Action pane.

[Screenshot: Internet Information Services (IIS) Manager, CGI settings for the mascot application under Default Web Site, showing the Time-out (hh:mm:ss), Use New Console For Each Invocation and Impersonate User properties.]
133. cot is running on a Windows system.

3. AA / NA: AA for an amino acid (protein) database and NA for a nucleic acid (DNA) database.

4. Obsolete: This parameter used to contain the approximate number of entries (sequences) in the database, used for progress reports during a search. The value is now just a place holder.

5. Obsolete: This parameter used to contain a unique identification number. The value is now just a place holder.

6. Mem map: Flag to indicate whether the database file should be memory mapped (1) or not (0). Database files should always be memory mapped. Unlike memory locking, this does not consume physical RAM.

7. Obsolete: This parameter (Blocks) must always be set to 1.

8. Threads: A Mascot search can use multiple threads. If you are running in cluster mode, "Threads" is ignored. Otherwise, set to -1 to allow the number of threads to be determined automatically. To specify a fixed number of threads in non-cluster mode, set a value of 1 or more.

9. Mem lock: Flag to indicate whether a memory mapped database file should be locked in memory (1) or not (0). This setting is only relevant if column 6 contains a 1.

Memory mapped files can be locked in memory, but only if the computer has sufficient RAM. Having a database locked in memory means that it can never be swapped out to disk, ensuring there will never be a lag if the database files have to be read from disk. Of course, there also needs to be sufficient RAM for t
134. [Screenshot: Mascot Security, edit group "Guests" page, logged in as Administrator. It shows the group's unique ID and name, the users in and not in the group, the list of tasks that cannot be performed by group members with an Add task button, the list of tasks that can be performed with a Remove button, and the Save changes and Cancel buttons]

Each group has a unique ID that cannot be changed.

Users can be added to or removed from groups either on this screen or from the edit/add user screens.

Mascot security is fine-grained. There is a list of about 20 tasks that members of a group can (or cannot) perform. The tasks that are not permitted are in the top list. To allow group members to perform one of these tasks, click on the task in the list, and then "Add task". This ta
135. [Screenshot: Windows Local Security Settings, Security Options, listing the Microsoft network server, Network access and Network security policies with their current settings]
136. d

It is advisable to ensure that the latest service pack has been installed. Check the following URL for current information:

http://www.microsoft.com/windowsxp/downloads/default.mspx

The Microsoft web server for 32-bit editions of Windows XP is IIS 5.1, which is provided as part of the standard distribution. If IIS is not installed, choose "Add or Remove Programs" in the Control Panel. Select "Add/Remove Windows Components" and check the box for Internet Information Services. XP Professional x64 Edition uses IIS 6.0, the same as Server 2003.

Server 2003

Mascot will run under all editions of Windows Server 2003 except those for the Itanium processor.

It is advisable to ensure that the latest service pack has been installed. Check the following URL for current information:

http://technet.microsoft.com/en-us/windowsserver/bb405947

The Microsoft web server for Server 2003 is IIS 6.0. You may need to install IIS by configuring the server as an Application Server. When you start your server, you should see a "Manage Your Server" wizard. If not, go to Administrative Tools and click on "Manage Your Server". When the wizard opens, click on "Add or Remove a Role", then select Application Server.

[Screenshot: Configure Your Server Wizard, Server Role page]
137. d to add a modification that was confidential or experimental. Otherwise, it is better to add a new modification to the public Unimod database (www.unimod.org) and later download an updated configuration file from the Unimod help page. By going this route, you share the new modification with others, and benefit in turn from other people's updates.

Most of the pages of the Configuration Editor are self-explanatory. Where necessary, help text is displayed when the mouse rolls over a hyperlink.

[Screenshot: Mascot Configuration, Modifications page, listing each modification's title, monoisotopic and average delta mass, and composition, with Copy, Delete and Print links]
138. dat, and to enable someone without prior knowledge of regular expressions to write simple rules for new databases. Only the most basic aspects of BRE notation are touched on.

In mascot.dat, the PARSE section contains a number of rules. For each rule, the pattern in double quotes is a BRE which is used to identify a string so that it can be parsed from the surrounding text. For example:

    # Report text from NCBI excluding sequence (used for AA entries)
    RULE_10 "\(LOCUS .*\)ORIGIN "

The part of the BRE between the backslashed parentheses, \( and \), is the string which we are trying to locate and extract. This rule looks for the word LOCUS followed by a space. It will extract all the text, including the word LOCUS, up to but excluding the word ORIGIN followed by a space.

BRE Rules

The rules for performing this match are as follows:

The BRE always looks for the longest, leftmost matching string.

Matching is case sensitive.

Newline characters (LF in Unix or CR LF in Windows) are treated like any other character.

The sub-expression to be extracted from the surrounding text is defined using backslashed parentheses, \( \). The parentheses are ignored for matching purposes.

Some characters are "Special":

. [ \ The period, left bracket and backslash are special except when used in a bracket expression.

* The asterisk is special except when used in a bracket expression, as the first character of an en
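As a rough illustration of the extraction behaviour described for RULE_10, the sketch below uses Python's `re` module rather than a true BRE engine. It assumes the rule reads `\(LOCUS .*\)ORIGIN `, which is inferred from the description above; the function name and the sample entry are hypothetical. BRE backslashed parentheses correspond to plain parentheses in Python, and `re.DOTALL` is used so that newlines are treated like any other character.

```python
import re

# Python counterpart of the BRE "\(LOCUS .*\)ORIGIN " (pattern assumed from the text).
# In Python, grouping uses plain parentheses instead of backslashed ones.
RULE_10 = re.compile(r"(LOCUS .*)ORIGIN ", re.DOTALL)


def extract_report_text(entry):
    """Return the text from LOCUS up to, but excluding, 'ORIGIN ', or None."""
    match = RULE_10.search(entry)
    return match.group(1) if match else None


# Hypothetical entry, for illustration only.
sample = "LOCUS AAA123 100 aa\nDEFINITION test protein.\nORIGIN \n 1 mkvl\n//"
print(repr(extract_report_text(sample)))
```

Matching is case sensitive here as well, so an entry containing only lower case "locus" would not match.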
139. de>
<msgt:node level="9">Saccharomycetales</msgt:node>
<msgt:node level="8">Saccharomycetes</msgt:node>
<msgt:node level="7">Saccharomycotina</msgt:node>
<msgt:node level="6">Ascomycota</msgt:node>
<msgt:node level="5">Dikarya</msgt:node>
<msgt:node level="4">Fungi</msgt:node>
<msgt:node level="3">Fungi/Metazoa group</msgt:node>
<msgt:node level="2">Eukaryota</msgt:node>
<msgt:node level="1">cellular organisms</msgt:node>
</msgt:tree>
</msgt:taxonomy>
</msgt:accession>
</msgt:all_accessions>
</msgt:db_entry>
<msgt:tax_from_id>
<msgt:taxonomy>
<msgt:db>SwissProt</msgt:db>
<msgt:taxonomy_id>1061</msgt:taxonomy_id>
<msgt:scientific_name>Rhodobacter capsulatus</msgt:scientific_name>
<msgt:translation_table_id>11</msgt:translation_table_id>
</msgt:taxonomy>
</msgt:tax_from_id>
</msgt:results>
</msgt:ms_gettaxonomy_out>

The way information is represented in the XML output will be clearer if a few rules are kept in mind:

• msgt:title element will only appear in the output if showTitle=true.

• msgt:common_names element will only appear in the output if showSynonyms=true.

• msgt:tree element will only appear in the output if showTaxTree=true.

• order of e
140. display fragment ion masses in reports, can be altered by changing this value.

IteratePMFIntensities 1

Set this option to 0 to prevent selection of PMF values on the basis of their intensity.

LabelAll 0

Set this option to 1 to make the initial display in Peptide View one in which all peaks that match a calculated mass value are labelled.

LastQueryAscFile ../logs/lastquery.asc
SaveEveryLastQueryAsc 1
SaveLastQueryAsc 0

SaveLastQueryAsc is a flag which controls whether the most recent input file to Mascot, i.e. the MIME format file containing MS data and search parameters, should be saved to disk (1) or not (0). This can be a useful debugging tool when writing scripts or forms to submit searches to Mascot. If SaveLastQueryAsc is set to 1, the name of the file is determined by LastQueryAscFile. Each new search overwrites this file. NB: LastQueryAscFile is a disk path, not a URL.

An additional debugging tool is provided by SaveEveryLastQueryAsc. If set to 1, the Mascot input file will be saved for any search that fails to complete because it generates a fatal error. The name of the output file follows the same naming convention as a normal Mascot result file, except for the additional suffix .inp. If a search goes to completion, this file is deleted as soon as the normal output file has been written to disk.

LogoImageFile ../images/88x31_logo_white.gif

This is the URL of the Matrix Science logo
141. e  Default is 1     102 Mascot  Installation and Setup    TargetFDRPercentages 0 1  0 2  0 5  1   2  5    TaxBrowserURL    Choices available for the FDR drop down list in the Protein Family  Summary report of an auto decoy search  Each item in the list is a  percentage  The   symbol specifies the default setting of the control  1   in this case      No default   The URL used in reports to retrieve taxonomy information  for a Protein View report  By default  this points to the NCBI  If you  don   t want to send such queries out to the internet  the URL can be  replaced by a call to the ms gettaxonomy exe utility     TaxBrowserUrl    x cgi ms   gettaxonomy exe 4  DATABASE   ACCESSION     TestDirectory see ErrorLogFile    UniqueJobStartNumber see ErrorLogFile    UnixDirPerm 777    Specify the Linux permissions for the    daily    result file directories  For  example  775 makes each directory world readable but not writeable   This option provides more fine grained control than UnixWebUserGroup    UnixWebUserGroup    This entry  if present  will restrict access to the files created by  ms monitor exe  and hence improve system security  The  UnixWebUserGroup is the number of the group used by the web server  to run CGI programs  With Apache  the group name will generally be  nobody  and you will need to ascertain the group number from the group  file  For other Web servers  check the documentation that comes with the  server to find out which user name is used for running CGI pro
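The UnixDirPerm value is a standard octal permission mode. As a small, hedged illustration of what a given value grants (not part of Mascot itself), the Python sketch below renders a mode the way `ls -l` would show it for a directory and reports whether "other" users can write. The function names are hypothetical.

```python
import stat


def describe_dir_perm(octal_text):
    """Render an octal mode such as '775' as an ls-style string for a directory."""
    mode = int(octal_text, 8)
    return stat.filemode(stat.S_IFDIR | mode)


def world_writable(octal_text):
    """True if users outside the owner and group are allowed to write."""
    return bool(int(octal_text, 8) & stat.S_IWOTH)


for value in ("777", "775"):
    print(value, describe_dir_perm(value), "world writable:", world_writable(value))
```

With 775, the last triplet is `r-x`, which matches the statement that the directory is world readable but not writeable.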
142. e db_name is the database family name from mascot.dat, e.g. MSDB; fasta is the fully qualified path to the FASTA file.

ms-compress.exe compresses the FASTA file using the rules specified in mascot.dat and must be run so that its current working directory is mascot/bin.

Return value of 0 for success, > 0 for failure.

MakeSearchLog

The bin/ms-makesearchlog.exe utility is used to rebuild the search log by scanning all the result files located in sub-directories of mascot/data. This can be useful if the original search log has been damaged or if result files have been pruned after archiving. There are no arguments.

LockMem

On 32-bit platforms, the 2 GB address space limit can quickly be reached by having several large databases locked into memory. To work around this limit, the bin/ms-lockmem.exe utility is provided.

LockMem is enabled by adding the parameter "SeparateLockMem 1" to the options section of mascot.dat. Specifying a value greater than 1 specifies the block size in MB. For example, if there is a 1.5 GB *.s00 file, and this parameter is set to 750, two instances of ms-lockmem.exe will be run.

GetError

The utility cgi/ms-geterror.exe takes an error number as an argument and returns the corresponding text string. For example:

    C:\Inetpub\MASCOT\cgi>ms-geterror.exe 23
    You specified enzyme %s which is not available. Choose another.

I/O File Formats

MIME Ver
143. e library". The former contains code derived from the library, whereas the latter must be combined with the library in order to run.

0. This License Agreement applies to any software library or other program which contains a notice placed by the copyright holder or other authorized party saying it may be distributed under the terms of this Lesser General Public License (also called "this License"). Each licensee is addressed as "you".

A "library" means a collection of software functions and/or data prepared so as to be conveniently linked with application programs (which use some of those functions and data) to form executables.

The "Library", below, refers to any such software library or work which has been distributed under these terms. A "work based on the Library" means either the Library or any derivative work under copyright law: that is to say, a work containing the Library or a portion of it, either verbatim or with modifications and/or translated straightforwardly into another language. (Hereinafter, translation is included without limitation in the term "modification".)

"Source code" for a work means the preferred form of the work for making modifications to it. For a library, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the library.

Activities other than copying, d
144. e Perl or web server software.

You cannot mix x86 and x64 nodes in a Mascot cluster. All must be 32-bit or all must be 64-bit.

The master detects that search nodes are responding by issuing an echo command to TCP on port 7 under Linux and ICMP echo under Windows. Hence, these services must not be disabled or blocked by firewalls.

Overview of Implementation

Each search is distributed to all the cluster nodes, but each node searches just an allocated portion of the sequence database. Search results are returned to the master, which merges them, writes the result file to disk, and optionally generates HTML reports to be returned to a client web browser.

All master-node communication is via TCP/IP.

Configuration and program files are distributed and updated automatically from the master node.

Mascot administration tools provide web browser based system status reports. These are continuously updated and show at a glance important parameters such as processor usage and free disk space for each of the nodes. As an option, critical alerts can also be sent to the system administrator by email.

In cluster mode, Mascot is intended to run as a dedicated system. Trying to run other applications on the cluster simultaneously may have unpredictable effects on search speed.

Installation of Mascot

It is only necessary to install or upgrade Mascot on the master system. In fact, no files are copied to the Mascot nodes during installation. The
145. ePerl 5 14 bin perl  usr local   bin perl

Mascot Directory Structure

There are two directory structures to consider. One consists of the "real" paths to files on disk; the other consists of the "virtual" directories which define the web server URLs. The virtual directories are mapped to real directories. For example, the server URL

    http://your.domain/mascot/home.html

might be mapped to the disk file

    /usr/local/mascot/html/home.html

Any virtual directory that contains CGI executable programs (e.g. nph-mascot.exe) or scripts (e.g. master_results.pl) must have script execution enabled.

Under normal circumstances, if a directory is mapped to a URL, all of its subdirectories are also accessible as subdirectories of the URL. Figure 2.1 shows the recommended directory structure for Mascot. The root of this structure can be any convenient path.

Some of the directory paths can be changed by using a symbolic link or by modifying the configuration file, mascot.dat. For example, it may be desirable to have the sequence or data directories on a separate drive from the rest of the files. Care should be taken with any changes which affect a URL mapped directory or file, because this may require one or more HTML files to be edited to modify links.

In most cases, the contents of the directories can be deduced from their names:

bin contains (non-CGI) executables

cgi contains CGI executables

cluster contains a sub-directory for platfor
146. eSep is set to 1, the ASCII character code for CTRL A.

There are two SpeciesFiles (gi_taxid_prot.dmp and names.dmp) and two NodesFiles (nodes.dmp and merged.dmp). All four files must be present and up to date.

The method of finding the species is particularly simple for NCBI databases. The default rule is applied to the FASTA title line to extract the "gi" number from the accession string. The species file gi_taxid_prot.dmp is a look-up table that returns an NCBI taxonomy ID number.

For example, the FASTA title line:

    >gi|2147497|pir||S73153 hypothetical protein 10 - red alga (Porphyra purpurea) chloroplast

returns a gi number 2147497. Looking this number up in gi_taxid_prot.dmp returns a taxonomy ID of 2787. Looking this number up in names.dmp returns the string:

    2787 | Porphyra purpurea | | scientific name |

SwissProt

The rules for SwissProt are fairly simple. Block 3 should always be used for SwissProt and Trembl, even if you have a local full text (.dat) file.

    # TAXONOMY FOR SwissProt or Trembl from the fasta file
    Taxonomy_3
    Identifier SwissProt FASTA
    Enabled 1 # 0 to disable it
    FromRefFile 0
    DescriptionLineSep 0 # ctrl a (hex code = 1). For multiple descriptions per entry
    SpeciesFiles NCBI:names.dmp, SWISSPROT:speclist.txt
    NodesFiles NCBI:nodes.dmp, NCBI:merged.dmp
    DefaultRule SWISSPROT  CHOP      gt   _                    Anything  after _ before space   SWISSPRO
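The two-step NCBI look-up described above (gi number to taxonomy ID, then taxonomy ID to scientific name) can be sketched in Python as follows. This is only an illustration of the idea, not Mascot's implementation: the two dictionaries stand in for gi_taxid_prot.dmp and names.dmp, they contain just the single entry quoted in the text, and the function names are hypothetical.

```python
# In-memory stand-ins for the look-up files (one entry each, taken from the example).
GI_TO_TAXID = {2147497: 2787}                # role of gi_taxid_prot.dmp
TAXID_TO_NAME = {2787: "Porphyra purpurea"}  # role of names.dmp (scientific names)


def gi_from_title(title):
    """Extract the gi number from a title line of the form '>gi|NUMBER|...'."""
    fields = title.split("|")
    if len(fields) < 2 or fields[0] != ">gi" or not fields[1].isdigit():
        return None
    return int(fields[1])


def species_from_title(title):
    """Follow gi -> taxonomy ID -> scientific name; None if any step fails."""
    gi = gi_from_title(title)
    taxid = GI_TO_TAXID.get(gi)
    return TAXID_TO_NAME.get(taxid)


title = ">gi|2147497|pir||S73153 hypothetical protein"
print(gi_from_title(title), species_from_title(title))
```

A title that lacks a gi number, or a gi number missing from the look-up table, simply yields None in this sketch.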
147. ected from a dropdown  list in the report  Each column arrangement is of the form     Name  columns     where Name is the column arrangement name  e g   Standard  and  columns  is a comma separated list of column names  as  used by Report Builder  The following is the standard list of column  names  available in every report     family    98 Mascot  Installation and Setup    member   db   acc   score   mass  matches  matches sig  sequences  sequences sig  empai  frame   desc    Frame will not be shown in the report if the search is against a  proteindatabase  Quantitation methods add additional column names   but these are generated from the quantitation ratio names  The easiest  way to create a column arrangement is to arrange the columns in Report  Builder  then    export    the arrangement as a string     ReportNumberChoices 5 10 20 30 40 50    If present  this list will define the choices provided in the search form     Report top    drop down list     RequireBoldRed 0    If this flag is set to 1  only protein matches which have one or more    bold  red    peptide matches will be listed in a peptide summary report  That is    proteins that include at least one top ranking peptide match that has not  already appeared in the report  This global default can be overridden on   an individual report URL by appending  amp _requireboldred X  where X is  0 or 1     ResfileCache master results pl  master _ results 2 pl  peptide view pl   protein view pl  export _dat pl  export _dat
148. ection. After saving the changes, monitor progress in the Mascot database status page as the new database is brought on line.

[Screenshot: Mascot search status page, showing the NCBInr and SwissProt databases, each with Status "In use", memory mapped, not memory locked, and no searches running]

If all is well, the status line for
149. ed by problems that might be introduced by others.

Finally, software patents pose a constant threat to the existence of any free program. We wish to make sure that a company cannot effectively restrict the users of a free program by obtaining a restrictive license from a patent holder. Therefore, we insist that any patent license obtained for a version of the library must be consistent with the full freedom of use specified in this license.

Most GNU software, including some libraries, is covered by the ordinary GNU General Public License. This license, the GNU Lesser General Public License, applies to certain designated libraries, and is quite different from the ordinary General Public License. We use this license for certain libraries in order to permit linking those libraries into non-free programs.

When a program is linked with a library, whether statically or using a shared library, the combination of the two is legally speaking a combined work, a derivative of the original library. The ordinary General Public License therefore permits such linking only if the entire combination fits its criteria of freedom. The Lesser General Public License permits more lax criteria for linking other code with the library.

We call this license the Lesser General Public License because it does Less to protect the user's freedom than the ordinary General Public License. It also provides other free software developers Less of an advantage over competing non-free p
150. ed by the University of California, Berkeley and its contributors.

4. Neither the name of the University nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

COPYRIGHT 8.1 (Berkeley) 3/16/94

PLPA

Copyright (c) 2004-2006 The Trustees of Indiana University and Indiana University Research and Technology Corporation. All rights reserved.

Copyright (c) 2004-2005 The Regents of the University of California. All rights reserved.

Copyright (c) 2007 Cisco Systems, Inc. All rights reserved.

Portions copyright:

Co
151. ed on one line separated by commas.

RULES contains a list of the rule numbers that define the instrument type in the configuration file fragmentation_rules. The rule numbers are listed explicitly because the contents of the configuration file may have changed since the search was run.

Masses

    --gc0p4Jq0M2Yt08jU534c0p
    Content-Type: application/x-Mascot; name="masses"

    A=71.037110
    B=114.534940
    C=160.030649
    D=115.026940
    E=129.042590
    F=147.068410
    G=57.021460
    H=137.058910
    I=113.084060
    J=0.000000
    K=128.094960
    L=113.084060
    M=131.040480
    N=114.042930
    O=0.000000
    P=97.052760
    Q=128.058580
    R=156.101110
    S=87.032030
    T=101.047680
    U=150.953630
    V=99.068410
    W=186.079310
    X=111.000000
    Y=163.063330
    Z=128.550590
    Hydrogen=1.007825
    Carbon=12.000000
    Nitrogen=14.003074
    Oxygen=15.994915
    Electron=0.000549
    C_term=17.002740
    N_term=1.007825
    delta1=15.994919,Oxidation (M)
    NeutralLoss1=0.000000
    FixedMod1=57.021469,Carbamidomethyl (C)
    FixedModResidues1=C
    --gc0p4Jq0M2Yt08jU534c0p

This block contains "actual" mass values. That is, average or monoisotopic residue masses, including any fixed modifications. C and N terminus groups also include any fixed modifications.

FixedMod1, FixedMod2, etc., records the delta mass and name for each fixed modification as comma separated values. FixedModResidues1 gives the site specificity. If multiple residues are affected, they are listed as a
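To make the layout of the masses block concrete, here is a minimal Python sketch that reads such a block into a dictionary. It assumes each record has the form `name=value`, optionally followed by a comma and a label (as in the delta and FixedMod records); the `=` separator is an assumption about the file format, and the function name is hypothetical. Records without a numeric value, such as FixedModResidues1, are skipped.

```python
def parse_masses(lines):
    """Map each record name to its numeric mass, ignoring non-numeric records."""
    masses = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, MIME boundary lines and anything without a separator.
        if not line or line.startswith("--") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        number = value.split(",")[0]  # drop a trailing label such as "Oxidation (M)"
        try:
            masses[key] = float(number)
        except ValueError:
            continue  # e.g. FixedModResidues1=C or the Content-Type header
    return masses


block = """--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="masses"

A=71.037110
C=160.030649
delta1=15.994919,Oxidation (M)
FixedMod1=57.021469,Carbamidomethyl (C)
FixedModResidues1=C
--gc0p4Jq0M2Yt08jU534c0p""".splitlines()

print(parse_masses(block))
```

Because the values already include fixed modifications, the entry for C in this example is the cysteine residue mass plus the Carbamidomethyl delta.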
152. ed separately in a number of files that are then memory mapped. The description line(s) are not memory mapped, only an index to the description in the original database. Compared with mapping the original FASTA database, this can reduce memory requirements by more than 30%. Furthermore, the savings for a nucleic acid database are even greater because the files are compressed with a 2:1 ratio.

For troubleshooting purposes, Monitor can be started from a command or shell prompt with the argument DEBUG. Under Windows, ms-monitor.exe must not be started from the command line if it is already running as a service.

Status and error messages from Monitor can be viewed from a web browser using the Mascot Status application, described below.

GetSeq is a utility for retrieving the sequence, title, or full text of an entry in a database configured for use by Mascot. The utility can be used to retrieve information for a single entry, or in batch mode.

Single entry mode

The executable, x-cgi/ms-getseq.exe, accepts the following command line parameters:

1. The name of the database, e.g. NCBInr. This argument is required.

2. An accession string, e.g. 100K_RAT. This argument is required.

3. One of five keywords: seq, all, len, title or pI. This argument is required, and is explained further below.

4. (Nucleic acid databases only) Frame number between 1 and 6 to retrieve a sequence translated into protein, or 0 for the original nucleic acid sequen
153. ed the choice of which web site to install under. However, there are a number of issues to be aware of if the "Default Web Site" is not used.

Perl installation

On installation, ActiveState Perl only sets up mappings for the default web site. To add mappings for another web site, right click on the new web site, choose Properties, and then click on the "Home Directory" tab. Then, click on the Configuration button. If there is no entry in the list for the .pl extension, click on "Add" and enter the following information:

[Screenshot: Add/Edit Application Extension Mapping dialog. Executable: C:\Perl\bin\Perl.ex (field cut off). Extension: .pl. Method exclusions: PUT, DELETE. Check boxes: Script engine, Check that file exists]

Host name

Mascot will be installed in the correct virtual directory, but the host (base) name may be wrong: the installation program has chosen the computer name.

If you have set up multiple web sites, then it is probable that you have created a DNS entry that is the same as the "Host header name". In this case, replace the computer name with the "Host header name".

Refer to the IIS documentation for details on setting up multiple web sites using the same IP and port address. Briefly, you will need to add a new web site, and then click on the "Web site" tab of the properties box. Next, click on the Advanced button, and enter a host name.
154. ed to logs/errorlog.txt. This may be the only place to find a fatal error message resulting from a major configuration problem.

Examples of typical error messages are shown below. A comprehensive list of all Mascot error messages can be found in the file errors.html, in the root directory of the Mascot CD-ROM.

    Error  M00088   Job 2636   X00123 file upload    Thu Mar 11 10:59:30 2009
    Invalid command mass at line 1 of your query. Line is where am I

    Error  M00034   Job 2638   X00251 modifications    Thu Mar 11 10:59:59 2009
    Modification conflict. Both Carbamidomethyl (C) and Carboxymethyl (C) modify the same residue

    Error  M00133   Job 2639   X00938 www    Thu Mar 11 11:00:21 2009
    Peptide mass of 1234 is too small. The minimum mass allowed is 30

Searches Log

Every Mascot search is listed in logs/searches.log. The Mascot Review utility provides a web browser interface to this file, displaying filtered and sorted listings of searches. Mascot Review is described in Chapter 7.

Alternatively, the file can be opened in a spreadsheet program. The file consists of 14 columns, delimited by tabs. Row 1 contains column titles. An example of a single entry is shown below:

    2633 \t 185 \t NCBInr \t JSC \t JSC@gmail.com \t it \t ../data/20090311/F002633.dat \t Thu Mar 11 09:10:36 2009 \t 17 \t User read res \t 1 \t PMF \t Yes \t 192.168.42.4

Tabs indicated by \t
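Since the searches log is described as tab-delimited, with 14 columns and column titles in row 1, it can also be read with a few lines of script instead of a spreadsheet. The Python sketch below is one way to do this under those stated assumptions. The function name is hypothetical, and the column titles in the sample are placeholders (c1 to c14), because the real titles are not listed in this section.

```python
import csv
import io

EXPECTED_COLUMNS = 14  # column count stated in the manual


def read_search_log(stream):
    """Return one dict per search, keyed by the column titles found in row 1."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    titles = next(reader)
    entries = []
    for row in reader:
        if len(row) != EXPECTED_COLUMNS:
            continue  # skip blank or malformed lines
        entries.append(dict(zip(titles, row)))
    return entries


# Placeholder titles c1..c14; the values are those of the example entry above.
header = "\t".join(f"c{i}" for i in range(1, 15))
entry = "\t".join([
    "2633", "185", "NCBInr", "JSC", "JSC@gmail.com", "it",
    "../data/20090311/F002633.dat", "Thu Mar 11 09:10:36 2009",
    "17", "User read res", "1", "PMF", "Yes", "192.168.42.4",
])
log = read_search_log(io.StringIO(header + "\n" + entry + "\n"))
print(log[0]["c3"], log[0]["c12"])
```

For a real log, the function would be called on an open file handle, for example `open("logs/searches.log", newline="")`, with the path adjusted to the local installation.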
155. eference file, this value is ignored and can be set to 0.

14. Taxonomy: Index of the taxonomy rule block to be used to parse taxonomy information. If taxonomy information is not available, or is not to be used, this value should be set to 0.

PARSE

Do not modify this section if you ever use Database Manager.

The PARSE section contains Basic Regular Expressions used to extract strings from various files.

PARSE
# For NCBI accession e.g.
RULE_6 ">\(gi|[0-9]*\)"
# For NCBI description - everything after the first space
RULE_7 ">[^ ]* \(.*\)"
end

The syntax of a standard Basic Regular Expression (BRE) is described in Appendix A. Rules defined in this section are referred to by means of their index number in two sections: Databases and WWW.

RULE_6, for example, looks for the ">" at the beginning of the title line. The string to be extracted is in backslashed parentheses: "gi|", then as many digits as possible. The match will stop when a non-digit is encountered, such as a pipe symbol or a space.

If you are not familiar with regular expressions, use the information in Appendix A to understand how the pre-defined rules in mascot.dat work.

A mistake in a rule called from the databases section may prevent Mascot from using the database concerned. Always use the Database Manager to configure and test new database definitions before they are brought on line.

WWW

Do not modify
156. [Screenshot: Windows Firewall list of allowed programs and features, including Windows Firewall Remote Management, Windows Management Instrumentation (WMI), Windows Media Player, Windows Peer to Peer Collaboration Foundation and Windows Remote Management.]

If you installed Apache instead of IIS, there may be no entry for HTTP. Choose "Windows firewall with advanced security", then "Incoming rules".

[Screenshot: Windows Firewall with Advanced Security, Inbound Rules list, showing the Apache HTTP Server rules and the New Rule action.]
157. eins jobid  873  gt    lt msgs protein gt    lt msgs accession gt RL19 YEAST lt  msgs accession gt    lt msgs db gt SwissProt lt  msgs db gt    lt msgs prot_title gt  amp gt sp P05735 RL19 YEAST 60S ribosomal protein  L19 OS Saccharomyces cerevisiae GN RPL19A PE 1 SV   5 lt  msgs prot_title gt    lt msgs prot_len gt 189 lt  msgs prot_len gt    lt msgs prot_ pi gt 11 35 lt  msgs prot_pi gt    lt msgs prot_ sequence gt MANLRT     ALLKEDA lt  msgs prot_sequence gt    lt  msgs protein gt    lt msgs protein gt    lt msgs accession gt G3P2 YEAST lt  msgs accession gt      lt  msgs protein gt     Chapter 7  Program Reference 119     lt msgs protein gt    lt msgs accession gt TRY1_ BOVIN lt  msgs accession gt      lt  msgs protein gt    lt  msgs all_proteins gt    lt  msgs ms_getseq out gt     Error messages    All errors have unique codes and are logged to both the XML output and  the Mascot error log   but only the first 10 instances of any particular  error number   The XML output contains a full set of error messages in a  structured format that can be processed automatically     Fatal Errors  no database entry is going be retrieved   403    Error while reading mascot dat     Parameters     errstring     error message as generated by ms parser    463    db    parameter is missing      464    accession    parameter is missing      440    Invalid session or session ID     Parameters     errstring     error message as returned by security objects  443    Not allowed to search the da
158. em context or in the context of the requesting user.

[Screenshot: IIS Manager status bar showing Configuration: 'localhost' applicationHost.config, <location path="Default Web Site/mascot">]

If you have configured IIS 7 with multiple web sites, and the Mascot server is not installed in the default web site, you will need to browse to the appropriate location. You can also inspect the CGI timeout at other connection nodes, in case a different timeout has been set manually at the cgi node or even at the level of individual files (inadvisable).

Apache

Apache is a very rugged and popular server for Unix platforms. It is a less obvious choice for Windows, since the Mascot installation program will configure Microsoft IIS automatically.

Important: When using Apache, in the Options section of mascot.dat, ensure that ForkForUnixApache is set to 1.

If the URL /mascot is mapped to disk path mascot/html, then URL /mascot/images will correspond to disk path mascot/html/images. So, it is important that the entries for the cgi and x-cgi directories come before that for the html directory. Otherwise, the server will report that it cannot find the cgi and x-cgi paths, because it has assumed from the URL that they are sub-directories of mascot/html.

If the web browser connection breaks when submitting a large search or viewing a large result report, add or increase the Timeout directive in the configuration file. Remember to r
159. en on an individual report    94 Mascot  Installation and Setup    URL by appending  amp _server_mudpit_switch X  where X is the ratio  between the number of queries and the number of database entries    after any taxonomy filter         NoResultsScript    cgi master_results pl  ProteinFamilySwitch 300   ResultsFullURL    URL    cgi master_ results pl  ResultsFullURL_ 2    URLH   cgi master_ results 2 pl  ResultsPerlScript    cgi master results pl  ResultsPerlScript 2    cgi master_results 2 pl    These are URL   s  not disk paths  for the scripts to be called by the search  engine at the completion of a search  A successful search calls  ResultsPerlScript if the number of queries is less than  ProteinFamilySwitch otherwise ResultsPerlScript 2  A search  that didn   t find any hits calls NoResultsScript     The ResultsFullURL and ResultsFullURL 2 are used when a link to  the search results is emailed to a user  Since the email will probably be  received on another system  the link needs to have the full URL includ   ing the Web server hostname     URL    1s replaced by the server URL  during installation    NTIUserGroup Users  NTMonitorGroup Administrators    Under Windows  the Mascot service is generally run using the    Local  System    account  It has to create  write and read the memory mapped  files  The CGI scripts  such as nph mascot  exe  are run by the Web  server  and will be run using a different user name with different  permissions from the service  These program
160. ended to  apply and the section as a whole is intended to apply in other  circumstances     It is not the purpose of this section to induce you to infringe any  patents or other property right claims or to contest validity of any  such claims  this section has the sole purpose of protecting the  integrity of the free software distribution system  which is  implemented by public license practices  Many people have made  generous contributions to the wide range of software distributed  through that system in reliance on consistent application of that  system  it is up to the author donor to decide if he or she is willing   to distribute software through any other system and a licensee cannot  impose that choice     This section is intended to make thoroughly clear what is believed to  be a consequence of the rest of this License     8  If the distribution and or use of the Program is restricted in  certain countries either by patents or by copyrighted interfaces  the  original copyright holder who places the Program under this License  may add an explicit geographical distribution limitation excluding  those countries  so that distribution is permitted only in or among  countries not thus excluded  In such case  this License incorporates  the limitation as if written in the body of this License     9  The Free Software Foundation may publish revised and or new versions    xii Mascot  Installation and Setup    of the General Public License from time to time  Such new versions wi
161. ent    You have read and understand this agreement and agree that it constitutes the  complete and exclusive statement of the agreement between us with respect to the  subject matter hereof and supersedes all proposals  representations   understandings and prior agreements  whether oral or written  and all other com   munications between us relating thereto     Assignment    This agreement is personal to you  either an individual or a single corporate entity   and you may not assign  transfer  sub contract or otherwise part with this agree   ment or any right or obligation under it without our prior written consent     Law and Disputes    This agreement and all matters arising from it are governed by and construed in  accordance with the laws of England and Wales  whose Courts shall have exclusive  jurisdiction over all disputes arising in connection with this agreement     If you have any questions about this agreement  write to us at Matrix Science Ltd    64 Baker Street  London W1U 7GB  UK or call us at  44  0 20 7486 1050 or email  us at info matrixscience com    End User Licence Agreements Vv    Xerces    ie    The Apache Software License  Version 1 1           Copyright  c  1999 2001 The Apache Software Foundation  All rights    reserved      Redistribution and use in source and binary forms  with or without     modification  are permitted provided that the following conditions     are met      1  Redistributions of source code must retain the above copyright     notice
162. equence query as being N terminal or C terminal series.

The first field in a tagn value is t for a standard sequence tag and e for an error tolerant sequence tag.

Some search parameters can be defined in the local scope of a query. These are CHARGE, COMP, INSTRUMENT, IT_MODS, TOL, TOLU. Any that are used are listed here. If the MGF file contained scan range information in terms of seconds or scans, this is written to RTINSECONDS and/or SCANS.

Index

--gc0p4Jq0M2Yt084jU534c0p
Content-Type: application/x-Mascot; name="index"

parameters=4
masses=78
unimod=116
enzyme=322
taxonomy=329
header=336
summary=351
et_summary=6059
peptides=6473
et_peptides=7143
proteins=7292
query1=7362
query2=7374
...
query81=8322
query82=8334
--gc0p4Jq0M2Yt084jU534c0p--

Values in index are the line number offsets of the section "Content-Type:" lines, starting from 0 for the first line of the file.

Taxonomy

Mascot supports the use of a taxonomy filter to limit the database entries to be searched. This is useful because it speeds up the search, and can reduce the proteins in the results list to those expected in the sample being analysed.

Some databases record taxonomy in a manner that makes it difficult to extract the information reliably. The major problems are:

1. The location of the text containing the species identifier is mostly not defined, and can even vary within one database.

2
163. equired. In the case of a nucleic acid database, the limited character set allows Mascot to pack two base codes into each byte of memory. If a taxonomy filter is required, a taxonomy index is built at the same time as the file is compressed.

Naming Conventions and Directory Structure

Although Microsoft Windows permits file and directory names to include spaces, file and directory names to be used by Mascot, or to appear in a URL, cannot include spaces.

By following some simple conventions in database naming, Mascot Monitor enables sequence databases to be automatically updated without any disruption to on-going searches.

The procedure followed by Monitor is that the new database is compressed and tested by running a standard search. If errors are detected in the new database, the database exchange process is abandoned. Assuming the test is successful, all new searches are performed against the new database, while searches that were in progress against the old database are allowed to continue. Once the final search against the old database is complete, the disk file is moved into an archive directory. If the database being exchanged is memory mapped, the mapping and un-mapping are also handled automatically.

Assuming that the new database will be updated periodically, a directory structure similar to the one created for SwissProt during installation is recommended. For example:
164. er each protein match in a peptide summary report, matches to proteins that contain a sub-set of the same peptides will also be listed. This was the default behaviour in version 1.6 and earlier. If this flag is set to 0, which is now the default, the sub-set matches will not be shown. Values between 0 and 1 represent the fraction of the protein score of the primary hit that a sub-set hit can lose and still be listed. For example, if ShowSubSets is 0.2, and the primary hit has a protein score of 200, sub-set hits with scores of 160 or more will be listed.

If multiple entries contain the full set of peptides, they are all displayed, whatever the setting of this parameter. This global default can be overridden on an individual report URL by appending &_showsubsets=X, where X is 0 or 1.

SigThreshold 0.05

Significance threshold used in result reports, default 0.05. Valid range is 1 to 1E-18. This global default can be overridden on an individual report URL by appending &_sigthreshold=X, where X is the significance threshold.

SiteAnalysisMD10Prob 0.1

Used to calculate relative probabilities of modification assignments in Peptide View. It defines the factor in probability that a peptide score difference of 10 corresponds to. The default is 0.1, which means a score difference of 10 corresponds to a factor of 10 in probability. Similarly, 0.05 corresponds to a factor of 20.

SortUnassigned sc
165. er node as the user who will own the ms-monitor.exe process (generally root), and generate a version 2 RSA key pair by executing:

ssh-keygen -t rsa

2. When asked for a passphrase, press return to indicate no passphrase is required. Accept the default location for saving the key files ($HOME/.ssh).

3. The contents of the public key ($HOME/.ssh/id_rsa.pub) must be added to a file called $HOME/.ssh/authorized_keys on each of the search nodes.

4. Test communication by logging in to each search node from the master node using ssh. The first time a connection is made, confirm that the new host should be added to the list of known hosts ($HOME/.ssh/known_hosts).

Installation

Perform a standard installation of Mascot onto the master system according to the procedure in Chapter 2. Verify correct system operation as a single server by performing searches of SwissProt and familiarise yourself with administrative tools such as ms-review.exe and ms-status.exe (Chapter 7). Any problems need to be resolved before reconfiguring for cluster operation.

Cluster Configuration Procedure

1. Kill ms-monitor.exe

2. Open mascot/config/mascot.dat in a text editor. Move down to the "Cluster" section and enter configuration information for the cluster. The parameters are fully described below in the Reference section. In the databases section, verify that the threads and blocks parameters are set to 1 for all databases
166. [Screenshot: top of a Mascot result report listing protein hits, including a chaperonin homolog Hsp-60 (mitochondrial, Caenorhabditis elegans) and CH60_STEMS, 60 kDa chaperonin (Stenotrophomonas maltophilia).]

Mascot Score Histogram

Ions score is -10*Log(P), where P is the probability that the observed match is a random event. Individual ions scores > 41 indicate identity or extensive homology (p<0.05). Protein scores are derived from ions scores as a non-probabilistic basis for ranking protein hits.

[Figure: histogram of Number of Hits against Protein Score.]

Peptide Summary Report

[Screenshot: report format controls (Significance threshold p< 0.05, Max. number of hits AUTO, Standard scoring or MudPIT scoring, Ions score or expect cut-off, Show sub-sets, Show pop-ups or Suppress pop-ups, Sort unassigned, Require bold red), followed by hit 1, CH60_HUMAN, Mass: 61197, Score: 1365, 60 kDa heat shock protein, mitochondrial, Homo sapiens, with its table of peptide matches under the columns Query, Observed, Mr(expt), Mr(calc), ppm, Miss, Score, Expect, Rank, Unique, Peptide.]
167. es     Boolean values can be coded in different ways     true   TRUE   True   on   any number except 0   any string except  an empty string    false   FALSE   False   0         All missing parameters are defaulted to    false    value  Missing frame   parameter by default is equal to 0     Output format    In response to any POST request  XML format output is returned   Encoding UTF 8 is to be used for output  XML output is schema vali     118 Mascot  Installation and Setup    dated and schema versioned  All XML output must be XML escaped  using the following substitutions      gt   amp gt    lt   amp lt    amp   amp amp     6     amp apos         amp quot    Proteins are returned in the order requested  A  lt msgs frame gt  element  will only be output for an NA database     The example input file would produce output similar to this  edited for  brevity       lt  xml version  1 0  encoding  UTF 8  standalone  no      gt    lt msgs ms_getseq out xmlns msgs  http   www matrixscience com xmlns   schema msgetseq 1   majorVersion  1  minorVersion  0   xmlns xsi  http   www w3 org 2001 XMLSchema in   stance     xsi schemaLocation  http   www matrixscience com   xmlns schema msgetseq 1 msgetseq 1 xsd    gt    lt msgs all_ errors gt    lt msgs error code  461  gt    lt msgs err description gt Sequence not found lt  msgs err_ description gt    lt msgs err param name  accession     gt ERROR_YEAST lt  msgs err_param gt    lt  msgs error gt    lt  msgs all_errors gt    lt msgs all prot
168. es number of residues in DB  distribution see below   distribution decoy see below   decoy _type n  type of decoy  1  8    exec _time search time in seconds  date timestamp  seconds since Jan 1   1970   time time in hh mm ss   queries number of queries    gt   1    max hits maximum number of hits to be listed  version version information   fastafile full path to database fasta file    Chapter 8  I O File Formats 155    release filename of actual database used   e g  Owl_31 fasta  taskid unique task identifier for searches submitted asynchronously  pmf num _queries used number of mass values selected for PMF match  pmf_ queries used comma separated list of selected query numbers  Warn0    Warnl    Warn2        gc0p4JqOM2Yt084jU534c0p    The Header section contains general values  used in the master results  page header paragraph     Distribution is a comma separated list of values that represent a  histogram of the complete protein score distribution  The first value is  the number of entries with score 0  the second is the number of entries  with score 1  and so on  up to the maximum score for the search  Scores  are converted to integers by truncation  This distribution is only mean   ingful for a peptide mass fingerprint search     If intensity values are supplied for a peptide mass fingerprint  Mascot  iterates the experimental peaks to find the set that gives the best score   The number of values selected is reported in pmf_ num queries _ used  and the selected queries li
169. estart Apache after saving the change. The argument is in seconds.

Timeout 3600

Linux configuration

The following lines illustrate typical mappings and permissions for the Mascot directories:

ScriptAlias /mascot/cgi/htsearch /usr/lib/cgi-bin/htsearch
<Directory /usr/local/mascot/cgi>
AllowOverride None
Options None
Order allow,deny
Allow from all
</Directory>
ScriptAlias /mascot/cgi /usr/local/mascot/cgi

<Directory /usr/local/mascot/x-cgi>
AllowOverride None
Options None
Order allow,deny
Allow from all
</Directory>
ScriptAlias /mascot/x-cgi /usr/local/mascot/x-cgi

<Directory /usr/local/mascot/html>
AllowOverride None
Options None
Order allow,deny
Allow from all
</Directory>
Alias /mascot /usr/local/mascot/html

Windows Installation

If you choose to use Apache under Windows, a good starting point for support information is:

http://httpd.apache.org/docs/2.4/platform/windows.html

Important: If IIS is installed, stop the IIS service before installing Apache, Perl and Mascot.

Mascot 2.4 has been tested with Apache 2.2.22 on Windows 7.

After Mascot has been installed, from the Windows Start menu, choose Programs, Apache HttpServer 2.2, Configure Apache Server. Copy the customised Apache configuration settings from the httpd.conf file in the Mascot config directory and paste them at the end of the Apache httpd.conf file. Save the changes. From 
170. g   usr lib   sendmail    Set MailTempFile as the name of the file used to store email messages  until they can be sent  must be the path followed by a filename in the  form  MXXXXXX   This will create temporary files that begin with M    Chapter 6  Configuration  amp  Log Files 85    followed by a unique number  Typically this parameter will be  var tmp   MXXXXXX     Blat Configuration  Windows only     Blat is a free  easily installed mail program for Windows  For more  information  visit     http   www blat net   Set MailTransport to 3     Set the EmailUserFrom parameter to the name that is required in the     From    field of the email messages     Set EmailFromTextName as the name of the server that is running  mascot  For example setting EmailUserFrom to www and  EmailFromTextName to Mascot Server will result in emails from www   Mascot Server   The From field of the email will be  www www your_domain com     Set sendmailPath as the fully qualified path  including drive letter  for  the Blat program     Set MailTempFile as the name of the file used to store email messages  until they can be sent  must be in the form path MXXXXXX   This will  create a new temp file where the first letter will be an M and the next 6  characters will make up a unique number  Typically this parameter will  be c  temp MXXXXXX    ErrorLogFile    logs errorlog txt   GetSeqJobIdFile    data getseq job   InterFileBasePath c  inetpub mascot data  Windows    usr local mascot data  Linux    InterFi
171. g exe u 1172165637 he    Edit Enzyme  CNBr Trypsin    General    Title  CNBr Trypsin          Independent o       Semispecific Fi    Components      Sense Cleave At Restrict Delete    C Term     M   o    C Term    KR  p o    Delete    est  Protein      MSEELSQKPSSAQSLSLREGRNRFPFLSLSQREGRFFPSLSLSERDGRKFSFLSMFSFLM    PLLEVIKITISSVASVIFVGFACVTLAGSASALVYS TPYFIIFSPVLYPATIATVVLATGFTAG  GSFGATALGLIMWLVKRRMGVKPKDNPPPAGLPPNSGAGAGGAQSLIKKSKAKSKGGLK        Start End Peptide   1 1 n   2 18 SEELSQKPSS AQSLSLR  19 21 EGR   22 23 NR   24 FPFLSLS0R  kal EGR   36 FFPSLSLSER  46 DGR   49 K   50 FSFLSM   56 FSFLM   61 PLLEVIK    ITIASVASVI FYGFACVTLA GSAAALVVST PYFIIFSPVL VPATIATVV  L  ATGFTAGGSF GATALGLIN    WLVK                               ooon PUNE H    68       g Local intranet       File format  enzymes     Each cleavage agent is defined by a block of lines  Blocks are delimited  from one another by a line containing an asterisk  Each line in a block  starts with a keyword     70 Mascot  Installation and Setup    Title Trypsin  Cleavage KR  Restrict P  Cterm       Title Asp N  Cleavage  DB    Nterm  k    The first line of each block must start with the Title  keyword  fol   lowed by a text string that is used to identify the cleavage agent in forms  and reports  The definition should be short and self explanatory  It  should only include alphanumeric characters and spaces  Internal spaces  are significant     Each block must also include a line starting with the keyword Cleav   age  fol
172. gistration File    Please Note  As part of the product registration process  the following information will be transmitted to Matrix Science     e Details of any existing licence   e Machine identifiers for node locking purposes  eg  MAC address   z          A product key is required and must be registered online  The licence file  will be returned by email and must be saved to the specified location on  the Mascot server  If the Mascot server cannot connect to the Internet  a  file containing registration information can be saved and copied to a  system with Internet access for submission     The registration form allows a second email address to be specified  in  case the person installing Mascot is not the end user  Ensure that the  end user email address is entered into the upper part of the form and the    38 Mascot  Installation and Setup    email address to which the licence file should be sent is entered into the  CC email field in the lower part of the form     The licence file must be saved to the config licdb directory as a file  with the extension   lic     Verify System Operation    A copy of the SwissProt database is included in the files copied from the  CD ROM  It is recommended that the operation of Mascot is verified and  tested using this database before adding further databases or making  configuration changes     The Mascot Monitor service is used to manage the swapping and memory  mapping of the sequence databases used by Mascot  For Mascot to  operate  
173. grams     A value of  2 can be used if the same user name is used to run Web  server scripts as runs ms monitor  exe   This is generally only possible  under Irix  using capabilities   In this case  The files created by  ms monitor exe will not be world accessible  and    chown    is not used on  the files to change ownership     Failure to put the correct group name will generally result in one of two  error messages     Chapter 6  Configuration  amp  Log Files 103    Failed to open memory mapped file  lt filename gt          amp  Error  access denied    or  Failed to create memory map for  lt filename gt        amp  Error Access denied  Vmemory  1  Obsolete   Cron    Do not modify this section if you ever use Database Manager    Database Manager uses the information in this section to schedule  database updates     Cron   CronEnable 1   Logfile    logs cron log   Logging 3   0 59   1 31      usr local mascot bin dbman_process tasks pl  end    CronEnable is set to 1 to enable cron functionality  0 to disable     Logfile specifies the path to the log for recording cron events  Logging  controls the verbosity     0   No logging   1   Log successful commands  return code 0    2   Log unsuccessful commands  return code not 0   3   Log successful and unsuccessful commands    The remaining lines in this section simulate a crontab file  Each line  contains six fields  separated by spaces or tabs  The first five are integer  patterns that specify the following  minute  0 59   hour  
174. guration MATRIX  Configure IIS web site settings  i CIEN CE     Please enter the name of an existing IIS web site that you want to use for Mascot   Usually the default web site is the most appropriate     Web Site  Default Web Site    Below you can modify the name of the Mascot virtual directory in IIS  However  we  recommend that you accept the default name  This value is added to the web site given  above to form the full Mascot URL  eg  you might type into your browser    http    EC VM64 mascot    Virtual Directory  ME eonan aa           For Apache  or any other web server  you need to confirm the local web  server hostname and port  Do not enter localhost as the web site if you  wish to access your Mascot server from other computers on your LAN  If  there are DNS problems  so that a hostname is not recognised across the  LAN  then enter an IP address     The default ports are 80 for http and 443 for https  The installer will test  that the web server responds using the specified hostname and port  number  If you have configured your Apache web server as a secure  server  https   check the box for    Use SSL TLS to access this web site     34 Mascot  Installation and Setup       ii EAE SEEE AEEA NEANS ERE IME ENSO ga B     Apache Configuration MATRIX  Configure Apache web site settings  SCIEN CE       Please enter the host name  or IP address  to be used for accessing the Apache web  server on this computer  You may optionally specify a port number after a colon  Usually  
175. gure  a file update schedule  so that new releases are downloaded automati   cally  For more information about Database Manager  refer to the Mas   cot HTML help pages     If you want to set up a custom database  such as the proteome or genome  of a single organism  download and configuration information can also be  found in the Mascot HTML help pages  Note that the HTML help pages  for your in house Server are only updated when you install a new version  of Mascot  so for the latest information  go to the help pages on the  Matrix Science public web site  http   www matrixscience com help   seq_db_setup html      This chapter contains reference material  most of which is only impor   tant if you choose not to use Database Manager     The Fasta Format    Mascot can search any Fasta format sequence database as long as it can  parse a unique identifier  accession string  from each entry in a consist   ent fashion  The accession string can contain any US ASCII printing  characters except comma and double quotes     The Fasta format is extremely simple  Each entry consists of a one line  title followed by one or more lines containing the contiguous sequence  string in 1 letter code  Fasta databases can contain either amino acid  sequences or nucleic acid sequences  but not both  Nucleic acid databases  are translated on the fly by Mascot in all six reading frames     58 Mascot  Installation and Setup    The Fasta title line begins with a    greater than    character  followed by
176. h is submitted from a browser and the connection is broken  before the search is complete  the search will be killed  The only known  workaround is to use a different web server  e g  Apache     Mascot requires Perl  together with several Perl library modules   ActivePerl 5 14 is recommended  ActivePerl 5 8  5 10  and 5 12 are also  supported     Active Perl 5 14 2  build 1402  from ActiveState Corporation is supplied  on the Mascot CD  You must install or upgrade Perl after installing the  Web server  and before installing Mascot     IMPORTANT  You cannot perform single step upgrades for ActivePerl   You must uninstall the old version before installing the new one  As a    precaution  it is also worth deleting the Perl application directory after  the uninstall step        To install ActivePerl from the CD  in Windows Explorer  double click on  the appropriate file     32 bit   ActivePerl1 5 14 2 1402 MSWin32 x86 295342 msi   64 bit   ActivePerl 5 14 2 1402 MSWin32 x64 295342 msi    It is recommended that you accept all the default options for the installa   tion  Full documentation for ActivePer  5 14 can be found here     http   docs activestate com activeperl  5 14 full_toc html    Chapter 3  Installation  Microsoft Windows 29    The Mascot installer uses the Windows file association for the   pl  extension to locate Perl  If you have more than one version of Per  in   stalled  ensure that the file association is for the correct version  You can  examine the current assoc
177. has a suitable version of Microsoft Windows installed. Mascot requires Windows XP or later on Intel or AMD.

3. Virus scanning software or Microsoft Outlook should not be running during the installation.
4. Install Web server software unless already installed.
5. Install or upgrade Perl, unless a compatible version is already installed.
6. Run setup32.exe (32-bit) or setup64.exe (64-bit) off the Mascot CD.

It is essential that steps 4, 5, and 6 are performed in that order.

System Requirements

Disk Space

A typical installation of the Microsoft Web server requires about 150 MB. A typical installation of ActiveState Perl requires about 120 MB. A full installation of Mascot requires approx. 3.6 GB.

The hard disk must be formatted for NTFS. FAT32 has a file size limit of 4 GB, which would prevent the use of large sequence databases. NTFS file compression should not be used for the compressed database files, because there are reports that it is not fully compatible with memory mapping. NTFS file compression can be used on the FASTA and reference files if you wish.

Memory

To get the best performance from Mascot, the database files need to be memory mapped. It is recommended that you have at least 4 GB of RAM. On a 64-bit system, 12 GB or more will help ensure best performance.

Microsoft Windows versions

XP

Mascot will run under Windows XP Professional. Windows XP Home is not supporte
178. he Matrix Science Mascot Service, and allows it to be stopped and started. It is normally accessed from the start menu:

Programs > Mascot > config > Show Mascot Service Status
Programs > Mascot > config > Start Mascot Service
Programs > Mascot > config > Stop Mascot Service

These options run the program x-cgi/ms-service.exe with the first parameter set to the service name (MatrixScienceMascotService) and the second parameter being 0, 1, or 2 respectively.

It is also possible to run this program as a CGI script by entering the following URL in the browser:

http://your-host/mascot/x-cgi/ms-service.exe?MatrixScienceMascotService+0

where your-host is replaced by the host name of the Mascot server. This CGI script can be run from any computer on the network. However, it is not usually possible to start and stop the service from another computer using the default access rights.

There is a final option, which removes the service. This may be needed for a manual de-installation, but is not normally required. If this option is used, Mascot will not run again without re-running the installation program. The command to enter is:

ms-service MatrixScienceMascotService remove

Compress

Compress is a utility for compressing FASTA files independently of Mascot Monitor.

The executable, bin/ms-compress.exe, is executed from a shell or command prompt:

ms-compress.exe db name fasta   wher
179. he operating system (Windows consumes approximately 60 MB), anything from tens to hundreds of MB for each Mascot search, and space for any other applications which might be running.

If you try to lock databases into RAM when there isn't room, this will not be a major problem. The locking will fail, generate an error message, and Mascot will carry on regardless. A more serious problem is when there is just sufficient RAM to lock the databases, but none left over for searches or other applications. In this case, the whole system will slow down and the hard disk will be observed to be "thrashing". Eventually, the system is likely to hang or crash.

10. Local ref file: Flag to indicate whether a local reference file is available (1) or not (0). For certain databases, e.g. SwissProt, it is possible to have a local reference file, from which full text information can be taken for a "Protein View" report.

11. AccessionParseRule: Index of the regular expression in the PARSE section that can be used to parse an accession string from a FASTA file title line.

12. DescriptionParseRule: Index of the regular expression in the PARSE section that can be used to parse a description string from a FASTA file title line.

13. AccessionRefParseRule: Index of the regular expression in the PARSE section that can be used to parse an accession string from a local full text reference file. If there is no local r
180. hen submitting a Mascot search. The HTML form executes a utility from Thermo for generating a peak list from a centroided raw file. For more details, see the Mascot HTML help page:

http://www.matrixscience.com/help/instruments_xcalibur.html#EXTRACT

The script supplied with Mascot expects to find the executable in the path C:\Program Files\Thermo\ExtractMSn\ExtractMSn.exe. If the program is in another directory (e.g. on a 64-bit system it might be in Program Files (x86)), or if the executable has a different name, open the file mascot\cgi\lcq_dta_shell.pl in an editor such as Notepad, and modify the following line:

    my $lcqExe = "C:\\Program Files\\Thermo\\ExtractMSn\\ExtractMSn.exe";

Note that the backslashes used as directory delimiters must be entered in pairs, exactly as shown above.

The script needs to create temporary files, and it uses the directory C:\TEMP. If this does not exist, you should either create it, or change the following line to point to a suitable temporary directory:

    my $tempDir = "c:\\temp";

To use the lcq_dta form, enter the filename and any other parameters required, and press the "Generate DTA Files" button. After a few seconds, the Mascot search screen will be displayed. Enter search parameters and proceed as normal.

Multiple web sites

Using IIS on Server versions of Windows, it is possible to create multiple web sites. If multiple web sites exist when Mascot is installed, you will be offer
181. her yes, if user selected an enzyme, or no, if user selected enzyme type None.

14. User IP address

At the top of each column is a checkbox and a radio button. Select the radio button to sort the display on that column. Uncheck the checkbox to hide that column.

Along the top of the screen are a series of controls:

The Sort/filter button updates the display to reflect changes in parameters.

If you have multiple log files, a specific file can be displayed by entering its path into the Log File text field.

Start can be used to page through a long listing in blocks of entries specified by the number in the following field. Setting start to -1 displays the list starting from the last entry in the log file rather than the first.

Finally, there is a field to specify a path to the data files. The log file only contains a relative path. If the data files have been moved, possibly to an archive directory or CD-ROM, the path to the new location can be specified here so as to restore the validity of the relative path.

An example of the Status display, filtered to show MS/MS searches of NCBInr, is shown below.

[Screenshot: MASCOT search log]

GetTaxonomy

GetTaxonomy i
182. imum number of occurrences. The expression \{m\} matches exactly m occurrences of the preceding BRE, \{m,\} matches at least m occurrences and \{m,n\} matches any number of occurrences between m and n, inclusive.

For example, in the string abababccccccd the BRE c\{3\} is matched by characters seven to nine, the BRE \(ab\)\{4,\} is not matched at all and the BRE c\{1,3\}d is matched by characters ten to thirteen.

The behaviour of multiple adjacent duplication symbols (* and intervals) produces undefined results.

Expression Anchoring

A BRE can be limited to matching strings that begin or end a line; this is called anchoring. The circumflex and dollar sign special characters will be considered BRE anchors in the following contexts.

A circumflex (^) is an anchor when used as the first character of an entire BRE. The circumflex will anchor the expression to the beginning of a string; only sequences starting at the first character of a string will be matched by the BRE. For example, the BRE ^ab matches ab in the string abcdef, but fails to match in the string cdefab.

A dollar sign ($) is an anchor when used as the last character of an entire BRE. The dollar sign will anchor the expression to the end of the string being matched (not including a final newline character, if present).

A BRE anchored by both ^ and $ matches only an entire string. For example, the BRE ^abcdef$ matches strings consisting only of abcdef.
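The interval and anchoring rules can be checked with a small Python sketch. Python's `re` module uses a different syntax from POSIX BREs, so the hypothetical helper `bre_to_python` below rewrites only the escaped braces and parentheses of a BRE into their Python equivalents. It is an illustrative assumption, not part of Mascot, and it deliberately ignores bracket expressions and the positional rules for `^` and `$`.

```python
import re
from typing import Optional, Tuple


def bre_to_python(bre: str) -> str:
    """Translate a simple POSIX BRE into Python `re` syntax.

    Assumption: only \\{ \\} \\( \\) need rewriting; bracket expressions
    and mid-pattern ^ / $ are not handled by this sketch.
    """
    out = []
    i = 0
    while i < len(bre):
        ch = bre[i]
        if ch == "\\" and i + 1 < len(bre) and bre[i + 1] in "{}()":
            out.append(bre[i + 1])  # \{ \} \( \) are operators in a BRE
            i += 2
        elif ch in "{}()|+?":
            out.append("\\" + ch)   # unescaped, these are literals in a BRE
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def match_span(bre: str, text: str) -> Optional[Tuple[int, int]]:
    """Return the 1-based, inclusive (first, last) character positions
    of the leftmost match, or None if the BRE does not match."""
    m = re.search(bre_to_python(bre), text)
    return None if m is None else (m.start() + 1, m.end())


s = "abababccccccd"
interval_exact = match_span(r"c\{3\}", s)        # characters seven to nine
interval_group = match_span(r"\(ab\)\{4,\}", s)  # only three "ab" pairs exist
interval_range = match_span(r"c\{1,3\}d", s)     # characters ten to thirteen
anchored_start = match_span("^ab", "cdefab")     # ab is not at the start
```

Positions are reported 1-based and inclusive so that they can be compared directly with the character counts quoted in the text.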
183. iation by opening a command window and entering:

ftype Perl

ActiveState Marketing Requirements

The following statements are included to comply with the ActiveState Redistribution Agreement.

Commercial support for ActivePerl is available through ActiveState at:
http://www.activestate.com/enterprise-edition

For peer support resources for ActivePerl issues see:
http://community.activestate.com/forums/activeperl-support

The ActiveState Repository has a large collection of modules and extensions in binary packages that are easy to install and use. To view and install these packages, use the Perl Package Manager (PPM), which is included with ActivePerl.

ActivePerl is the up-to-date, quality-assured ActivePerl binary distribution from ActiveState. Current releases and other professional tools for open source language developers are available at http://www.activestate.com.

Mascot Installation

From "My Computer" or Windows Explorer, select the Mascot CD and double-click on setup32.exe (for 32-bit) or setup64.exe (for 64-bit).

Before the installation of Mascot begins, required Microsoft Visual C++ libraries will be installed.

The following window will be displayed:

[Screenshot: Welcome to the Mascot Server Setup Wizard]
184. if a fatal error occurs, edit the registry key:

HKEY_LOCAL_MACHINE\Software\Microsoft\DrWatson

Set the value of VisualNotification to 0. When the Mascot node service starts on a Windows system, it sets a Dr. Watson registry entry to ensure that Dr. Watson log files are written to the node logs directory.

Registry Settings

Two registry entries are used on each search node to record the root directory of the mascot file structure and the port number used for communication. For example:

[HKEY_LOCAL_MACHINE\SOFTWARE\MatrixScience\Mascot\1.00]
"MascotNodeFolder"="C:\mascotnode\bin"
"MascotNodePort"="5001"

Very large Mascot clusters

Very large clusters (> 30 nodes) pose certain special problems:

• Even with reliable hardware, node failures can be expected relatively frequently
• LAN communication can become a bottleneck
• Need to avoid mixing processors with different speeds, because the slower processors become a bottleneck

Mascot allows large clusters to be divided into sub-clusters. Each sub-cluster uses identical databases and configuration files, but operates independently of the other sub-clusters. An incoming search can be directed to a specific sub-cluster or the first available sub-cluster.

Should a node go down, only the sub-cluster is affected. Ideally, there will be one or more "spare" nodes defined. Mascot will reconfigure the sub-cluster using a spare node and re-start. If there are no spare nodes
185. indows Firewall

Choose Windows Firewall then Advanced Settings. Select Inbound Rules in the left-hand panel and New Rule in the action panel.

In the wizard, choose Port, Next, TCP, Specific Local Ports, 5001, Next, Allow Connection, Next. Clear the checkbox for Domain and Public, Next. Enter the name as MascotNodePort5001, Finish. The new rule will be added to the list of Inbound Rules.

Nodes belonging to a Workgroup

The steps in this section are not required if all the nodes belong to a Windows domain.

For XP and Server 2008, from the Control Panel, select Administrative tools. Choose the Local Security Policy item and double-click on it. Go down the following path: Security settings > Local Policies > Security Options. On the right side panel select Network access: Sharing and security model for local accounts.

[Screenshot: Local Security Settings, Security Options]
186. ing Mascot on a single, multiprocessor server, leave the Enable cluster mode checkbox clear.

The next step is your last opportunity to cancel the installation.

[Screenshot: Ready to install Mascot Server. Click Install to begin the installation. Click Back to review or change any of your installation settings. Click Cancel to exit the wizard.]

Copying the program files takes only a few minutes.

[Screenshot: Installing Mascot Server. Status: Copying new files]

Unpacking the SwissProt files takes longer, and a command window will be displayed at this point. Please be patient and don't try to close the command window.

[Screenshot: Completed the Mascot Server Setup Wizard. You will not be able to perform any searches against a database (e.g. SwissProt) until the status of that database changes to "In Use" in the Mascot server status page. If the database needs to be compressed by Mascot then this may take some time to complete.]

If you are using Apache, model entries for the Apache configuration file can be found
187. ion (if any) accompanying the Software provided that the original and each copy is kept in your possession and that your installation and use of the Software does not exceed that allowed by this agreement;

modify the HTML and Perl documents for sole use by yourself in connection with the Software.

3 Restrictions of Use

You may not:

3.1 load the Software into two or more computers at the same time. If you wish to transfer the Software from one computer to another, you must erase the Software from the first system before you install it onto a second system;

3.2 sub-license, assign, rent, lease or transfer the licence or the Software or make or distribute copies of the Software;

3.3 translate, reverse engineer, decompile, disassemble, modify or create derivative works based on the Software except as permitted by Law;

3.4 make copies of the Software except for backup or archival purposes as permitted hereunder;

3.5 use any backup copy of the Software (or allow anyone else to use such copies) for any purpose other than to replace the original copy in the event it is destroyed or becomes defective;

3.6 distribute copies of modified HTML or Perl documents; or

3.7 copy the written materials (except as provided by this agreement) accompanying the Software.

4 Title

As licensee, you own only the medium on which the Software is recorded. We shall at all times retain ownership of the Software.

5 Warranty

We warr
188. ions, but note that only Professional and Enterprise support remote desktop.

It is advisable to ensure that the latest service pack has been installed. Check the following URL for current information:

http://windows.microsoft.com/en-US/windows/downloads/windows-7

The Microsoft web server for Windows 7 is IIS 7.5. By default, this is not installed. To install IIS, from the Control Panel, choose Programs and Features, Turn Windows features on or off. Expand the node for Internet Information Services, then follow the configuration notes under the Windows Vista section, above.

Web Server

Mascot for Windows is tested with IIS and Apache.

The Mascot installation has been fully automated for Microsoft Internet Information Server 5.0 and later. A good starting point for IIS support information is http://www.iis.net.

IMPORTANT: If you are using IIS 7.x (Vista, Server 2008, Windows 7), you must configure it as described in the Windows Vista section, above, before proceeding with the installation. Otherwise, the Perl and Mascot installations will fail.

If IIS is configured as a secure server (SSL/TLS), you must change it temporarily to non-secure mode (http: on port 80). Once the installation is complete, you can change back to secure mode.

If you wish to use Apache as your web server, you will need to perform some manual configuration, as described in Appendix D.

IIS 6.0 and later

Perl

If a searc
189. istribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things.

To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it.

For example, if you distribute copies of such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.

We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software.

Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations.

Finally, any free program is threatened constantly by software patents. We
190. istribution and modification are not covered by this License; they are outside its scope. The act of running a program using the Library is not restricted, and output from such a program is covered only if its contents constitute a work based on the Library (independent of the use of the Library in a tool for writing it). Whether that is true depends on what the Library does and what the program that uses the Library does.

1. You may copy and distribute verbatim copies of the Library's complete source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this License and to the absence of any warranty; and distribute a copy of this License along with the Library.

You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee.

2. You may modify your copy or copies of the Library or any portion of it, thus forming a work based on the Library, and copy and distribute such modifications or work under the terms of Section 1 above, provided that you also meet all of these conditions:

a) The modified work must itself be a software library.

b) You must cause the files modified to carry prominent notices stating that you changed the files and the date of any change.

c) You must cau
191. itions for copying, distributing or modifying the Program or works based on it.

6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties to this License.

7. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Program at all. For example, if a patent license would not permit royalty-free redistribution of the Program by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Program.

If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is int
192. [Screenshot: Mascot database status page for SwissProt, status "Creating compressed files 20% complete"]

If an error occurs, use the links to the monitor log and the error log to investigate the cause. If all is well, you will see the following messages displayed on the status line for SwissProt:

Creating compressed files
Running 1st test
First test just run OK
Trying to memory map files
Just enabled memory mapping
In Use

You can begin exploring and using Mascot. However, do not try to run searches or view results reports until the relevant sequence database is "In Use".

Windows Firewall

If Windows Firewall is enabled, you may be blocked from accessing the Mascot server from other computers. If so, you need to open up port 80. From the Control Panel, choose Windows Firewall. In the case of Windows XP, on the exceptions tab, check the box for "World Wide Web Services (HTTP)".
193. l Setup

1. Choose a suitable location for the database files

The default location for database directories is under the Mascot sequence directory, but database files can be located on any local drive. If you decide to put the files in a different location, you will need to change the path in step 3.

Create a directory called NCBInr and under this, create three directories called incoming, current, and old.

2. Unpack the files from the DVD archive

Linux: Unpack the files using gzip and tar. If the Databases DVD is mounted as /mnt/dvdrom, typical command lines would be:

    cd /usr/local/mascot/sequence
    gzip -dc /mnt/dvdrom/NCBInr_20120419.fasta.gz > NCBInr/current/NCBInr_20120419.fasta
    cd ../taxonomy
    gzip -dc /mnt/dvdrom/taxonomy.tar.gz | tar xvf -

Windows: Many people will prefer to use a graphical utility, such as WinZip or 7-Zip, to unpack these archives. Make sure you use a recent version that can cope with files larger than 4 GB.

Extract NCBInr_20120419.fasta.gz into NCBInr\current and extract the files in taxonomy.tar.gz into the Mascot taxonomy directory. (It isn't sufficient just to de-compress taxonomy.tar; make sure you extract the files inside the tar archive.)

[Screenshot: 7-Zip window showing the contents of taxonomy.tar.gz]
194. lay the file contents as raw text.

Installation: Linux

Release Notes

Mascot 2.4 is compiled for 32-bit and 64-bit Linux. Refer to the release notes for last-minute additions to documentation and the Matrix Science web site support page for patches and known issues:

http://www.matrixscience.com/mascot_support.html

Cluster Mode

If you have a licence to run Mascot on multiple processors, and plan to do so on a networked cluster of machines, then please familiarise yourself with the material in Chapter 11, Cluster Mode, before proceeding with the installation.

System Requirements

Web Server

Mascot is compatible with most web servers. Appendix D provides configuration information for Apache.

If a web server is being installed for the first time, in connection with the installation of Mascot, it is essential to verify that it is serving documents correctly before attempting to install Mascot.

Perl

Mascot requires Perl. Perl 5.14 is recommended. Perl 5.8, 5.10, and 5.12 are also supported.

Mascot scripts assume that Perl can be found at /usr/local/bin/perl. If Perl is installed in a different path, just add a symbolic link:

    ln -s <actual location of perl> /usr/local/bin/perl

If any library modules are missing, this will be identified during the installation procedure. Binary packages of Perl and most Perl modules are available for most Linux distributions. The mechanism for downl
195. le.

Windows Manual Configuration

The following configuration steps on each search node are performed automatically as part of the Windows installation.

MascotNodeService

Under Windows, ms-mascotnode.exe is configured to run as a service. This should be taken care of automatically. If there are any problems, service creation or deletion requires the Microsoft utility sc.exe, which can be found in the mascot\cluster\Windows_NT directory.

The command to create the service, entered on a single line, is:

    sc create MascotNodeService type= own binpath= c:\mascotnode\bin\ms-mascotnode.exe start= auto

You may need to change the path to the executable, and note that the spaces after the equals signs are significant.

To verify that the service has been created successfully, from the Control Panel, open the Services control panel and choose MascotNodeService. Select Startup... and the following dialog should be displayed:

[Screenshot: Matrix Science Mascot Service Properties, Log On tab]

To delete the service, first stop it, close the services control panel, then enter:

    sc delete MascotNodeService

Dr. Watson

To prevent (invisible) dialog boxes from being displayed
196. le form under the terms of Sections 1 and 2 above provided that you accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange.

If distribution of object code is made by offering access to copy from a designated place, then offering equivalent access to copy the source code from the same place satisfies the requirement to distribute the source code, even though third parties are not compelled to copy the source along with the object code.

5. A program that contains no derivative of any portion of the Library, but is designed to work with the Library by being compiled or linked with it, is called a "work that uses the Library". Such a work, in isolation, is not a derivative work of the Library, and therefore falls outside the scope of this License.

However, linking a "work that uses the Library" with the Library creates an executable that is a derivative of the Library (because it contains portions of the Library), rather than a "work that uses the library". The executable is therefore covered by this License. Section 6 states terms for distribution of such executables.

When a "work that uses the Library" uses material from a header file that is part of the Library, the object code for the work may be a derivative work of the Library even though the source code is not. Whether this is true is especia
197. le loading taxonomy nodes.
     Parameters:
       messages: more detailed error information.

460  Failed to register job. Please inspect mascot error log.
270  A POST request is submitted with zero content length.
55   Cannot find boundary string.
56   First line was not a boundary.
259  Corrupted input (possibly a binary file is submitted).
72   Corrupted input or incompatible browser.
466  Invalid accession format for ms-gettaxonomy.exe.
468  Too large POST request.
467  Invalid taxID format for ms-gettaxonomy.exe.
54   Standard input stream error.
     Parameters:
       bytesread: number of bytes already read
       lengthofdata: total size of input data in the stream

Non-fatal errors

461  Sequence not found.
     Parameters:
       accession: accession string

470  Cannot find taxonomy id.
     Parameters:
       accession: accession string (empty if non-fatal error; can be non-empty only in warning section for accession requests)
       taxid: taxonomy id

Warnings that are only reported at the end of the XML document

400  Missing or invalid gencode id. Table 1 is used for translation.
     Parameters:
       accession: accession string (empty if non-fatal error; can be non-empty only in warning section for accession requests)
       taxid: taxonomy id

470  Cannot find taxonomy id.
     Parameters:
       accession: accession string (empty if non-fatal error; can be non-empty only in warning section for accession requests)
       taxid
198. leRelPath ../data
MascotCmdLine ../cgi/nph-mascot.exe
MascotControlFile ../data/mascot.control
MascotJobIdFile ../data/mascot.job
MascotNodeControlFile ../data/mascotnode.control
MonitorLogFile ../logs/monitor.log
SearchLogFile ../logs/searches.log
TestDirectory ../data/test
UniqueJobStartNumber 001234

These entries determine local paths (not URLs). ErrorLogFile, MascotCmdLine, MonitorLogFile, SearchLogFile, and TestDirectory are self-explanatory.

GetSeqJobIdFile contains the next available job number for the ms-getseq.exe utility. These numbers wrap around at 999 and do not appear in the search logs. If this file is deleted, the next job number will be reset to 1 and a new jobId file created automatically.

Mascot output files are written to a path given by:

InterFileBasePath/InterFileRelPath/yyyymmdd/Fnnnnnn.dat

where yyyymmdd is the current ISO date, and nnnnnn is a sequential job number with a minimum of 6 digits. The path is split into a base path and a relative path as seen by the CGI scripts so that the search engine can pass a file path to, say, master_results.pl as:

InterFileRelPath/yyyymmdd/Fnnnnnn.dat

TestDirectory contains the input files used by Monitor to test new sequence databases.

MascotControlFile contains critical internal parameters. This file must be memory mapped and locked to provide interprocess communication between different Mascot components. MascotNodeCo
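The output file naming rule (base path, relative path, ISO date directory, and a job number padded to at least six digits) can be sketched in Python as follows. This is an illustration, not Mascot code. The function name, the example base path, and the use of "/" as the separator are assumptions.

```python
import posixpath
from datetime import date


def result_file_paths(base_path: str, rel_path: str, day: date, job: int):
    """Return (full_path, cgi_relative_path) for a result file.

    Assumptions: components are joined with '/', and the job number is
    zero-padded to a minimum of six digits (longer numbers are kept whole).
    """
    name = "F{:06d}.dat".format(job)
    day_dir = day.strftime("%Y%m%d")  # yyyymmdd
    relative = posixpath.join(rel_path, day_dir, name)
    full = posixpath.join(base_path, relative)
    return full, relative


# Hypothetical values: the base path is not taken from the manual.
full, relative = result_file_paths(
    "/usr/local/mascot/cgi", "../data", date(2012, 4, 21), 1234
)
```

With these inputs, `relative` is the form a CGI script such as master_results.pl would be given, and `full` is the same file as seen from the base path.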
199. lements within msgt:tree is essential
• in msgt:tree, the "root" element is not listed but always assumed
• msgt:translation_table_id element may not be available
• Any of the elements msgt:db_entry, msgt:tax_from_id can be missing or repeated several times depending on request.

Error messages

All errors have unique codes and are logged to both the XML output and the Mascot error log (but only the first 10 instances of any particular error number). The XML output contains a full set of error messages in a structured format that can be processed automatically.

Fatal Errors (no database entry is going to be retrieved)

403  Error while reading mascot.dat
    Parameters:
        errstring: error message as generated by ms-parser
463  "db" parameter is missing
465  POST request to ms-gettaxonomy is empty
440  Invalid session or session ID
    Parameters:
        errstring: error message as returned by security objects
443  Not allowed to search the database
    Parameters:
        db: database name that was requested
27   Database is not available or not active
    Parameters:
        db: database name that was requested
251  No taxonomy indexes for this database
    Parameters:
        db: database name that was requested
469  Failed to load species file
    Parameters:
        messages: more detailed error message
460  270  55  56  259  72  466  468  467  54
462  One or more errors happened whi
200. les

On the search screen, find out what caused the error by clicking on the Error log link, fix the fault (possibly out of disk space), and then click on retry.

Validation

CGI Operation

To verify that the search engine is functioning correctly when executed as a CGI application, launch a JavaScript-aware web browser and load the Mascot home page (http://your_server/mascot/). Select Mascot from the main menu and then choose the "Peptide Mass Fingerprint" link near the top of the page. This will load the search form for a peptide mass fingerprint.

Enter your name and email address into the fields at the top of the form and type a number, say 1234, into the Query field. Then press the Start Search... button.

The search form will be replaced by the search progress screen. This has a few lines of text at the top, ending in the line "Searching...". Additional lines will appear showing the percentage of the search that has been completed.

Once the search is complete, the Master Results page will appear. Unless you went to the trouble of entering some real mass values, the results will be meaningless.

Monitor Test

When Mascot Monitor is started, it runs a test search against each sequence database. It also runs this same test search against any update to the database as part of the exchange procedure. If the test search fails, an error message will be displayed in the Mascot Stat
201. list.txt. If you edit this file while Mascot is not running, these values can be deleted.

    subcluster ID number (0 based)
    node within subcluster (0 based)
    status:
        0  unknown status
        1  attempting to bring into use
        2  no response to ping
        3  failed to start service
        4  in use
    number of CPUs actually being used

File Replication

The configuration files, such as mascot.dat, that are on the Mascot master are automatically replicated to the nodes. So, it is only necessary to update a file on the master. The ms-monitor.exe program (run as the Matrix Science Mascot Service under Windows) continually looks to see if a file has been updated, and will distribute new versions to the nodes as required. The dates, times and lengths of the distributed files should be identical on all systems.

The same process is used for updates to executable programs, except that these updates will only be made when the ms-monitor.exe service first starts.

The Status screen will indicate if any executable files need updating.

Files required on each Mascot Node

Target file name and directory,
relative to node home directory     Notes
bin/ms-mascotnode.exe               Updated at start up
bin/nph-mascot.exe
config/enzymes
config/mascot.dat                   Updated at start up
config/unimod.xml
config/mascot.license               Updated at start up
config/taxonomy
config/fragmentation_rules
config/quantitation.xml
taxonomy/nodes.dmp
202. ll be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Program specifies a version number of this License which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of this License, you may choose any version ever published by the Free Software Foundation.

10. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally.

NO WARRANTY

11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A P
203. string, e.g. STY. If there was a neutral loss, the delta mass is given by the value of FixedModNeutralLoss1.

    FixedModn=delta,Name
    FixedModResiduesn=[A-Z]|C_term|N_term
    FixedModNeutralLossn=mass

Fixed modifications cannot have peptide neutral losses or multiple neutral losses, and cannot be protein terminal or residue terminal. In all these cases, fixed modifications are automatically converted into variable ones.

Variable modifications are reported in delta1, delta2, etc. Each entry defines the difference in mass introduced by the modification together with the name of the modification, separated by a comma. If a variable modification suffers a neutral loss on fragmentation, the delta is specified by a NeutralLossn entry. By definition, this is always a master neutral loss. If there are multiple neutral losses, then two more lines appear:

    NeutralLossn_master=mass[,mass]...
    NeutralLossn_slave=mass[,mass]...

The first neutral loss, defined by NeutralLossn, has an implicit index number of 1. Any additional neutral losses, defined by NeutralLossn_master or followed by NeutralLossn_slave, have implicit index numbers of 2 and up.

If a modification includes a required or optional neutral loss from the precursor, this is recorded as follows:

    ReqPepNeutralLossn=mass[,mass]...
    PepNeutralLossn=mass[,mass]...

Error tolerant modifications are not listed in the masses section.

Quantitation

gc0p4Jq0M2Yt08jU534c0
204. llowing files would be created for the database SwissProt_2012_03:

    SwissProt_2012_03.a00
    SwissProt_2012_03.i00
    SwissProt_2012_03.s00
    SwissProt_2012_03.stats
    SwissProt_2012_03.NoTaxonomyMatch.txt
    SwissProt_2012_03.t00

The final two files are only created if taxonomy is specified in the database configuration. Compressed files are a proprietary format, which is unlikely to be useful for other applications.

3. If a serious error occurs while creating these files, then the conversion to the new database stops, an error is put into the error log and, optionally, the error message is emailed to the administrator. Also, if the status screen is shown, the existence of the error is shown on that screen. Searches on the existing database will continue until the problem is resolved.

4. A test search is performed on the new database. The test uses the appropriate file in the ../data/test directory. If the test is successful, then a file with the name <database_name> <unique hash key>.fasta.testedOk is put into the ../data/test directory. If the test fails, then an error is put into the error log and, optionally, the error message is emailed to the administrator. Also, if the status screen is shown, the existence of the error is shown on that screen. Searches on the existing database will continue until the problem is resolved.

5. Any new searches submitted by users will now use the new database
205. lly significant if the work can be linked without the Library, or if the work is itself a library. The threshold for this to be true is not precisely defined by law.

If such an object file uses only numerical parameters, data structure layouts and accessors, and small macros and small inline functions (ten lines or less in length), then the use of the object file is unrestricted, regardless of whether it is legally a derivative work. (Executables containing this object code plus portions of the Library will still fall under Section 6.)

Otherwise, if the work is a derivative of the Library, you may distribute the object code for the work under the terms of Section 6. Any executables containing that work also fall under Section 6, whether or not they are linked directly with the Library itself.

6. As an exception to the Sections above, you may also combine or link a "work that uses the Library" with the Library to produce a work containing portions of the Library, and distribute that work under terms of your choice, provided that the terms permit modification of the work for the customer's own use and reverse engineering for debugging such modifications.

You must give prominent notice with each copy of the work that the Library is used in it and that the Library and its use are covered by this License. You must supply a copy of this License. If the work during execution displays copyright notices, you must inc
206. lowed by a list of the residues that identify the cleavage site.

Optionally, a block can include a line starting with the keyword Restrict, followed by a list of the residues which prevent cleavage if present adjacent to the potential cleavage site.

Finally, the block must include either the keyword Cterm or Nterm to define whether cleavage occurs on the C-terminal or N-terminal side of the specified residues.

This syntax can be extended to support multiple cleavage specificities, enabling enzyme mixtures to be modelled, or mixed C-term and N-term cutters. This is achieved by appending zero-based index numbers in square brackets to the keywords Cleavage, Restrict, Cterm, and Nterm. For example:

    Title:CNBr+Trypsin
    Cleavage[0]:M
    Cterm[0]
    Cleavage[1]:KR
    Restrict[1]:P
    Cterm[1]
    Independent:0

The use of index numbers is optional when only one specificity is defined, but required when there are multiple specificities, as in this example.

For a definition with multiple specificities, if the keyword Independent appears and is given a value of 1, this means that the specificities should be treated as if independent digests had been performed on separate sample aliquots and the resulting peptide mixtures combined. Thus, any given peptide will conform to the specificity of one cleavage type only. In the case of CNBr+Trypsin, if Independent was set to 1, you would not find any peptides resulting from cleavage after K or R at one end
207. lude the copyright notice for the Library among them, as well as a reference directing the user to the copy of this License. Also, you must do one of these things:

a) Accompany the work with the complete corresponding machine-readable source code for the Library including whatever changes were used in the work (which must be distributed under Sections 1 and 2 above); and, if the work is an executable linked with the Library, with the complete machine-readable "work that uses the Library", as object code and/or source code, so that the user can modify the Library and then relink to produce a modified executable containing the modified Library. (It is understood that the user who changes the contents of definitions files in the Library will not necessarily be able to recompile the application to use the modified definitions.)

b) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (1) uses at run time a copy of the library already present on the user's computer system, rather than copying library functions into the executable, and (2) will operate properly with a modified version of the library, if the user installs one, as long as the modified version is interface-compatible with the version that the work was made with.

c) Accompany the work with a written offer, valid for at least three years, to give the same user the materials specified in Subsection 6a, above, for a charge no more than the co
208. luster

Mascot system limits

The following are relevant to very large clusters. Other system limits are listed in Appendix C of the manual.

• Maximum number of processors per machine: 64
• Maximum number of sub-clusters in a cluster: 50
• Maximum number of machines in a sub-cluster: 1024
• Maximum number of processors in a sub-cluster: 65536
• Maximum number of nodes in nodelist.txt: 4096

Directing jobs to a sub-cluster

The SUBCLUSTER search parameter is used to direct jobs to a sub-cluster. This can be added to the web browser search form as a hidden field by editing the Perl script.

To use the next free sub-cluster:

    SUBCLUSTER=-1

If all of the sub-clusters have searches running, and the search has been submitted from a browser, then the following will be displayed in the browser until a sub-cluster becomes free:

    Waiting for sub-cluster to become available...

To use a specific sub-cluster, e.g. sub-cluster 2:

    SUBCLUSTER=2

The default value is 0 so, if this parameter is not specified, a search will go to the first sub-cluster.

Specifying which sub-cluster a particular job goes to usually implies that some third-party job queuing system is being used. For example:

• Job gets submitted to PBS and PBS decides which sub-cluster to run the search on
• PBS adds a SUBCLUSTER=x to the search parameters
• PBS creates a task_id using ms-searchcontrol.exe --create_task_id
• PBS submits the search, passing the
209. m specific executables, for distribution to the nodes in a cluster

config contains configuration files

data contains Mascot results files. By default, a new sub-directory is created for each day's results files. The name of each sub-directory is that day's date in ISO format, yyyymmdd

html is the root directory for documents

logs contains search and error logs, etc.

[Figure 2.1 Mascot Directory Structure: tree of the mascot directory (bin, cgi, cluster, config, data, html, logs, sequence, sessions, taxonomy, unigene, x-cgi), with a key marking each directory as not mapped to a URL, mapped to a URL, or mapped and executable]

sequence contains a sub-directory for each FASTA database. As illustrated, for each database there are 3 sub-directories to organise the FASTA files into new downloads (incoming), active databases (current) and the most recently replaced files (old).

sessions contains security session files

taxonomy contains taxonomy resources
210. mbols, and non-alphanumeric characters should be URL encoded using the %nn notation.

A reminder to Windows users: Do not use backslashes as path delimiters, because these will be interpreted as escape characters.

Most parameters are entered as literal strings, with two exceptions. #ACCESSION# is a place holder that will be replaced by an actual accession string. #FRAME# is a place holder that will be replaced by the number of the reading frame used to translate a nucleic acid sequence. Obviously, this last parameter is only used with NA databases.

The syntax for calling ms-getseq.exe is described in Chapter 7. In the examples shown above, the full text report for Trembl is taken from an external URL because the full text file for Trembl is huge (40 GB). The default configuration for SwissProt uses a local full text reference file.

Processors

Mascot licensing is physical CPU or socket based. For each CPU covered by the licence, Mascot will fully utilise up to 4 logical processors or cores.

If the number of processors available is the same as the number licensed, then it is best not to include a PROCESSORS section. You can include one, if you wish, but this may have a negative impact on system performance.

If the number of processors available is greater than the number licensed, you can use a PROCESSORS section to force specific cores to be used.

Logical processor (core) numbers generally star
211. me, these values set the limits on the amount the mass is allowed to increase (MaxEtagMassDelta) or decrease (MinEtagMassDelta) in order to reach the first available cleavage point.

MaxEtVarMods 2

The maximum number of variable mods allowed in the first pass of an automated error tolerant search (global default, can be over-ridden for a group in security).

MaxNumPeptides

The maximum number of peptides that can be expected from the enzymatic digest of a single entry. The default is MaxSequenceLen / 4.

MaxPepNumVarMods 5

The maximum number of variable mods allowed for a PMF.

MaxQueries 10000

The maximum number of MS/MS spectra allowed in a single search. Note that the maximum number of mass values in a PMF is hard coded to 1000.

MaxSearchesPerUser 0

Sets the maximum number of concurrent searches from a single IP address. A value of 0 means no limit (global default, can be over-ridden for a group in security).

MaxSequenceLen 50000

The maximum length of a database entry in characters (bases for NA or residues for AA). The default is 50,000. The length of the longest sequence in a database can be found in the .stats file, created by Mascot Monitor when the database is compressed. The larger the value of MaxSequenceLen, the more memory Mascot uses. So, if you need to increase it, make it just a little greater than the length of the longest sequence. On a 32-bit system, try not to exceed 3 million.
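The advice on MaxSequenceLen can be illustrated with a short Python sketch. This is not part of Mascot. It assumes you want to check a FASTA file directly rather than reading the .stats file. The function names and the safety margin of 1000 are hypothetical choices made for this illustration.

```python
from typing import Iterable


def longest_sequence_length(fasta_lines: Iterable[str]) -> int:
    """Return the length of the longest entry in FASTA-formatted lines."""
    longest = 0
    current = 0
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            # A title line closes the previous entry and starts a new one.
            longest = max(longest, current)
            current = 0
        else:
            current += len(line)
    # Account for the final entry, which has no following title line.
    return max(longest, current)


def suggest_max_sequence_len(longest: int, margin: int = 1000,
                             default: int = 50000) -> int:
    """Suggest a value just above the longest sequence.

    The margin is an assumption for this sketch; the manual only says
    'just a little greater'. The value never drops below the default.
    """
    return max(default, longest + margin)
```

For a real database, pass an open file object to longest_sequence_length and compare the suggestion with the current MaxSequenceLen before editing mascot.dat.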
212. memory.

Each search node requires sufficient free disk space for the Mascot application software and the compressed FASTA databases. The master also requires sufficient disk space for the original FASTA databases and the accumulating search result files. The amount of space required for the results files depends on how heavily the system is used and how often the files are backed up and deleted.

For best performance, it is advisable for the nodes to have local hard disks. If you prefer to use shared storage, then each node must have its own dedicated directory structure.

Mascot nodes may have any number of processors, but the number of cores in each node should be a multiple of 4 to make maximum use of the number of CPUs in the licence.

A search node does not require a keyboard, monitor or mouse. If you are running Windows on the nodes, and want to be able to "see" the individual desktops, you might consider using a KVM switch so that a single keyboard, monitor and mouse can be shared between all the nodes. Alternatively, Windows Remote Desktop or VNC can be used (http://www.realvnc.com/).

Operating System Requirements

For nodes running Windows, it is not necessary to use a "Server" version of Windows on the search nodes.

For Linux clusters, it must be possible for the master to communicate with the search nodes using ssh or rsh without quoting a password or passphrase.

Search nodes do not requir
213. modification is a single base change in the primary sequence, the two mass fields will be set to zero, and one of the keywords NA_INSERTION, NA_DELETION, or NA_SUBSTITUTION will appear in the description field. The additional parameter hn_qm_na_diff is then used to record the "before" and "after" nucleic acid sequences.

If the search includes a quantitation method and the search parameter MULTI_SITE_MODS is set to 1, then a single site can carry two modifications. When this occurs, a second modifications string, e.g. h1_pl_summed_mods, is used to record the additional modification(s).

Ion series is a string of 19 digits representing the ion series:

    a, place holder, a++
    b, place holder, b++
    y, place holder, y++
    c, c++
    x, x++
    z, z++
    z+H, z+H++
    z+2H, z+2H++

A digit is set to 1 if the corresponding series contains more than just random matches and 2 if the series contributes to the score.

Multiplicity means number of peptide mass matches for a query in a protein.

For each sequence tag, four colon-separated values are output: 1-based tag number, 1-based residue position where tag starts, 1-based residue position where tag ends, ion series into which the tag was matched:

    -1  means no matches for the tag
    0   "a" series, single charge
    1   "a-NH3" series, single charge
    2   "a" series, double charge
    3   "b" series, single charge
    4   "b-NH3"
214. mon
Cluster Mode
Security
Basic Regular Expressions
Error Messages
System Limits
Web Server Configuration

Typographical Conventions

Description: Filenames, pathnames, directory names (folders), programs, and commands are printed in italic fixed pitch font. In Unix, these names are case sensitive.
Example: nph-mascot.exe
         /usr/local/httpd

Description: The contents of text files are printed in fixed pitch font on a grey background. Where it is necessary to break a long line, this is indicated by an indent and the symbols >&. Text omitted from a line is indicated by an ellipsis (...), while missing lines are indicated by 3 vertical periods.
Example: >owl|100K_RAT 100 KD PROTEIN
             >& (EC 6.3.2...
         >owl|100K_RAT ... NORVEGICUS (RAT)

Description: Text which should be entered literally is shown in bold fixed pitch font on a grey background. Control characters are shown in angle brackets, apart from <return>, i.e. carriage return, newline
Example: C:\TEMP> ftp ncbi.nlm.nih.gov
         kill PID
215. mp

[Screenshot: service Properties dialog, Log On tab, with "Log on as" set to a MATRIX_SCIENCE administrator account rather than the Local System account, and the Password and Confirm password fields filled in]

Press OK, and you will be returned to the Services dialog.

[Screenshot: Windows Services window, listing the Matrix Science Mascot Service among the local services]
216. n only be locked by root. Before a "Failed to lock memory for file xxx" error is given, Mascot Monitor will try and increase the amount of RSS available by calling

    setrlimit(RLIMIT_RSS, xxx)

with the current value plus the size of the file to be locked. Under Solaris, the RLIMIT_AS value is used (rather confusing use of "AS" by Sun).

If the resource limit cannot be increased, then error M00114 "Error calling setrlimit(RLIMIT_RSS, <memory requested>), error <detailed error message>" will be put into errorlog.txt.

If the memory cannot be locked, then the error M00073 "Failed to lock memory for file <file name>. Error <detailed text>" will be put into the errorlog.txt file.

If Mascot Monitor (ms-monitor.exe) is a 32-bit executable, the 3 or 4 GB limit can quickly be reached by having several large databases locked into memory. To work around this limit, a separate ms-lockmem.exe program is provided; this is fork'd / exec'd from ms-monitor.exe when the flag "SeparateLockMem 1" is added to the options section of mascot.dat.

Physical memory

If the amount of memory locked gets close to the amount of physical memory, the system will grind to a halt. The error M00073 "Failed to lock memory for file <file name>. Error <detailed text>" will also probably be put into the errorlog.txt file.

Data segment size

This amount does not include the space used by memory mapped files
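The resource limit adjustment described earlier in this section (raising RLIMIT_RSS by the size of the file to be locked) can be sketched in Python using the standard resource module. This is only an illustration of the system call pattern, not Mascot Monitor's actual code, and the function names are hypothetical. It assumes a Unix-like platform where RLIMIT_RSS is defined.

```python
import resource


def raised_soft_limit(soft: int, file_size: int) -> int:
    """Return the current soft limit plus the size of the file to be locked.

    An unlimited soft limit is left unchanged.
    """
    if soft == resource.RLIM_INFINITY:
        return soft
    return soft + file_size


def raise_rss_limit(file_size: int) -> bool:
    """Try to raise RLIMIT_RSS; return False if the limit cannot be increased."""
    soft, hard = resource.getrlimit(resource.RLIMIT_RSS)
    new_soft = raised_soft_limit(soft, file_size)
    try:
        resource.setrlimit(resource.RLIMIT_RSS, (new_soft, hard))
    except (ValueError, OSError):
        # For example, the requested soft limit exceeds the hard limit.
        return False
    return True
```

A False return from raise_rss_limit corresponds to the situation in which the manual says error M00114 is logged.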
217. [Screenshot: Local Security Settings, Security Options list]

If the current setting is Guest only, double-click on the item to change the setting.

[Screenshot: "Network access: Sharing and security model for local accounts" dialog, offering "Classic - local users authenticate as themselves" and "Guest only - local users authenticate as Guest"]

Select Classic - local users authenticate as themselves and press OK. Close the Local Security Settings window.

For Vista, Server 2008, and Windows 7, a registry change is required to allow administrator rights when logging in using a local (SAM) account. This procedure is taken from Microsoft KB article 951016.

1. Click Start, click Run, type regedit, and then press ENTER. If the start menu does not have a Run... option, then open a Command Prompt window from the Accessories program folder and use this instead.

2. Locate and then click the following registry subkey:
   HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System

3. If the LocalAccountTokenFilterPolicy registry entry does not exist
218. n the same system as Mascot Daemon, Voyager DAT files can be processed.

If a copy of ExtractMsn.exe or similar is installed on the same system as Mascot Daemon, Thermo Xcalibur RAW files can be imported.

A utility called TS2Mascot can be used to import peak lists from an AB SCIEX 4000 / 5000 series database.

Several Mascot Daemon clients can submit searches to a single Mascot Server, and can even share a common task database. If you have several mass spectrometers, you can choose whether to install separate copies of Daemon on each instrument data system or whether to have a single copy of Daemon somewhere on the LAN, marshalling searches for all instruments.

User Help

Mascot Daemon includes comprehensive, context-sensitive on-line help. Press F1 at any time to jump to the relevant topic.

Installation

After Mascot Server has been installed, go to your local home page for links to a help page that describes how to install, upgrade or troubleshoot Mascot Daemon. All the required installation files are hyperlinked from this page.

Cluster Mode

Introduction

Mascot has been designed and implemented to work efficiently on a cluster of computers. A cluster of single or dual processor boxes provides a highly cost-effective solution for high throughput protein identification. Mascot can be run in cluster mode on all supported hardware platforms and operating systems.
219. [Screenshot: Windows Firewall Settings dialog, Exceptions tab, with Secure World Wide Web Services (HTTPS) and World Wide Web Services (HTTP) listed among the program and port exceptions]

For Windows 7, the appearance is slightly different:

[Screenshot: Windows 7 Control Panel, Windows Firewall, Allowed Programs, "Allow programs to communicate through Windows Firewall"]
220. nk sequences, including ESTs, which represent a unique gene. It is not an attempt to produce a consensus sequence. UniGene can be used to simplify the results of a Mascot search of dbEST.

An index file must be downloaded for each species of interest. For each species, the fully qualified path to the index file is associated with the species name.

    UniGene
    human C:/Inetpub/MASCOT/unigene/human/current/Hs.data
    mouse C:/Inetpub/MASCOT/unigene/mouse/current/Mm.data
    mosquito C:/Inetpub/MASCOT/unigene/mosquito/current/Aga.data

To add a UniGene report option to Mascot for a particular sequence database, add a line containing the name of the database followed by a list of the available species names.

    EST_human human
    EST_mouse mouse
    EST_others mosquito
    end

Options

The Options section is used for miscellaneous parameters, which are listed here in alphabetical order. If a parameter is shown with argument(s), these are the default(s) that apply if the parameter is missing.

AutoSelectCharge 1

Controls how MS/MS queries are treated when the CHARGE parameter specifies more than one charge state, e.g. 1+, 2+, and 3+. This is usually because no charge information was available for a query, so the search form defaults applied.

If set to 0, a query is generated for each charge state and these queries are searched and reported independently. This is the default setting because this was the behaviour in earlier versions of Mascot.
221. nodelist.txt. A full description of these files can be found below in the "Reference" section. Then, start the Mascot service.

Windows Firewall on Search Nodes

Windows XP and later include a software firewall called Windows Firewall. You can avoid the configuration steps in this section by turning off Windows Firewall on the search nodes. If the search nodes are on a separate subnet that can only connect to the master node, having a firewall enabled on a search node is of little use. It is redundant until the master node has been compromised, by which time it is too late. If the search nodes are not on a separate subnet, or if you simply want to enable Windows Firewall because the operating system keeps nagging you to do so, it is necessary to run through the following steps on each search node.

Windows Firewall configuration varies across the different editions of Windows and also according to whether it was part of the original installation or added in a service pack.

Windows XP and Server 2003

On each search node, log in as a user with local administrator rights. Go to Control Panel and launch Windows Firewall. On the Advanced tab, make sure the network connection to the master is checked.

[Screenshot: Windows Firewall dialog, Advanced tab, Network Connection Settings.]
222. npacked after mascot.tar, so as to overwrite the 64-bit files in the mascot.tar archive:

    bzip2 -d mascot-32.tar.bz2
    tar xvf mascot-32.tar

Alternatively, you can combine decompression and tar into a single command, for example:

    bzip2 -dc /dvdrom/mascot.tar.bz2 | tar xvf -

This will create the directory structure illustrated in Figure 2.1. Ensure that the ownership of the files matches the user ID that your web server is configured to use. The mascot.tar file has been created using root:root. The required ID when Apache is installed from a RedHat RPM will be apache:apache, and when installed on Ubuntu or Debian, it will be www-data:www-data.

    chown -R apache:apache /usr/local/mascot

If this is not acceptable, then the logs, config, sessions, and data directories, plus the file logs/errorlog.txt, must be made writeable by the web server process.

Create URL mappings

If this is a clean installation, add the following mappings to your web server configuration (substituting your actual disk path to the new mascot directory):

    Disk path                   URL             Executable
    /usr/local/mascot/cgi       /mascot/cgi     Yes
    /usr/local/mascot/html      /mascot         No
    /usr/local/mascot/x-cgi     /mascot/x-cgi   Yes

You may wish to restrict access to the administrative programs by setting a password or IP address restriction on /mascot/x-cgi.

Notes on web server configuration can be found in Appendix D. Example configuration entries for Apache can
223. ntrolFile is a similar, additional file used in cluster mode.

MascotJobIdFile contains the next available job number. If this file is deleted, the next job number will be initialised to the value given by UniqueJobStartNumber, and a new jobId file created automatically. NB: UniqueJobStartNumber must never be set lower than 1000.

ErrTolMaxAccessions 0

The maximum number of database entries allowed for a manual error tolerant search. Default is 0, meaning no limit.

ExecAfterSearch n flag num flag num   title string  command string

Defines a command to be run after a search is complete. N is one or two digits in the range 1 to 10. The Mascot installer creates the following two entries, which provide Percolator integration:

    ExecAfterSearch 1 waitfor 0 logging 0  Creating percolator input     bin ms createpip exe  i  sresultfilepath  o  percolator pip
    ExecAfterSearch 2 waitfor 1  logging 1  Percolating       bin percolator exe  PercolatorExeFlags

The following flags may be specified:

    flag        num      description
    waitfor     0 - 10   The command should wait for completion of the command specified by num. A value of 0 means don't wait (equivalent to omitting the flag).
    logging     0 - 3    0: no logging. 1: log successful commands (return code 0). 2: log unsuccessful commands (return code not 0). 3: log successful and unsuccessful commands.
    percolator  0 - 1    0: no dependency on Percolator. 1: command
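The four values of the logging flag described above can be read as a small decision rule. The following is a minimal sketch of that rule only; the function name `should_log` is hypothetical and is not part of Mascot.

```python
def should_log(logging_level: int, return_code: int) -> bool:
    """Decide whether a finished command is logged, following the
    documented meaning of the ExecAfterSearch 'logging' flag.
    (Hypothetical helper for illustration; not a Mascot API.)"""
    if logging_level not in (0, 1, 2, 3):
        raise ValueError("logging flag must be in the range 0 to 3")
    succeeded = return_code == 0
    if logging_level == 0:      # no logging
        return False
    if logging_level == 1:      # log successful commands only
        return succeeded
    if logging_level == 2:      # log unsuccessful commands only
        return not succeeded
    return True                 # 3: log both outcomes
```

For example, `should_log(2, 1)` is true because level 2 records commands with a non-zero return code.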
224. nucleic acid database, then the length returned will depend on the translation frame number specified.

If the keyword title is supplied, the FASTA title line is returned, beginning with a right angle bracket.

If the keyword pI is supplied, the calculated iso-electric point is returned.

Batch mode

Request format

GET request always means single entry mode. POST request automatically means batch mode. A batch mode request should use UTF-8 encoding and be of "multipart/form-data" enctype, for example:

    --41184676334
    Content-Disposition: form-data; name="db"

    SwissProt
    --41184676334
    Content-Disposition: form-data; name="accession"

    "RL19_YEAST" "G3P2_YEAST" "ERROR_YEAST"
    --41184676334
    Content-Disposition: form-data; name="accession"

    "TRY1_BOVIN"
    --41184676334
    Content-Disposition: form-data; name="showpi"

    on
    --41184676334
    Content-Disposition: form-data; name="showtitle"

    on
    --41184676334
    Content-Disposition: form-data; name="showlen"

    on
    --41184676334
    Content-Disposition: form-data; name="showsequence"

    on
    --41184676334
    Content-Disposition: form-data; name="showreference"

    off
    --41184676334
    Content-Disposition: form-data; name="sessionID"

    123456
    --41184676334

Maximum number of accession strings submitted at once shouldn't be more than 100,000 and the total size of request shouldn
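A batch mode request body of the kind shown above could be assembled as in the following sketch. It only builds the multipart/form-data bytes and does not send anything. The helper name `build_batch_body` is hypothetical, and the choice of fields and the unquoted accession value are assumptions made for illustration; the field names themselves (db, accession, showpi, sessionID) are taken from the example request.

```python
import uuid


def build_batch_body(fields, boundary=None):
    """Return (content_type, body) for a UTF-8 multipart/form-data request.

    `fields` is a list of (name, value) pairs; a name may repeat.
    Hypothetical helper for illustration; not part of Mascot.
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields:
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "\r\n"
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing delimiter
    body = "".join(parts).encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", body


# Example field list (values are illustrative only)
example_fields = [
    ("db", "SwissProt"),
    ("accession", "TRY1_BOVIN"),
    ("showpi", "on"),
    ("sessionID", "123456"),
]
content_type, body = build_batch_body(example_fields, boundary="41184676334")
```

The returned `content_type` string would go in the Content-Type header of the POST request, with `body` as the payload.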
225. oading and installing new modules and updates is distribution specific. For example, to install the non-core module Bundle::LWP on some common distributions:

Red Hat / CentOS Linux:

    yum install perl-libwww-perl

Debian / Ubuntu Linux:

    aptitude install libwww-perl

SUSE Linux:

    yast -i perl-libwww-perl

If a module has missing dependencies, you will be prompted to install these. The required non-core modules are:

    Module            Debian / Ubuntu           Red Hat / CentOS / SuSE
    GD                libgd-gd2-noxpm-perl      perl-GD
    Bundle::LWP       libwww-perl               perl-libwww-perl
    Algorithm::Diff   libalgorithm-diff-perl    perl-Algorithm-Diff
    XML::Simple       libxml-simple-perl        perl-XML-Simple

If other applications require a version of Perl not supported by Mascot, or if you have difficulty compiling Perl or one of the required modules, the Mascot DVD includes ActivePerl 5.14 for Linux (32-bit and 64-bit). You can install this for general use or for use by Mascot only. A typical installation might use the following commands:

    cd /tmp
    gzip -dc /dvdrom/ActivePerl-5.14.2.1402-x86_64-linux-glibc-2.3.5-295342.tar.gz | tar xvf -
    cd ActivePerl-5.14.2.1402-x86_64-linux-glibc-2.3.5-295342/
    sudo ./install.sh

Follow the installation script instructions, and choose to install into an appropriate location. If ActivePerl was being installed for Mascot use only, we might install into /usr/local/ActivePerl-5.14 and create a symbolic link as follows:

    ln -s /usr/local/Activ
226. observed peptide mass in Da

    absDMppm           Absolute value of calculated minus observed peptide mass in ppm
    isoDM              Calculated minus observed peptide mass, after eliminating possible isotope errors up to 2 Da, in Da
    isoDMppm           Calculated minus observed peptide mass, after eliminating possible isotope errors up to 2 Da, in ppm
    mc                 Number of missed cleavages (always 0 if no enzyme)
    varmods            Number of modified sites divided by number of modifiable sites
    varmodsCount       The number of variable mods used in the peptide. That is, if there are 10 Met and 5 of these are oxidised, this counts as 1. A peptide with Met-Ox, phosphoS, deamidation, and acetylation would count as 5
    modifiable         Total number of modifiable sites
    modified           Total number of modified residues and termini
    totInt             Log total ion intensity. The 20 most intense peaks in each 100 Da bin are used for all features, and totInt reports this value
    intMatchedTot      Log total matched ion intensity
    relIntMatchedTot   Total matched ion intensity divided by total ion intensity as a percentage (no logs involved)
    fragDeltaMed       Median value of all matched fragment errors in Da
    fragDeltaIqr       Interquartile range value of all matched fragment errors in Da
    fragDeltaMedPPM    Median value of all matched fragment errors in ppm
    fragDeltaIqrPPM    Interquartile range value of all matched fragment errors in ppm
    fragDeltaPolyFit   2nd order polynomial fit to m/z vs delta. Result is (RSquared) multiplied by the number of points divided by
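To make the mass error features concrete, here is a minimal sketch of how a ppm error and an isotope-corrected error might be computed. This is not Mascot's implementation. It assumes that ppm is taken relative to the calculated mass, that isotope peaks are spaced by about 1.00335 Da (the 13C/12C mass difference), and that "up to 2 Da" means trying whole-isotope offsets of -2 to +2. The function names are hypothetical.

```python
# Assumed spacing between isotope peaks in Da (13C minus 12C).
ISOTOPE_SPACING_DA = 1.00335


def delta_ppm(calc_mass: float, obs_mass: float) -> float:
    """Calculated minus observed mass, in ppm of the calculated mass (assumption)."""
    return (calc_mass - obs_mass) / calc_mass * 1e6


def iso_delta_da(calc_mass: float, obs_mass: float, max_shift: int = 2) -> float:
    """Calculated minus observed mass in Da after removing the whole-isotope
    offset (between -max_shift and +max_shift peaks) that leaves the
    smallest absolute error."""
    delta = calc_mass - obs_mass
    candidates = [
        delta - k * ISOTOPE_SPACING_DA
        for k in range(-max_shift, max_shift + 1)
    ]
    return min(candidates, key=abs)
```

Under these assumptions, a precursor picked one isotope peak away from the monoisotopic peak gives a raw error near 1 Da but an isotope-corrected error close to zero.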
227. ocessor with hyper-threading enabled.

Troubleshooting

Check the Mascot Server Support Page

There may be a fix listed on the Matrix Science web site. From the menu, choose Support, Mascot Server and scan down to see if your problem is described.

The installation program doesn't recognise Perl

To test whether Perl is correctly installed, you can open a command window and type:

    perl -v

The version number should be displayed. If this seems to be functioning correctly, and the Perl version is either 5.8, 5.10, 5.12 or 5.14, restart the computer and then re-run the Mascot installation program. If it still fails, contact Matrix Science technical support (support@matrixscience.com).

If, when you type perl -v, you see the text:

    The name specified is not recognized as an
    internal or external command, operable program or
    batch file.

then Perl is not installed or is not on the path. If you have just installed it, you should try restarting the computer and performing the test again. If that fails, try re-installing Perl, making sure that you choose the option to add it to the path.

The status screen shows an error

If the Mascot Monitor service fails to start, then the following text or something similar will be displayed in the status screen:

[Screenshot: status screen displayed in a browser window.]
228. ocumentation, and any subsequent updates and supplements (the "Software").

By installing or using the Software, you agree to be bound by the terms of this agreement. If you do not agree to the terms of this agreement, we are unwilling to license the Software to you. In this case, do not install or use the Software. Return the Software to Matrix Science Limited or their authorised distributor within 30 days of receipt for a full refund.

1 Licence

Matrix Science Limited owns the copyright in the Software contained within this package and all other copies which you are authorised by this agreement to make.

This licence is personal to you, either an individual or a single corporate entity, as the purchaser of a licence to use the Software and the licence granted herein is for your benefit only.

You may not use the Software in any way that permits unlicensed access to the Software. In particular, individuals who are not party to this licence or the general public must not be permitted access to the Software through a public network such as the Internet.

2 Permitted Users

As purchaser of a licence to use the Software, you may, subject to the following conditions:

2.1 load the Software onto and use it on a single computer (of the type identified on the package) which is under your control; and

2.2 copy the Software for backup and archival purposes and make up to two copies of the documentat
229. on   ionquery4 2167 784350 from 1084 900000 2   query  4
daemon   score4 39 11
daemon   Sigscore4 47
daemon   Selectpeptides 1

If the job is incomplete, or has failed, then an error will be returned:

    unknown_id  searchcontrol error nnn

with values of "nnn" as for "status".

ms-searchcontrol.exe   xmlresults   taskID <number>   reporttop  FILE AUTO num hits   sessionID <string>

If the job is complete, then this will return the results formatted as an XML instance document that conforms to the schema

    mascot/html/xmlns/schema/DistillerMascotSearch_1/DistillerMascotSearch_1.xsd

If the job is incomplete, or has failed, then an error will be returned:

    unknown_id  searchcontrol error nnn

with values of "nnn" as for "status".

ms-searchcontrol.exe   create task id   sessionID <string>

On failure, this will return:

    searchcontrol error nnn

with values of "nnn" as for "status". On success it will return:

    taskID nnn

ms-searchcontrol.exe   mascot job number   taskID <number>   sessionID <string>

This will return either the job number:

    mascotjobnumber nnnn

or:

    searchcontrol error nnn

with values of "nnn" as for "status".

ms-searchcontrol.exe   kill job   taskID <number>   sessionID <string>

If the task is successful, this will return the text "job killed". If the
230. on pertaining to NCBInr. Simply enable the predefined definition for NCBInr in Database Manager and the latest files will be downloaded automatically.

If your Mascot Server is not connected to the Internet, download the required files on a PC with Internet access and copy them to your Mascot Server. Download URLs and configuration information for popular databases can be found on the Matrix Science web site at http://www.matrixscience.com/help/seq_db_setup.html

As a convenience for users who have no access to the Internet, a copy of NCBInr, a comprehensive protein sequence database, is included with Mascot on a separate DVD, together with the required taxonomy files. As with the copy of SwissProt installed with the Mascot program files, these files were current at time of release, but will become increasingly out of date.

The procedure to make use of the NCBInr files on the DVD depends on whether you use Database Manager or not.

Using Database Manager

• Choose "Create New".

• Enter NCBInr as the database name, choose "Use predefined definition template", and select NCBInr from the list.

• If necessary, modify the location for the sequence database directory, then choose "Create".

• Unpack the Fasta file into the specified location and unpack the taxonomy files into the Mascot taxonomy directory, as described below under Manual Setup step 2.

• Choose "Activate".

Manua
231. onitor.exe must be running at all times.

Once the new licence file is in place, follow the hyperlink to Database Status. You should see a display similar to the following:

[Screenshot: Mascot search status page, showing the SwissProt database with status "Creating compressed files 63% complete".]

If an error occurs, use the links to the monitor log and the error log to investigate the cause. If all is well, you will see the following messages displayed on the status line for SwissProt:

    Creating compressed files
    Running 1st test
    First test just run OK
    Trying to memory map files
    Just enabled memory mapping
    In Use

You can begin exploring and using Mascot. However, do not
232. onitor program:

    ms-monitor DEBUG

Any error messages should be displayed on the screen. If possible, correct the faults, and then start the Mascot Service from the start menu. Note that the Mascot service should never be running at the same time as ms-monitor.exe is being run from the command line.

International Versions Of Windows

If Mascot is installed on a version of Windows that is not in the English language, then when the ms-status screen is displayed, it may have the error "Failed to initialise memory map".

To correct this fault, the following procedure is required:

1. You will need to find the names of the "groups" that your version of Windows uses for Administrators and Users. In German, for example, these names are "Administratoren" and "Benutzer" respectively. To see a list of user names, from the start menu select Programs, Administrative Tools (common), User Manager. The section at the bottom of the screen displays the group names. Make a note of the two names.

2. From the start menu, select: Programs, Mascot, Config, Stop Mascot Service.

3. From the start menu, select: Programs, Mascot, Config, Mascot Configuration File.

4. Scroll down to near the bottom of the file and find the line:
       NTIUserGroup Users
   and change this to, for example, for German:
       NTIUserGroup Benutzer

5. Find the line:
       NTMonitorGroup Administrators
   and change this to, for example, for German:
       NTMonit
233. or the Software.

7.2 In no event will we be liable to you for any indirect or consequential damages even if we have been advised of the possibility of such damages. In particular, we accept no liability for any programs or data made or stored with the Software nor for the costs of recovering or replacing such program or data.

7.3 Nothing in this clause limits our liability to you in the event of death or personal injury resulting from our negligence.

8 Termination

8.1 The agreement and the licence hereby granted to use the Software automatically terminates if you:

8.1.1 fail to comply with any provisions of this agreement; or

8.1.2 voluntarily return the Software to us.

8.2 In the event of termination in accordance with clause 8.1 you must destroy or delete all copies of Software from all storage media in your possession.

9 Severability

In the event that any provision of this agreement is declared by any judicial or other competent authority to be void, voidable, illegal or otherwise unenforceable or indications of the same are received by either you or us from any relevant competent authority we shall amend that provision in such reasonable manner as achieves the intention of the parties without illegality, or at our discretion such provision may be severed from this agreement and the remaining provisions of this agreement shall remain in full force and effect.

10 Entire Agreem
234. orGroup Administratoren

Save the mascot.dat file.

Delete the files:

    c:\inetpub\mascot\sequence\SwissProt\current\SwissProt  a0d0o
    c:\inetpub\mascot\data\mascot.control

Note that these files may be in a different directory if you did not install Mascot in the default location.

From the start menu, select: Programs, Mascot, Config, Start Mascot Service.

Re-load the status page: Programs, Mascot, Search Status. You may need to refresh (re-load) the page.

Wait until the files have been compressed and a test search has been done. Mascot is now ready for use.

The site search facility does not work

The local Mascot web pages are indexed using a product called ht://Dig. A log file is made as the indexes are built during the installation. The log file mascot/htdig/build.log may contain an error message indicating the nature of the problem.

If the web server was not operational during Mascot installation, it will not have been possible to build the keyword index. To build or rebuild it, open a command window and enter the following commands. If Mascot was installed into a different path, you may have to modify the first two lines.

    C:
    cd \inetpub\mascot\htdig
    bin\htdig.exe -v
    bin\htmerge.exe -v

Once the commands have completed, keyword search using the control at the top right of the web pages should be operational.

Search status shows a failure to create compressed fi
235. oredown

In a peptide summary report, peptide matches that are not assigned to protein hits are initially sorted by descending score (scoredown). Alternatives for SortUnassigned are ascending query order (queryup) and descending intensity order (intdown). This global default can be overridden on an individual report URL by appending &_sortunassigned=X, where X is scoredown, queryup, or intdown.

SplitDataFileSize 10000000

Large searches are divided into "chunks", and no single chunk can exceed this number of bytes (default 10 Mb). When a search is divided into chunks, protein and peptide match data are no longer written to the summary section of the result file. This means that a Protein summary report cannot be generated.

SplitNumberOfQueries 1000

Large searches are divided into "chunks", and no single chunk can exceed this number of queries (default 1000). When a search is divided into chunks, protein and peptide match data are no longer written to the summary section of the result file. This means that a Protein summary report cannot be generated.

StoreModPermutations 1

If set to 0, only the highest scoring permutation of variable modifications for each unique peptide sequence is retained in the list of the top 10 ions scores. If set to 1, then different permutations of variable modifications are treated as independent matches, creating the possibility that all 10 top ions scores correspond to the same primary sequenc
236. [Screenshot: Windows Network and Sharing Center, showing a private network with file sharing turned off.]

Select Administrative Tools and launch Windows Firewall with Advanced Security. Select Inbound Rules in the left hand panel and New Rule in the action panel.

[Screenshot: New Inbound Rule Wizard, Rule Type step, with the options Program, Port, Predefined, and Custom.]

In the wizard, choose Port, Next, TCP, Specific Local Ports, 5001, Next, Allow Connection, Next. Clear the checkbox for Domain and Public, Next. Enter the name as MascotNodePort5001, Finish. The new rule will be added
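Once an inbound rule for TCP port 5001 is in place, a quick connectivity check from the master can confirm that the port is reachable. The following is a minimal sketch using only the Python standard library. The function name `port_is_open` and the host name in the usage note are hypothetical. A successful connection shows only that something is listening on the port and that the firewall allows it, not that the Mascot node service is working correctly.

```python
import socket


def port_is_open(host: str, port: int = 5001, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; 5001 is the port used in the
    firewall rule described above.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Usage might look like `port_is_open("node1")`, where `node1` stands in for the name of a search node listed in nodelist.txt.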
237. ot a sequence database directory. Under Windows, remember that the directory separator in Database Manager and in mascot.dat must be a forward slash, not a back slash.

DVD read errors

File checksums and sizes, as reported by cksum, for the files on the Databases DVD:

    3537046107 4218730222 NCBInr_20120419.fasta.gz
    657867792  149274785  taxonomy.tar.gz

Configuration & Log Files

Configuration Files

Mascot configuration files are located in the mascot/config directory.

unimod.xml defines mass values and modifications, including substitutions.

enzymes defines enzyme cleavage specificity.

fragmentation_rules specifies which fragment ion series correspond to defined instrument types.

mascot.dat contains general configuration information. If you use Database Manager, do not modify the sequence database related sections of mascot.dat because any changes will be lost when Database Manager is next used.

taxonomy specifies the taxonomy filter choices for the search form, described in Chapter 9.

quantitation.xml defines quantitation methods.

nodelist.txt configures the systems belonging to a Mascot cluster, described in Chapter 11.

user.xml, group.xml, security_options.xml, and security_tasks.xml are the configuration files for Mascot security, described in Chapter 12.

mod_file, masses, and substitutions are obsolete configuration files that are created on the fly from unimod.xml to support third party appli
238. p
Content-Type: application/x-Mascot; name="quantitation"

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<quantitation majorVersion="1" minorVersion="0" xmlns="http://www.matrixscience.com/xmlns/schema/quantitation_1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.matrixscience.com/xmlns/schema/quantitation_1 quantitation_1.xsd">
<method constrain_search="false" description="15N metabolic labelling" min_num_peptides="2" name="15N Metabolic [MD]" prot_score_type="mudpit" protein_ratio_type="weighted" report_detail="true" require_bold_red="true" show_sub_sets="0.5" sig_threshold_value="0.05">
<component name="light">
<isotope/>
</component>
<component name="heavy">
<isotope>
<old>N</old>
<new>15N</new>
</isotope>

This section is an extract from quantitation.xml containing the quantitation method specified for the search. For more details and a link to the schema, refer to the Mascot HTML help pages for quantitation.

Unimod

--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="unimod"

<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<umod:unimod xmlns:umod="http://www.unimod.org/xmlns/schema/unimod_2" majorVersion="2" minorVersion="0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLoc
239. ppy faces and the status line will display the following messages:

    Creating compressed files
    Running 1st test
    First test just run OK
    Trying to memory map files
    Just enabled memory mapping
    In Use

Once the database is "In use", you can begin exploring and using Mascot. Clicking on the links in the cluster node table will display more detailed status information for individual nodes.

Linux

Communication

Under Linux, the master node communicates with the search nodes using either ssh (preferred) or rsh. If communication can be established using ssh, then scp is used for file copying. If rsh is used for communication, then rcp is used for file copying.

Whether ssh or rsh is used, it is essential that communication can be established without requiring passwords or passphrases. In the case of ssh, key-based authentication is the preferred mechanism. A less secure alternative for rsh is provided by file-based authentication using .rhosts or hosts.equiv.

A detailed description of the many ways to configure ssh or rsh is outside the scope of this manual. For key-based authentication, read the man pages for: ssh, sshd, ssh-keygen, ssh-add, ssh-agent. For file-based authentication, read the man pages for: rsh, rshd, rlogin, hosts.equiv.

The minimum procedure to set up key-based authentication for ssh on a clean Linux system, where there are no pre-existing keys, is as follows:

1. Login to the mast
240. propriate commands.

• See if any files are missing or out of date (see above), and if necessary, update them. This is done through the TCP/IP socket, so no directory mapping / NFS mounts are required.

Once all the Mascot nodes have been successfully initialised, Mascot Monitor starts as normal.

Licensing

The number of processors that the search is permitted to run on is restricted by the number of Mascot licences. The Mascot master node is not included in this count, since it merely distributes the search and collates the results. The number of processors to be used for Mascot will never exceed the number specified in the licence.

Error messages and emails

In the single server version of Mascot, selected warning messages can optionally be emailed to the system administrator when something critical, such as a database update, fails on the server. The following additional messages, specific to a cluster, can also be emailed:

M00323 One or more cluster nodes has stopped responding.

M00316 Dr. Watson log updated, indicating a software crash, on one of the cluster nodes.

Who Am I?

If the Mascot master is also being used as a node, when nph-mascot.exe is run, it needs to know whether it is running as a node task or as a master task. Since the different mascot.dat files are identical, it determines this from a file, mascot/config/iam.dat, that is created by the Mascot node service when it starts up. Do not copy or replace this fi
241. pyright (c) 2004-2005 The University of Tennessee and The University of Tennessee Research Foundation. All rights reserved.
Copyright (c) 2004-2005 High Performance Computing Center Stuttgart, University of Stuttgart. All rights reserved.
Copyright (c) 2006, 2007 Advanced Micro Devices, Inc. All rights reserved.

$COPYRIGHT$

Additional copyrights may follow

$HEADER$

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

- Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

- Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer listed in this license in the documentation and/or other materials provided with the distribution.

- Neither the name of the copyright holders nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

The copyright holders provide no reassurances that the source code provided does not infringe any patent, copyright, or any other intellectual property rights of third parties. The copyright holders disclaim any liability to any recipient for claims brought against recipient by any third party for infringement of that parties intellectual property rights.

THIS SOFTWARE IS PROVIDED BY THE
242. r by appending _featuretablelength=X to the protein view URL, where X is the length in bases.

FeatureTableMinScore

By default, only matches with significant scores (p < 0.05) are output. A different score threshold can be specified using the parameter FeatureTableMinScore in the Options section of mascot.dat or by appending _featuretableminscore=X to the protein view URL, where X is the score threshold.

ForkForUnixApache 0

If a user presses "Stop" or goes to another page in their browser when a search is running, then the intended behaviour is that the search should continue, and the user be emailed with their results. However, when running some versions of Apache, the search is terminated by Apache when the connection to the browser is lost. To stop this from happening, set this value to 1. Setting this parameter to 1 with other servers can cause problems, so only use this setting if necessary. (When set to 1, the result is that nph-mascot.exe ignores PIPE signals, does a fork, the parent exits and the child then ignores HUP signals.)

FormVersion 1.01

Mascot users may save search forms off-line, or submit searches using scripts or private forms. When the search engine is upgraded, there is the possibility that old scripts or forms may contain invalid or obsolete parameters. If a search is submitted to Mascot without a version number, or if the version number is lower than that specified b
243. [Screenshot: installer component selection dialog; Location: C:\inetpub\mascot\]

If IIS is installed and functional, the default selections will be as shown above, with IIS being configured automatically. If you don't have IIS installed, the Apache option will be selected instead. A test for whether Apache or some other web server is actually installed comes later.

You can de-select the SwissProt database, but if this is a clean install, you are advised not to do so. It is better to proceed with a full installation, so that correct installation of Mascot can be verified. If you don't want SwissProt to be available, you can easily remove it later.

The default location for the installation is \inetpub\mascot on the drive with most free space, with the sequence databases in \inetpub\mascot\sequence. You can change one or both of these by selecting the component then choosing Browse. If there is insufficient disk space on the selected drive(s), the installation will not be able to continue.

The next step depends on whether IIS or Apache was selected as the web server. For IIS, there will be a drop-down list of all the available web sites. In most cases, you should select "Default Web Site". If you select a different web site, refer to the notes on multiple web sites later in this chapter.

IIS Confi
244. rTestTimeout 1200

A time-out can be applied to the test searches used to validate a new database. If the test search on a new database does not produce a valid result within the number of seconds specified by MonitorTestTimeout, the problem is assumed to be with the new database, and the exchange process is halted.

MoveOldDbToOldDir 1

After a successful database swap, the old Fasta file and old reference file (if any) are moved to the old directory unless this parameter is present and set to 0. Note that, if set to 0, the old files are not deleted. Some other application must take care of this or there will be problems next time Monitor starts up.

Mudpit 1000

Obsolete, see MudpitSwitch.

MudpitSwitch 0.001

Mascot has two ways to calculate protein scores in a Peptide or Select summary report. Standard scoring is used when the ratio between the number of queries and the number of database entries (after any taxonomy filter) is small. The standard score is the sum of the ion scores after excluding duplicate matches and applying a small correction. Protein score calculation switches to large search mode when the ratio between the number of queries and the number of database entries (after any taxonomy filter) exceeds the value specified by MudpitSwitch. Only those ions scores that exceed one or both significance thresholds contribute to the score, so that low scoring, random matches have no effect. The global default can also be over-ridd
245. rcumflex (^) and specifies a list that matches any character except for the characters in the list after the leading circumflex. For example, [^abc] matches any one character except the characters a, b or c. The circumflex will have this special meaning only when it occurs first in the list, immediately following the left bracket.

A range expression represents the inclusive set of characters between two characters in the ASCII character set. The starting and ending characters are separated by a hyphen. For example, [A-Z] will match to any single upper case letter, while [0-9A-Za-z] matches any single alphanumeric character.

Matching Multiple Characters

When a BRE matching a single character or a subexpression is followed by the special character asterisk (*), together with that asterisk it matches zero or more consecutive occurrences of that character or subexpression. For example, [ab]* and [ab][ab] are equivalent when matching the string ab. The expression ab*c will match to ac or abc or abbbbbbc.

When a BRE matching a single character or a subexpression is followed by an interval expression of the format \{m\}, \{m,\} or \{m,n\}, together with that interval expression it matches what repeated consecutive occurrences of the BRE would match. The values of m and n will be decimal integers in the range 0 <= m <= n <= 255, where m specifies the exact or minimum number of occurrences and n specifies the max
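The bracket, asterisk and interval rules described in this appendix can be tried out with a short Python sketch. This is an illustration only: Python's re module is not a POSIX BRE engine, so it is an assumption that it behaves identically to Mascot's matcher. For the simple cases here the bracket and asterisk syntax is the same, but intervals are written {m,n} rather than \{m,n\}. The helper name matches is hypothetical.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of text is matched by pattern."""
    return re.fullmatch(pattern, text) is not None


# Non-matching list: any one character except a, b or c.
print(matches(r"[^abc]", "d"), matches(r"[^abc]", "a"))

# Range expressions: one upper case letter, one alphanumeric character.
print(matches(r"[A-Z]", "Q"), matches(r"[0-9A-Za-z]", "_"))

# Asterisk: zero or more consecutive occurrences of the preceding b.
for candidate in ("ac", "abc", "abbbbbbc"):
    print(candidate, matches(r"ab*c", candidate))

# Interval: between 2 and 3 occurrences (Python spelling of \{2,3\}).
print(matches(r"b{2,3}", "bbb"), matches(r"b{2,3}", "bbbb"))
```

Using fullmatch keeps the checks strict: a pattern only counts as matching when it accounts for the entire test string.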
246. rd Mascot login screen will be displayed, but authentication is performed using Mascot Integra. The Mascot Integra server details must be specified in the options section of the security administration utility.

IP address

This "user" should only be used for third party legacy applications that do not support Mascot security. Instead of a user name, enter the static IP address of the computer that will access the Mascot server. Do not enter a password.

Computer name

Same as the IP address, but the computer name is used instead. A computer name is more practical where dynamic IP addresses are being used.

Agent string

Should only be used as a last resort for third party applications that haven't implemented Mascot security and where the computer name / IP address is not reliable. A case sensitive substring comparison will be made with the HTTP_USER_AGENT environment variable.

Use built-in web server authentication

See description of "authentication" above.

Mascot will never prompt these users for username and password, and hence passwords and password expiry will be ignored.

Mascot security session time-outs do not apply.

In Microsoft Internet Information Services (IIS), if anonymous access and integrated authentication are both enabled, then users will generally be "logged in" as anonymous until they try to access a file where permission is denied. This almost certainly means that anonymous login must be disabled 
247. rdless of who wrote it.

Thus, it is not the intent of this section to claim rights or contest your rights to work written entirely by you; rather, the intent is to exercise the right to control the distribution of derivative or collective works based on the Library.

In addition, mere aggregation of another work not based on the Library with the Library (or with a work based on the Library) on a volume of a storage or distribution medium does not bring the other work under the scope of this License.

3. You may opt to apply the terms of the ordinary GNU General Public License instead of this License to a given copy of the Library. To do this, you must alter all the notices that refer to this License, so that they refer to the ordinary GNU General Public License, version 2, instead of to this License. (If a newer version than version 2 of the ordinary GNU General Public License has appeared, then you can specify that version instead if you wish.) Do not make any other change in these notices.

Once this change is made in a given copy, it is irreversible for that copy, so the ordinary GNU General Public License applies to all subsequent copies and derivative works made from that copy.

This option is useful when you wish to copy part of the code of the Library into a program that is not a library.

4. You may copy and distribute the Library (or a portion or derivative of it, under Section 2) in object code or executab
248. re is an error  one of the following will be returned   unknown_id   job_not_ running   searchcontrol error nnn   with values of    nnn    as for   status     The    kill    is implemented by setting a flag in the mascot control memory  mapped file  The nph mascot exe task is responsible for    killing    itself     ms searchcontrol exe   pause job   taskID  lt number gt      sessionID  lt string gt      If the task is successful  this will return the text   job_paused   If there is an error  one of the following will be returned   unknown_id   job_not_ running   job_already_ paused   searchcontrol error nnn   with values of    nnn    as for   status     The    pause    is implemented by setting a flag in the mascot control  memory mapped file  The nph mascot exe task is responsible for    paus   ing    itself     ms searchcontrol exe   resume job   taskID  lt number gt      sessionID  lt string gt     If the task is successful  this will return the text    job_resumed   If there is an error  one of the following will be returned     unknown_id    Chapter 7  Program Reference 139    job_not_running  job_not_paused  searchcontrol error nnn  with values of    nnn    as for   status     The    resume    is implemented by setting a flag in the mascot control  memory mapped file  The nph mascot exe task is responsible for    resum   ing itself     ms searchcontrol exe   nice job   taskID  lt number gt       nice  lt integer gt      sessionID  lt string gt      The task ID need to 
249. required when running a search.

Species specific nucleic acid databases

Even a species specific database, such as EST_human, requires taxonomy to be defined at the database level, so that the correct genetic code can be chosen.

For EST_human, the default taxonomy block in mascot.dat is:

# TAXONOMY FOR EST human with TaxID
Taxonomy_10
Enabled 1 # 0 to disable it
SpeciesFiles NCBI:names.dmp
NodesFiles NCBI:nodes.dmp, NCBI:merged.dmp
Identifier All human with TaxID 9606
GencodeFiles NCBI:gencode.dmp
MitochondrialTranslation 0
TaxID 9606
End

MitochondrialTranslation is set to 0 (off) and TaxID is set to 9606, specifying that all database entries are Homo sapiens. So, genetic code 1 (standard) will be selected for all entries.

HUPO PSI PEFF Format

The HUPO Proteomics Standards Initiative PEFF Fasta format is described here: http://www.psidev.info/index.php?q=node/363

# TAXONOMY FOR PEFF
Taxonomy_14
Identifier HUPO PSI PEFF Format
Enabled 1 # 0 to disable it
FromRefFile 0
ErrorLevel 0
SpeciesFiles NCBI:names.dmp
NodesFiles NCBI:nodes.dmp, NCBI:merged.dmp
DefaultRule EXPLICIT  CHOP       NcbiTaxId    0 9
end

The NCBI taxonomy ID can be parsed directly from the title line.

Mascot Daemon

Overview

Mascot Daemon is a client application that automates the submission of searches to a Mascot Server. Functionality includes:

1. Batch mode: in 
250. residue  hl_ql_subst pos1 ambigl1 matched1      posn  ambign  matchedn  h1l_pl_summed_mods variable modifications string  hl _q2e             hl1_qme     h2e       hn_qme         gc0p4JqOM2Yt084jU534c0p

Where a parameter has multiple values, these are shown on separate lines for clarity. In the actual result file, all values for a parameter are on a single line and there are no spaces or tabs between values.

Variable modifications is a string of digits: one digit for the N terminus, one for each residue and one for the C terminus. Each digit specifies the modification used to obtain the match. 0 indicates no modification, 1 indicates delta1, 2 indicates delta2 etc., in the masses section. If the number of modifications exceeds 9, the letters A to W are used to represent modifications 10 to 32. X is used to indicate a modification found in error tolerant mode.

Neutral loss string is the same concept as the variable mod string, except each character represents the index of the primary neutral loss (one of the master NL). Any position that is not modified, or where the mod has no neutral loss, is set to 0. hn_qm_primary_nl will only be output if the string contains at least one non-zero character.

If a new modification is found in an error tolerant search, its position is marked by X, and details are recorded in an additional entry, hn_qm_et_mods. If the error tolerant search is of a nucleic acid database, and the 
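The variable modifications string encoding described above (one character each for the N terminus, every residue and the C terminus, with 0 to 9, A to W and X as the alphabet) can be decoded with a few lines of Python. This is a minimal sketch under stated assumptions: the function names, the returned dictionary layout and the sample string are hypothetical and are not part of Mascot or Mascot Parser.

```python
from typing import Dict, List, Union

ModIndex = Union[int, str]


def decode_mod_char(ch: str) -> ModIndex:
    """Map one character of the string to a modification index.

    '0' means unmodified, '1'..'9' are delta1..delta9, 'A'..'W' are
    modifications 10..32, and 'X' marks an error tolerant modification.
    """
    if ch == "X":
        return "X"
    if ch in "0123456789":
        return int(ch)
    if "A" <= ch <= "W":
        return 10 + ord(ch) - ord("A")
    raise ValueError(f"unexpected character in modification string: {ch!r}")


def decode_mod_string(mods: str) -> Dict[str, Union[ModIndex, List[ModIndex]]]:
    """Split the string into N terminus, per-residue and C terminus values."""
    if len(mods) < 2:
        raise ValueError("string must cover at least both termini")
    decoded = [decode_mod_char(ch) for ch in mods]
    return {
        "n_term": decoded[0],
        "residues": decoded[1:-1],
        "c_term": decoded[-1],
    }


# Hypothetical string for a five-residue peptide.
print(decode_mod_string("0010A0X"))
```

Keeping 'X' as a distinct marker, rather than folding it into the integer range, reflects the fact that its details are recorded in a separate entry and not in the masses section.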
251. rograms. These disadvantages are the reason we use the ordinary General Public License for many libraries. However, the Lesser license provides advantages in certain special circumstances.

For example, on rare occasions, there may be a special need to encourage the widest possible use of a certain library, so that it becomes a de-facto standard. To achieve this, non-free programs must be allowed to use the library. A more frequent case is that a free library does the same job as widely used non-free libraries. In this case, there is little to gain by limiting the free library to free software only, so we use the Lesser General Public License.

In other cases, permission to use a particular library in non-free programs enables a greater number of people to use a large body of free software. For example, permission to use the GNU C Library in non-free programs enables many more people to use the whole GNU operating system, as well as its variant, the GNU/Linux operating system.

Although the Lesser General Public License is Less protective of the users' freedom, it does ensure that the user of a program that is linked with the Library has the freedom and the wherewithal to run that program using a modified version of the Library.

The precise terms and conditions for copying, distribution and modification follow. Pay close attention to the difference between a "work based on the library" and a "work that uses th
252. running     A queued job will return    queued    when ms   searchcontrol exe is called with the    status argument     ms searchcontrol exe   version    sessionID     lt string gt      If the task is successful  this will return the version number    CreatePIP    Usage  ms createpip exe  OPTION   i filename    Options    h     help     f     features  exit       sessionID  lt id gt   mand line     o output_file     q  lt  queries gt   mascot dat     s  lt  sequences gt   mascot dat     a  lt feature gt    r  lt feature gt    C     p  lt interval gt   seconds       nocache       version    display this help page and exit    display list of features defined in mascot dat and    not normally used because this is run from com     default is filename pip    override minimum number of queries set in    override minimum number of sequences set in    add a feature to the list specified in mascot dat  remove a feature to the list specified in mascot dat  use cached results    progress reports of Process X  every  lt interval gt     do not use cached results    display version number and exit    Chapter 7  Program Reference 141    Features   retentionTime Retention time in seconds if available  dM Calculated minus observed peptide mass in Da  mScore Mascot score  always on   lgDScore Mascot score minus Mascot score of next best non   isobaric peptide hit  mrCalc Calculated Mr  charge Charge  dMppm Calculated minus observed peptide mass in ppm  absDM Absolute value of calculated minus 
253. ry), the recipient automatically receives a license from the original licensor to copy, distribute, link with or modify the Library subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. You are not responsible for enforcing compliance by third parties with this License.

11. If, as a consequence of a court judgment or allegation of patent infringement or for any other reason (not limited to patent issues), conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot distribute so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not distribute the Library at all. For example, if a patent license would not permit royalty-free redistribution of the Library by all those who receive copies directly or indirectly through you, then the only way you could satisfy both it and this License would be to refrain entirely from distribution of the Library.

If any portion of this section is held invalid or unenforceable under any particular circumstance, the balance of the section is intended to apply, and the section as a whole is intended to apply in other circumstances.

It is not the purpose of this section to induce you 
254. s a utility for retrieving taxonomy details for an entry in a database configured for use by Mascot. The utility can be used to retrieve information for a single entry, or in batch mode.

Single entry mode

The executable, x-cgi/ms-gettaxonomy.exe, can be called from the command line, or via a URL as a CGI application.

When calling as a CGI application, with arguments appended to the URL, the parameter list must be URL escaped. (Spaces replaced by "+" and characters other than letters or numbers replaced by a "%xx", where xx is the ASCII code for the character as a hexadecimal number.)

When running from a command line, the accession string should be enclosed in single or double quotes. This is essential for accession strings beginning gi|, because the pipe character has special meaning in Linux and Windows.

In the table below, the first argument supplied to ms-gettaxonomy.exe is an integer to specify the mode. The remaining arguments are selected from:

database     Mascot database name, e.g. NCBInr
accession    accession string, e.g. gi|7633482
tax_ID       taxonomy ID number, e.g. 9606
species      name of species, e.g. "homo sapiens"

[Screenshot: browser window showing ms-gettaxonomy.exe output titled "Taxonomy for CCHU", with CCHU reported as Homo sapiens (human, man)]
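The escaping and quoting rules for ms-gettaxonomy.exe arguments can be sketched in Python using only the standard library. This is an illustration, not part of Mascot: the helper names are hypothetical, the shell quoting shown applies to POSIX shells only, and the sketch deliberately does not assemble a full URL or command line because the exact argument layout is not reproduced here. Note that quote_plus leaves the characters _ . - ~ unescaped, which is slightly more permissive than the rule stated in the text.

```python
import shlex
from urllib.parse import quote_plus


def escape_cgi_arg(value: str) -> str:
    """URL-escape one argument: spaces become '+', and most other
    non-alphanumeric characters become %xx (uppercase hexadecimal)."""
    return quote_plus(value)


def quote_shell_arg(value: str) -> str:
    """Quote one argument for a POSIX shell so that characters such as
    the pipe are passed to the program literally."""
    return shlex.quote(value)


# Accession strings beginning gi| need escaping in both contexts.
print(escape_cgi_arg("gi|7633482"))
print(escape_cgi_arg("homo sapiens"))
print(quote_shell_arg("gi|7633482"))
```

On Windows, the equivalent of the shell quoting step is to wrap the accession string in double quotes, as the text recommends.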
255. s also need to be able to read and write to these files. For example, with the Microsoft Web server (IIS), a new user with the name IUSR_<name_of_pc> is created when the server is installed, and the scripts are run using this user name. The installation program sets these values appropriately. Other Web servers may use different user names, with different permissions.

NTIUserGroup is the name of a group that the user name of the process to run CGI scripts belongs to. NTMonitorGroup is the name of the local Administrators group.

If not using IIS, check the documentation that comes with the server to find out which user name is used for running scripts, then from the start menu, choose Programs, Administrative Tools (Common), and User Manager. Double click on the user name, and press the Groups button to find out which groups this user name belongs to. This is the name to put in mascot.dat for NTIUserGroup.

Failure to put the correct group name will generally result in one of two error messages:

Failed to open memory mapped file <filename>
Error: access denied

or

Failed to create memory map for <filename>
Error: Access denied

After changing either of these entries, the Mascot service will need to be stopped (from the start menu, choose Programs, Mascot, config, Stop Mascot service). All compressed database files must be deleted. Then the Mascot service 
256. s normally generated by the web browser  If an   other application is used to generate an input file  simply ensure that it  conforms to the MIME format standard     The Mascot Monitor test searches use    captured    input files  Hence  an  example of a file can be seen by opening mascot  data test   SwissProt  asc in any text editor     243501029130836  Disposition  form data     243501029130836  Disposition  form data     243501029130836  Disposition  form data     243501029130836  Disposition  form data     243501029130836  Disposition  form data     243501029130836  Disposition  form data     243501029130836  Disposition  form data     243501029130836  Disposition  form data     Program Test    name     INTERMEDIATE       name   FORMVER       name   SEARCH       name   PEAK       name  REPTYPE       name  ErrTolRepeat       name   SHOWALLMODS       name   USERNAME       243501029130836  Content Disposition  form data     MS MS Test Search  243501029130836  Content Disposition  form data     SwissProt  243501029130836  Content Disposition  form data     Trypsin P  243501029130836  Content Disposition  form data     1  243501029130836  Content Disposition  form data     None  243501029130836  Content Disposition  form data     All entries  243501029130836  Content Disposition  form data     Carbamidomethyl  C   243501029130836  Content Disposition  form data   Oxidation  M   243501029130836  Content Disposition  form data     100  243501029130836  Content Disposition  form
257. s to the Apache configuration. Add the following ScriptAlias entry, immediately before the ScriptAlias for /mascot/cgi:

ScriptAlias /mascot/cgi/htsearch /usr/lib/cgi-bin/htsearch

On Red Hat/CentOS, /usr/lib/cgi-bin should be replaced with /var/www/cgi-bin.

You may also need to add the following if you get 403 errors, especially if you have Mascot defined in a separate virtual host:

<Directory /usr/lib/cgi-bin>
Order allow,deny
Allow from all
</Directory>

Finally, build an index of the Mascot web site documents:

rundig -v

This may need to be run by the web server user or root, depending on how htdig has been installed and configured. Indexing will only take a minute or two. Use of the -v flag causes verbose progress reports to be generated.

Miscellaneous

Hyper-threading (Intel only)

Hyper-threading is a technique used by Intel to improve the performance of multi-threaded programs. Hyper-threading does not double performance because pairs of cores share other resources, such as the on-chip cache. On some systems, a BIOS setting can be used to enable and disable hyper-threading.

Hyper-threading is detected automatically. Each CPU in the Mascot licence enables up to 4 cores to be used for searches. Hyper-threading is ignored when counting cores, so that you may see a 1 CPU licence using 8 threads on a system with a quad core processor with hyper-threading enabled.

File System
258. [Screenshot: Windows Services list showing service descriptions and Status "Started"]

Highlight the entry for Matrix Science Monitor Service and press Start. If the service fails to start, the cause must be investigated and the problem fixed before proceeding.

Monitor progress using the Database Status page on the master. Choose Monitor log and watch for error messages as the program files and database files are copied to the search nodes.

Completion

The installation is now complete. There will be a lot of disk activity while the Mascot service compresses the SwissProt sequence database. Searches on the database cannot be performed until the files have been compressed. You should open up the status screen in a web browser (Start menu, Programs, Mascot, Search Status) and verify the cluster status.

If this is a clean installation or a version update, you will need to follow the links to register a product key as described in Chapter 2 (Linux installation) or Chapter 3 (Windows installation). Once the licence file has been saved to config/licdb on the master node, you will be able to proceed to Database Status.

[Screenshot: Mascot search status page in Mozilla Firefox]
259. script can be edited.

SubClusterSet X Y

Large clusters can be divided into sub-clusters. X is a unique integer value, 0 based, used to identify the sub-cluster. Y is the maximum number of processors in the sub-cluster. A single cluster must have a single entry with X set to 0.

IPCTimeout

The timeout in seconds for inter-process communication.

IPCLogging

0 for no logging of inter-process communication
1 for minimal logging
2 for verbose logging

IPCLogfile

The relative path to the inter-process communication log file.

CheckNodesAliveFreq

The interval in seconds between "health checks" on the nodes.

SecsToWaitForNodeAtStartup

At startup, if a node is not available within this time, the system will continue to start up without that node. If the value is set to 0, then the system will wait indefinitely. Default is 60 (seconds).

This timeout is also used if a node fails while the system is running. The system will wait for this number of seconds before re-initialising ms-monitor.exe. This means that a short-lived interruption in network communication doesn't create a major service interruption.

MascotNodeRebootScript

Path to an optional CGI script to re-boot a cluster node. If this parameter is defined, there will be a link at the bottom of each Mascot Cluster Node status page. Clicking on this link will execute the specified CGI script with the host name of the specified node as an argument.
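The documented meaning of SecsToWaitForNodeAtStartup (give up on a node after the timeout, or wait indefinitely when the value is 0) can be illustrated with a small Python sketch. This is not how ms-monitor.exe is implemented; the function name, the polling approach, the poll interval and the injectable clock and sleep arguments are all assumptions made so that the logic can be read and tested in isolation.

```python
import time
from typing import Callable


def wait_for_node(is_available: Callable[[], bool],
                  secs_to_wait: float = 60,
                  poll_interval: float = 1.0,
                  clock: Callable[[], float] = time.monotonic,
                  sleep: Callable[[float], None] = time.sleep) -> bool:
    """Return True once the node is available, False if the wait times out.

    secs_to_wait == 0 means wait indefinitely, mirroring the documented
    behaviour of SecsToWaitForNodeAtStartup. The default of 60 matches
    the documented default.
    """
    deadline = None if secs_to_wait == 0 else clock() + secs_to_wait
    while True:
        if is_available():
            return True
        if deadline is not None and clock() >= deadline:
            return False  # continue start-up without this node
        sleep(poll_interval)


# Example with a simulated clock: the node answers on the third poll.
now = [0.0]
polls = [0]


def fake_clock() -> float:
    return now[0]


def fake_sleep(seconds: float) -> None:
    now[0] += seconds


def node_up_on_third_poll() -> bool:
    polls[0] += 1
    return polls[0] >= 3


print(wait_for_node(node_up_on_third_poll, 60, 1.0, fake_clock, fake_sleep))
```

Passing the clock and sleep functions as arguments is a design choice for the sketch: it allows the timeout path to be exercised without actually waiting.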
260. se.

3. You may copy and distribute the Program (or a work based on it, under Section 2) in object code or executable form under the terms of Sections 1 and 2 above provided that you also do one of the following:

a) Accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,

b) Accompany it with a written offer, valid for at least three years, to give any third party, for a charge no more than your cost of physically performing source distribution, a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Sections 1 and 2 above on a medium customarily used for software interchange; or,

c) Accompany it with the information you received as to the offer to distribute corresponding source code. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form with such an offer, in accord with Subsection b above.)

The source code for a work means the preferred form of the work for making modifications to it. For an executable work, complete source code means all the source code for all modules it contains, plus any associated interface definition files, plus the scripts used to control compilation and installation of the executable. However, as a special ex
261. se the whole of the work to be licensed at no charge to all third parties under the terms of this License.

d) If a facility in the modified Library refers to a function or a table of data to be supplied by an application program that uses the facility, other than as an argument passed when the facility is invoked, then you must make a good faith effort to ensure that, in the event an application does not supply such function or table, the facility still operates, and performs whatever part of its purpose remains meaningful.

(For example, a function in a library to compute square roots has a purpose that is entirely well-defined independent of the application. Therefore, Subsection 2d requires that any application-supplied function or table used by this function must be optional: if the application does not supply it, the square root function must still compute square roots.)

These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Library, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Library, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part rega
262. second column is the parent taxonomy ID. Note that the "parent" of Arabidopsis thaliana (3702) is Arabidopsis (3701).

3700 | 3699 | family | ...
3701 | 3700 | genus | ...
3702 | 3701 | species | AT | ...

Both files can be obtained from the NCBI ftp site:
ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdump.tar.gz

For NCBInr, you will also need gi_taxid_prot.dmp.gz. For NCBI EST databases, you will need gi_taxid_nucl.dmp.gz.

You should not modify the names.dmp and nodes.dmp files in the taxonomy directory. If you wish to add more entries, a new file should be made with just the new entries. Mascot will load multiple files as specified below. Most Mascot updates will contain the updated names.dmp and nodes.dmp files.

PDBeast File

This file contains a list of entries that are derived from the Brookhaven Protein databank (PDB). The file is available at:
ftp://ftp.ncbi.nih.gov/mmdb/pdbeast/table

SwissProt File

SwissProt also supplies a file, speclist.txt, that is similar to the NCBI names.dmp file, except that it gives the NCBI taxonomy ID for the SwissProt Code. A regular expression is used to extract the "Code" and "Taxon Node" from the file. The regular expression should be defined in any Taxonomy_x section that uses speclist.txt and is defined as:

SWISSPROTRegex        A Z0 9       ABEV      0 9            Code T
263. sion   Content Type

Both Mascot search input files and results output files are in MIME format. This is a text file which can be viewed easily for inspection or debugging purposes.

The MIME format is defined in various "request for comment" documents. The following are the most relevant:

ftp://ftp.isi.edu/in-notes/rfc2045.txt
ftp://ftp.isi.edu/in-notes/rfc2046.txt
ftp://ftp.isi.edu/in-notes/rfc2388.txt

Very briefly, a unique boundary string is used to divide the file into sections, each of which contains data in a format defined by a Content-type.

Each section begins with two hyphens followed by the boundary string. The next line contains the content definition and name, followed by a blank line. Then data, until the beginning of the next section. For example:

MIME-Version: 1.0 (Generated by Mascot version 1.0)
Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08jU534c0p

--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="first_name"

first value
--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="another_name"

another value
--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/x-Mascot; name="final_name"

final value
--gc0p4Jq0M2Yt08jU534c0p--

Search Input File

Content     Content     1 01    Content     MIS    Content     AUTO    Content     peptide    Content     Content     Content     Monitor

The search input file i
264. sk will then appear in the lower list. Similarly, to remove a task, click on the check box in the lower list, and click on the "Remove" button. To get further information about any task, hold the mouse over the task in the lower window and further details will appear in the help box.

No changes to a group are saved until the "Save changes" button is pressed.

Session files

Session files are created in the mascot/sessions directory. Sessions that have expired will be deleted automatically by ms-monitor.

Log file

The log file ("security.log") in the mascot/logs directory contains information about all security changes. The file is not available from any web based application for security reasons. The level of logging can be controlled from the security administration utility.

Configuration Files

Security information is saved in the following configuration files in the mascot/config directory:

security_options.xml
security_tasks.xml
group.xml
user.xml

The schema for these files is mascot_security_1_0.xsd.

Use the security administration utility or Mascot Parser rather than editing these files manually.

Automating addition of new users

Mascot Parser users have access to all of the documentation for the lower level functions to administer Mascot security programmatically. The security administration utility uses some of these functions.

To simply add a large number of users, then
265. soft com en US windows downloads windows vista

[Screenshot: the Windows "Turn Windows features on or off" dialog with the Internet Information Services tree expanded, showing Web Management Tools (including IIS 6 Management Compatibility), World Wide Web Services, Application Development Features, Common Http Features, Health and Diagnostics, Performance Features, and Security.]
266. [Screenshot: instrument configuration editor, showing the maximum mass row with Edit and Delete links for each instrument, and the New Instrument and Main menu buttons.]

The INSTRUMENT search parameter is used to select the set of ion series used for scoring MS/MS matches.

File format (fragmentation_rules)

Each instrument is defined by a block of lines. Blocks are delimited from one another by a line containing an asterisk.

The first line of each block must start with the title: keyword, followed by a text string that is used to identify the instrument in forms and reports. The definition should be short and self-explanatory. It should only include alphanumeric characters and hyphens. The following lines start with an integer, each of which represents an ion series or a rule to be included in the definition. Refer to the file header for a list of available integers. Anything following a hash (#) symbol is treated as a comment.

A block can also specify mass range limits for internal ions. The default range is 0 to 700 Da, and could be changed as in this example:

title:MALDI-QIT-TOF
1 # singly charged
4 # immonium
5 # a series
6 # a - NH3 if a significant and fragment includes RKNQ
7 # a - H2O if a significant and fragment includes STED
8 # b series
9 # b - N
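To make the block structure of the fragmentation_rules file concrete, here is a minimal Python sketch that reads instrument blocks into a dictionary. It assumes the title keyword is followed by a colon and that rule lines contain only an integer once the comment is removed. The second instrument in the sample is hypothetical, and any other lines (such as mass range settings) are simply ignored.

```python
def parse_instruments(text):
    """Return {title: [rule numbers]} from fragmentation_rules-style text."""
    instruments = {}
    title, rules = None, []
    for raw in text.splitlines():
        # Anything following a hash symbol is treated as a comment.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if line == "*":
            # A line containing an asterisk ends the current block.
            if title is not None:
                instruments[title] = rules
            title, rules = None, []
        elif line.lower().startswith("title:"):
            title = line.split(":", 1)[1].strip()
        elif line.isdigit():
            rules.append(int(line))
    if title is not None:
        # Keep the last block even if no trailing asterisk follows it.
        instruments[title] = rules
    return instruments

SAMPLE = """\
title:MALDI-QIT-TOF
1   # singly charged
4   # immonium
5   # a series
*
title:EXAMPLE-2   # hypothetical second instrument
8   # b series
"""

result = parse_instruments(SAMPLE)
```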
267. st data. This means that a results file contains everything necessary to generate a report, repeat the search at a later date, or act as the self-contained input file to a project database or LIMS.

Mascot Parser provides an object-oriented Application Programmer Interface (API) to Mascot result files and configuration files, making it easy for programs written in C++, Java, Perl or Python to access Mascot results.

We strongly recommend that anyone writing software to process Mascot results uses Mascot Parser, just like all of the Mascot result report scripts:

• It makes application development much faster
• It makes your code simpler and easier to debug
• You don't have to worry about updating your code every time a new version of Mascot is released

The Mascot Parser package, which includes object libraries, header files, binary executables, extensive documentation, and example code for many functions, is available as a free download. For more information, go to http://www.matrixscience.com/msparser.html

For reference, the result file contents are divided into logical sections:

1. Search parameters
2. Mass values
3. Quantitation method (if used)
4. Unimod extract
5. Enzyme definition
6. Taxonomy (if a taxonomy filter was used)
7. Misc. header information
8. Summary results (for Protein Summary)
9. Mixtures (if PMF)
10. Summary of decoy results (if automatic decoy)
11. Summary of error tolerant results (if automatic ET)
268. st of performing this distribution.

d) If distribution of the work is made by offering access to copy from a designated place, offer equivalent access to copy the above specified materials from the same place.

e) Verify that the user has already received a copy of these materials or that you have already sent this user a copy.

For an executable, the required form of the "work that uses the Library" must include any data and utility programs needed for reproducing the executable from it. However, as a special exception, the materials to be distributed need not include anything that is normally distributed (in either source or binary form) with the major components (compiler, kernel, and so on) of the operating system on which the executable runs, unless that component itself accompanies the executable.

It may happen that this requirement contradicts the license restrictions of other proprietary libraries that do not normally accompany the operating system. Such a contradiction means you cannot use both them and the Library together in an executable that you distribute.

7. You may place library facilities that are a work based on the Library side-by-side in a single library together with other library facilities not covered by this License, and distribute such a combined library, provided that the separate distribution of the work based on the Library and of the other library facilities is otherwise permitted
269. start with the Title: keyword, followed by a text string that is used to identify the species in forms and reports. The definition should be short and self-explanatory. To show the tree structure, indentation can be used. Unfortunately, it is not possible to use tabs or multiple spaces for indentation in an HTML form, so a full stop (period) and a space are used to indent the list. Internal spaces are significant, and there should never be two or more spaces together.

This should be followed with a definition line starting with the Include: keyword, followed by one or more NCBI taxonomy IDs separated with commas.

This should be followed with a definition line starting with the Exclude: keyword, followed by one or more NCBI taxonomy IDs separated with commas. Any sequence with a taxonomy ID that passes the "include" test may then be rejected by any entry in the exclude list.

Finally, each entry must end with a *.

There are two ways of finding the NCBI taxonomy ID for a given species. The first is to open the file names.dmp in the mascot/taxonomy directory (under Windows, from the Start menu, choose Programs; Mascot; Config; NCBI taxonomy names.dmp file) and search for the species name. The ID is the number on the left. For example, the ID for Filicophyta is 3263:

3263 | Filicophyta | | scientific name |
3263 | ferns | | preferred common name |

Alternatively, the NCBI taxonomy browser can be used: http://www.ncbi.nlm.nih.gov/Taxonomy/
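The manual lookup in names.dmp can also be scripted. The following Python sketch is a minimal illustration, assuming the usual NCBI layout in which fields are separated by a vertical bar and the ID is the first field. The helper name find_tax_ids is hypothetical, and the sample lines simply reproduce the Filicophyta example.

```python
def find_tax_ids(lines, species):
    """Return the set of taxonomy IDs whose name matches `species` (case-insensitive)."""
    wanted = species.strip().lower()
    ids = set()
    for line in lines:
        # Fields are separated by "|"; the ID is the first field, the name the second.
        fields = [field.strip() for field in line.split("|")]
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].lower() == wanted:
            ids.add(int(fields[0]))
    return ids

# Sample lines in names.dmp style (tab-padded fields are an assumption about the file).
SAMPLE = [
    "3263\t|\tFilicophyta\t|\t\t|\tscientific name\t|",
    "3263\t|\tferns\t|\t\t|\tpreferred common name\t|",
]

ids_for_ferns = find_tax_ids(SAMPLE, "ferns")
```

With a real installation, the same function could be given an open file handle for names.dmp in place of SAMPLE; the path depends on where Mascot was installed.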
270. sted in pmf_ queries used    Summary results      gc0p40q0M2Yt084jU534c0p  Content Type  application x Mascot  name   summary     qmass1l Mr  qexpl m z for query 1   charge  qintensityl intensity value for queryl  if available   qmatchl Total number of peptide mass matches for queryl in database  qplugholel Threshold score for homologous peptide match  MIS only   qmass2e     qexp2      qintensityl   qmatch2      qplughole2        qmassne       qexpne      gqintensityn    qmatchne       qplugholene      num_hits number of hits in the summary block   lt   max_hits     156 Mascot  Installation and Setup    hl accession string   total protein score   obsolete   intact protein mass  hl_text title text  hl _frame frame_number  between 1 and 6  for nucleic acid only     hl_ql missed cleavages    1 indicates no match   peptide Mr   delta   start   end     number of ions matched   peptide string   peaks used from Ionsi   variable modifications string   ions score   multiplicity   ion series found   peaks used from Ions2   peaks used from Ions3   total area of matched peaks  h1l_ql_et_mods modification mass   neutral loss mass   modification description  hl ql_et_ mods master neutral loss mass   neutral loss mass         h1l_ql_et_ mods slave neutral loss mass   neutral loss mass         hl_ql_ primary _nl neutral loss string  hl_ql_na_diff original NA sequence   modified NA sequence  hl_ql_tag tagNum startPos endPos seriesID        h1l_ql_ drange startPos endPos  hl_ql_terms residue  
271. t, follow these steps:

A. On the Edit menu, point to New, and then click DWORD Value.

B. Type LocalAccountTokenFilterPolicy, and then press ENTER.

4. Right-click LocalAccountTokenFilterPolicy, and then click Modify.

5. In the Value data box, type 1, and then click OK.

6. Exit Registry Editor.

Repeat this entire procedure on every search node.

Vista, Server 2008, and Windows 7

On each search node, from the Control Panel, Administrative Tools, open the Services dialog and select Remote Registry. Unless already set to Automatic, right-click and choose Properties. On the General tab, set Startup type to Automatic and also start the service. Choose OK.

Starting the Mascot service for the first time

On the master node, from the Control Panel, Administrative Tools, open the Services dialog and select Matrix Science Mascot Service. Right-click and choose Properties. Go to the Log On tab and choose This account. Enter the user name and password for a domain account with local Administrator rights on each search node (not the local administrator account on the master). You could use a domain administrator account, but this might be considered risky.

If the nodes do not belong to a domain, all nodes (including the master) must have a user defined with administrator rights and the same user name and password. The service must be set to log in as this user.

[Screenshot: Matrix Science Mascot Service Properties dialog.]
272. t at 0, but see your computer documentation. The ProcessorSet line specifies the complete set of logical processors (cores) to be used. Separate processor values with a comma. The number of values in this list must be less than or equal to four times the number of physical CPUs licensed, or the system will not run.

Following this, the processors to be used for each database are specified. These numbers must be a subset of the numbers in the ProcessorSet, and there must be the same number of values as the number of threads specified earlier in the database section. For example, if you had a 1-CPU licence and the physical processor had 6 cores, and you wanted to avoid using cores 0 and 1, you could specify this as follows:

PROCESSORS
ProcessorSet 2,3,4,5
SwissProt 2,3,4,5
end

The PROCESSORS section must come after the Databases section in mascot.dat, and ProcessorSet must come before the other entries in this section.

Taxonomy

Do not modify this section if you ever use Database Manager. The syntax of the taxonomy blocks is fully described in Chapter 9.

Cluster

The syntax of the cluster block is fully described in Chapter 11.

UniGene

Do not modify this section if you ever use Database Manager.

UniGene is an index created by automatically partitioning GenBank sequences into a non-redundant set of gene-oriented clusters (http://www.ncbi.nlm.nih.gov/UniGene/). Each UniGene cluster is a list of the GenBa
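The three rules for the PROCESSORS section (at most four cores per licensed CPU, each database list a subset of ProcessorSet, and one core per search thread) can be checked before editing mascot.dat. The Python sketch below is an illustration only: the function name check_processors and its arguments are hypothetical, and it works on values you supply rather than reading mascot.dat itself.

```python
def check_processors(processor_set, per_database, licensed_cpus, threads):
    """Return a list of problems found; an empty list means the rules hold."""
    problems = []
    # Rule 1: no more than four logical processors per licensed physical CPU.
    if len(processor_set) > 4 * licensed_cpus:
        problems.append("ProcessorSet lists more than 4 x licensed CPUs")
    for db, cores in per_database.items():
        # Rule 2: each database may only use processors from ProcessorSet.
        if not set(cores) <= set(processor_set):
            problems.append(f"{db}: cores not in ProcessorSet")
        # Rule 3: one processor value per thread defined for the database.
        if len(cores) != threads.get(db):
            problems.append(f"{db}: core count differs from thread count")
    return problems

# The example from the text: 1-CPU licence, 6-core processor, cores 0 and 1 avoided.
ok = check_processors([2, 3, 4, 5], {"SwissProt": [2, 3, 4, 5]}, 1, {"SwissProt": 4})

# A deliberately invalid variant: five cores listed with a 1-CPU licence.
bad = check_processors([1, 2, 3, 4, 5], {"SwissProt": [0, 2, 3, 4]}, 1, {"SwissProt": 4})
```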
273. t file  so cannot be recovered by changing  this parameter in a repeat search     ProteinFamilySwitch see NoResultsScript    Chapter 6  Configuration  amp  Log Files 97    ProteinsInResultsFile 2    Determines the number of protein title lines saved to each results file     1 As in Mascot 1 7 and earlier  only proteins that appear in the  Summary section will appear in the Proteins section    2 Include proteins with at least one top ranking peptide match toa  peptide of length greater than MinPepLengthInPepSummary    3 Include all proteins    proxy _password  proxy server  proxy_username    These entries support a proxy server between the Mascot server and the  outside world  A typical entry might be    proxy server http   our cache 3128    If there is no proxy_server entry  scripts will look for proxy informa   tion in the server environment  The proxy _username and   proxy password parameters are only required if the proxy server  requires authentication  Remote host authentication should be included  directly in the URLs specified in mascot dat  e g  http      username   password hostname     RemoveOldiIndexFiles 1    After a successful database swap  the compressed files in the current  directory are deleted unless this parameter is present and set to 0    ReportBuilderColumnArrangement    Set the column arrangement at the given index  Column arrangements  are used by Report Builder  introduced in Mascot 2 4  to provide a  default list of columns to show  These can be sel
274. tabase     Parameters   db     database name that was requested    462    One or more errors happened while loading taxonomy  nodes       Parameters     messages     more detailed error information    460    Failed to register job  Please inspect mascot error log      270    A POST request is submitted with zero content length     55    Cannot find boundary string       56    First line was not a boundary       120 Mascot  Installation and Setup    259    Corrupted input   possibly a binary file is submitted     72    Corrupted input or incompatible browser     458    Invalid accession format for ms getseq exe     459    Too large POST request     54    Standard input stream error     Parameters     bytesread     number of bytes already read  lengthofdata     total size of input data in the stream  Non fatal Errors   461    Sequence not found     Parameters   accession     accession string    frame     frame number  0 if not supplied in the input or  missing if AA database     Warnings that are only reported in the end of the XML document     400    Missing or invalid gencode id  Table 1 is used for transla   tion       Parameters   accession     accession string    frame     frame number  0 if not supplied in the input or  missing if AA database     470    Cannot find taxonomy id     Parameters   accession     accession string    frame     frame number  0 if not supplied in the input or  missing if AA database     104    Sequence is too long for translation     Parameters   
275. taxonomy usernodes dmp Not required by most users  Note that  names dmp  is not required on the Mascot Nodes     Start up of ms monitor exe    The following sequence occurs  for each node  when ms monitor exe  starts for the first time on the master system   items marked   are for  Windows clusters only      See if the computer is available by opening a socket to the ping port   port 7     If there is an entry    StopMascotNodeCmd    in the mascot   dat file   then run that command to stop the Mascot node daemon   or     See if there is a MascotNodeService installed on the computer   if  there is  then stop that service    If there is no ms mascotnode  exe or if it is out of date on the  Mascot node  then copy update the file from the cluster  lt OS gt  direc   tory on the Mascot Master system to the specified directory on the  Mascot Node       Tf the service is not installed  then install the service  and adda  registry entry for the directory to be changed to at start up    Chapter 11  Cluster Mode 209    e Makea logs and config directory and copy mascot dat and  mascot license    e   Start the MascotNodeService on the Mascot Node computer   With  a Linux based system  the ms mascotnode  exe daemon will be  started      e Check that the service   daemon now communicates through TCP IP  sockets     if it fails  then a message indicating which Mascot node it is  waiting for is displayed in the ms status screen     e Initialise the MascotNodeService   daemon by sending the ap
276. ter is also a search node  and will execute Mascot  searches in addition to running Mascot Monitor and the web server  it  must be added as a search node using this dialog     Use the Add  Edit  and Delete buttons to specify the complete cluster        Cluster Setup x  Node Address Port Processors UNC Node Path Node Directory         192  168  10 7  plat pus 5001 2   platypus cS mascotnode c   mascotnode          add   edit   Delete               Press OK to return to the installation wizard  and file installation will  begin  Copying the files and configuring the system may take some time     Once complete  you will be presented with a message requesting that you  configure and start the Mascot Monitor service  This has to be done  manually  The Monitor service on the master needs to be run under an  account that has local Administrator rights on each node because it  needs to write to the registry  install  start and stop services on each  node   If you later change the password for this account  remember to  change it in the Logon tab of the Matrix Science Mascot Service proper   ties  also      Very large clusters    Defining a very large cluster using the Add node    dialog can be tedious   It is usually faster to define a small cluster  let the installation program  run to completion  then edit the configuration files using a text editor     From the Program menu  stop the Mascot service  and edit the cluster  and sub cluster configuration details into mascot  dat and  
277. the Windows start menu   choose Programs  Apache HttpServer 2 2  Control Apache Server  Re   start    You should now be able to view Mascot pages in a web browser and  proceed with licence registration     If Windows Firewall is enabled  you will probably have to open up port  80 as described in Chapter 3  before the Mascot Server can be accessed  from other computers     Keyword Indexing    The keyword index required for site search will not have been built  during Mascot installation because the web server mappings were not in  place     To build the keyword index  open a command window and enter the  following commands  If Mascot was installed into a different path  you  may have to modify the first two lines    Appendix D  Web Server Configuration 237    Gs   cd  inetpub mascot htdig  bin htdig exe  v  bin htmerge exe  v    Once the commands have completed  keyword search using the control at  the top right of the web pages should be operational    Using shebang under Windows  The configuration file created by the installer includes this directive   ScriptInterpreterSource Registry    This enables Windows style registry file associations  Assuming Perl has  been installed correctly  the extension p1 will be associated with  perl  exe     Without this directive  Apache uses the shebang line at the top of each  Perl script to associate the script with the Perl interpreter  The default  shebang line is        usr local bin perl    If you want to use this on a Windows system 
278. the contributors or copyright holders not be used in advertising or publicity pertaining to distribution of the software without specific prior permission.

THE CONTRIBUTORS AND COPYRIGHT HOLDERS OF THIS SOFTWARE DISCLAIM ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE CONTRIBUTORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

Contents

End User Licence Agreements
Introduction
Installation: Linux
Installation: Microsoft Windows
Validation
Sequence Database Setup
Configuration & Log Files
Program Reference
I/O File Formats
Taxonomy
Mascot Dae
279. the default value is the most appropriate     Web Site  femmes oOo    Use SSL TLS to access this web site    Below you can modify the name of the Mascot virtual directory in Apache  However  we  recommend that you accept the default name  This value is added to the host name  given above to form the full Mascot URL  eg  you might type into your browser    http    EC VM64 mascot    Virtual Directory   mascot             The virtual directory name can be changed  but remember that users are  more likely to guess the correct URL if you stick with mascot  Also  some  third party software may incorrectly assume the directory name is  always mascot           H va Nn TH UR ST TR T bol o Ea J   Cluster Configuration MATRIX  Choose whether to use Mascot cluster mode  SCIENCE        Please read the cluster mode chapter of the installation manual before using this  feature  For a standard installation of Mascot on a single computer  this feature should  not be enabled  Otherwise  if you wish to enable cluster mode then please select the  option below and then click the Configure button to specify the nodes that will be in the  duster     Enable Mascot cluster mode      Configure j At least one node must be defined in the duster              Chapter 3  Installation  Microsoft Windows 35    If you have a multi CPU licence  you can configure Mascot for execution  on a networked cluster  If you intend to do this  refer to Chapter 11 for  further details before proceeding  If you are install
280. the new database will display the follow   ing messages     Creating compressed files  Running 1st test   First test just run OK  Trying to memory map files  Just enabled memory mapping  In Use    Troubleshooting    Proxy Server    Several databases do not have a local reference file  The default configu   ration is to retrieve full annotation text as required from the remote web  sites  If there is a proxy server between your Mascot server and the  Internet  this may fail unless you define your proxy server in the Options  section of mascot   dat  The relevant parameters are proxy server   proxy username  and proxy password  Unless your proxy server  uses authentication  you only need the first of these  and a typical entry  in mascot  dat will look like this    proxy server http   our cache 3128    64 Mascot  Installation and Setup    Permissions   Security    Mascot monitor will need to create the compressed database files in the  database current directory  and may need to move old database files to  the old directory  Mascot searches  running as CGI processes with very  restricted privileges  need to read the files  Make sure Linux permissions  or Windows security settings don   t prevent this     Files not where they are supposed to be    When you enable the database  if nothing happens  double check that the  sequence database files are exactly where the Path definition specifies   Note that the taxonomy files are shared  and go into the Mascot tax   onomy directory  n
281. the username "mickey", the command would be:

    htpasswd -c /usr/local/mascot/config/.passwd mickey

The -c argument tells htpasswd to create a new users file. You will be prompted to enter a password for mickey, and confirm it by entering it again. Other users can be added to the existing file in the same way, except that the -c argument is not needed. The same command can also be used to modify the password of an existing user.

Specifying the password-protected resources

Having created a password file, the next step is to modify the configuration mapping for the x-cgi directory. Instead of the mapping shown earlier, you would use a directive like this:

    <Directory /usr/local/mascot/x-cgi>
    AllowOverride None
    Options None
    AuthType Basic
    AuthName Restricted
    AuthUserFile /usr/local/mascot/config/.passwd
    require valid-user
    </Directory>
    ScriptAlias /mascot/x-cgi /usr/local/mascot/x-cgi

You will need to stop and restart Apache (or send a kill -HUP to the parent process) to activate the new configuration. For further information on restricting access to the server, see the "Authentication and Access Restrictions" section of the Apache FAQ documentation.
282. this service must be running at all times     Once the new licence file is in place  follow the hyperlink to Database  Status  You should see a display similar to the following                 me a AAM  6  J    http   ec vm64 mascot x cgi ms status exe Shc O      X    G Licence information E Ls tos                      Mascot Server Licence Information    Register anew productkey View database status Reload this page       Please include ail the contents of this page when requested to provide this information to technical support     Mascot Server version  2 3 241  Licence path  c  inetpub mascot config licdb                                                                         Licence s  found    Product Key Start End Status   Active  E  SULW F7M9 TYGH 3GJ3 R3VJ 2012 04 20   OK    Feature  Mascot Server   Core functionality  v2 4   Feature  Mascot Server   CPU units  2    Company  Edman University   User  L  Scene   Distributor  Matrix Science Ltd    Inactive                                                  Node info     M 000c296cf4d5  V 943cb7e2  B EC VM64       End of page X       Follow the link at the top centre to view the status of the SwissProt  sequence database    Chapter 3  Installation  Microsoft Windows 39          aioe x      PHR x       GC       2 je ttp   ec vm64 mascot x cgi ms status exe Shc O      X      Mascot search status page  Eile Edit View Favorites Tools Help    han te see                    MASCOT search status page    Version  2 3 241   Edman Un
283. tides           Length of a sequence  number of residues  user definable  MaxSequenceLen   Number of seq   comp    and ions   type qualifiers per query 20  Maximum number of tags and etags in a search 100  Number of peptide masses  MS MS search  unlimited  Number of peptide masses  PMF search  1000  Number of enzymes in the enzymes file 100  Number of protein hits saved in the results file summary section  PMF  50  Number of peaks per MS MS spectrum 10 000  Number of lines with name  in MIME format file 1 000 000  Maximum mass of any peptide in standard Mascot  Daltons  16 000  Minimum mass of any peptide  Daltons  100  Maximum mass of an unmodified amino acid residue 300  Length of any peptide in residues in standard Mascot 254  Length of name  TITLE   for any query when    escaped    30 000  Length of database name 19  Length of enzyme name 50  Length of modification name 50  Simultaneous variable modifications 9  Number of missed cleavage sites in a peptide 9  Maximum number of cleavage rules per enzyme 20    Number of active sequence databases user definable  MaxDatabases     232 Mascot  Installation and Setup       Number of threads per search 1024  Number of concurrent jobs per database 100  Number of parse rules 256  Length of parse rule 128  Maximum length of an accession string 200  Maximum number of processors per server 64  Maximum number of sub clusters in a cluster 50  Maximum number of machines in a sub cluster 1024  Maximum number of processors in a sub cluster
284. tion  Microsoft Windows 45           Advanced Multiple Web Site Configuration x     m Multiple identities for this Web Site       IP Address TCP Port   Host Header Name      Y  All Unassigned         Add      Remove IE    aipe ESEE OT S ENSE    IP Address SSL Port                   Remove          Eda   Edif    OR   Cancel   Help         To memory lock databases totalling more than 2 GB    For 32 bit editions of Windows  there is a 2 GB limit on the address space  for any single process  Mascot Monitor   ms monitor exe   can easily  reach this limit by trying to lock several large databases into memory  To  work around the 2GB limit  a separate ms lockmem exe program is  provided     this is fork exec   d from ms monitor exe when the flag     SeparateLockMem 1    is added to the options section of mascot dat   Further details can be found in Chapter 7     Hyper threading    Intel only  Hyper threading is a technique used by Intel to improve the  performance of multi threaded programs  Hyper threading does not  double performance because pairs of cores share other resources  such as  the on chip cache  On some systems  a BIOS setting can be used to  enable and disable hyper threading     Hyper threading is detected automatically  Each CPU in the Mascot  licence enables up to 4 cores to be used for searches  Hyper threading is  ignored when counting cores  so that you may see a 1 CPU licence using    46 Mascot  Installation and Setup    8 threads on a system with a quad core pr
285. tire BRE  after an initial    if any   as the first character of a subexpression  after an  initial    if any      The circumflex is special when used as an anchor  or as the  first character of a bracket expression       The dollar sign is special when used as an anchor   Matching Single Characters    Any character that is not a special character is an ordinary character  An  ordinary character  or a special character preceded by a backslash   matches to itself     A period  used outside a bracket expression  matches to any single  character  including a newline character     A bracket expression  a list of characters enclosed in square brackets        matches any single character from the enclosed list  The following  rules and definitions apply to bracket expressions     A bracket expression is either a matching list expression or a non   matching list expression  The right bracket   loses its special meaning  and represents itself in a bracket expression if it occurs first in the list   after an initial circumflex    if any   Otherwise  it terminates the  bracket expression  The special characters           period  asterisk   left bracket and backslash  respectively  lose their special meaning  within a bracket expression     A matching list expression matches any one of the characters in the list   The first character in the list must not be the circumflex  For example    abc  matches any one of the characters a  b or c     A non matching list expression begins with a ci
286. to infringe any patents or other property right claims or to contest validity of any such claims; this section has the sole purpose of protecting the integrity of the free software distribution system which is implemented by public license practices. Many people have made generous contributions to the wide range of software distributed through that system in reliance on consistent application of that system; it is up to the author/donor to decide if he or she is willing to distribute software through any other system and a licensee cannot impose that choice.

This section is intended to make thoroughly clear what is believed to be a consequence of the rest of this License.

12. If the distribution and/or use of the Library is restricted in certain countries either by patents or by copyrighted interfaces, the original copyright holder who places the Library under this License may add an explicit geographical distribution limitation excluding those countries, so that distribution is permitted only in or among countries not thus excluded. In such case, this License incorporates the limitation as if written in the body of this License.

13. The Free Software Foundation may publish revised and/or new versions of the Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Libr
287. to run searches as the "customer" in a service or core lab environment.

Some third party applications require helper scripts to be installed on the Mascot web server. If Mascot security is enabled, you should be aware that such scripts may create security holes.

Enabling security

When Mascot is first installed, the security system is disabled. To enable security, open a command prompt or shell on the Mascot server and change to the mascot bin directory. Enter the command:

perl enable_security.pl

The Mascot service (ms-monitor.exe) must then be stopped and restarted.

Disabling security

To disable security, open a command prompt or shell on the Mascot server and change to the mascot bin directory. Enter the command:

perl disable_security.pl

The Mascot service (ms-monitor.exe) must then be stopped and restarted.

Authentication

There are two different ways in which users can be authenticated:

1. Mascot authentication. The passwords are stored and maintained by the Mascot security libraries and/or by Mascot Integra.

2. Web server authentication. Available with any web server that supports authentication. Refer to your web server documentation for details on how to set up authentication.

The type of authentication is set up at the user level, and not as a global setting. Even if the server has web authentication switched on, it may be useful to set some users to be authenticated using
288. to use this option.

IIS user names generally include the Domain name, e.g. matrix_science\charles. The comparison will be with everything after the last forward or back slash. So, in this case, you would enter "charles" as the user name.

Groups

Access rights are assigned to groups, not users. Therefore, a user has no effective rights unless they belong to one or more groups. If a user belongs to more than one group, then their rights are the combination of the rights in all of those groups.

There are 5 special built-in groups:

Guests
By default, the guest user is the only member of this group and the guest group can only submit PMF searches against any database. This can easily be changed using the security administration utility.

Administrators
The admin user always belongs to this group. Members of the group can perform any administration task, but cannot submit searches.

PowerUsers
Members of this group can submit all types of searches and perform some administration. They cannot access the security administration utility.

Daemons
The daemon user belongs to this group by default.

MascotIntegraSystem
The (system) user is the only member of this group.

Using the security administration utility

When the security administration utility is started for the first time, you will need to log in as admin/admin. You are then forced to change the password.

The main page lists the current users and gro
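The two rules stated in this entry (IIS user names are compared using everything after the last forward or back slash, and a user's rights are the combination of the rights of all groups they belong to) can be sketched as follows. This is an illustration under stated assumptions, not Mascot's implementation: the function names and the right names such as "submit_pmf" are hypothetical.

```python
def effective_user_name(login: str) -> str:
    """Return the part of an IIS login after the last forward or back slash."""
    cut = max(login.rfind("/"), login.rfind("\\"))
    return login[cut + 1:]  # rfind returns -1 when no slash is present


def effective_rights(user_groups, group_rights):
    """Combine (union) the rights of every group the user belongs to."""
    rights = set()
    for group in user_groups:
        rights |= group_rights.get(group, set())
    return rights


# Hypothetical right names, used only to show the union behaviour.
group_rights = {
    "Guests": {"submit_pmf"},
    "PowerUsers": {"submit_pmf", "submit_msms", "some_admin"},
}

print(effective_user_name("matrix_science\\charles"))
print(sorted(effective_rights(["Guests", "PowerUsers"], group_rights)))
print(sorted(effective_rights([], group_rights)))  # no groups, no rights
```

A user with no group membership ends up with an empty set, which mirrors the statement that a user has no effective rights unless they belong to at least one group.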
289. try to run searches or view results reports until the relevant sequence database is "In Use".

Usually, you'll want to add ms-monitor.exe to the system boot process, so that it is started automatically. An example Linux init script called mascot can be found in the Mascot bin directory. For RHEL/CentOS, move this to the /etc/init.d directory with permissions rwxr-xr-x and owner root:root. As root, type:

chkconfig --add mascot

Security

Mascot security is disabled on installation. To enable Mascot security, refer to Chapter 12.

Keyword Indexing

Users of Mascot may wish to be able to search the help text by keywords or phrases. The web pages are designed to work with an indexing tool called ht://Dig. This is standard in several Linux distributions. If not installed, we recommend stable release (3.1.6):

Red Hat/CentOS Linux: yum install htdig
Debian/Ubuntu Linux: aptitude install htdig
SUSE Linux: yast -i htdig

A few binary packages are also available at http://www.htdig.org/files/binaries/

Alternatively, if you have a working development system with a C++ compiler, you can download the source code from http://www.htdig.org/

Once installed, you'll need to edit the following values in the ht://Dig configuration file (htdig.conf):

start_url: http://your_host/mascot/home.html
limit_urls_to: /mascot/
exclude_urls: .pl .exe .gif .jpg .pdf .msi .png

It is also necessary to add an alia
290. tting. For example:

http://your_server/mascot/x-cgi/ms-status.exe?Show=RESULTFILE&DateDir=20031231&ResJob=F006983.dat

For security reasons, certain characters are not allowed in the DateDir or ResJob.

The argument MS_USERS returns a list of users that can be spoofed by the user whose session ID was supplied. This may be an empty list.

The output has one line per user with the fields username, user id, user type, full name and email address. E.g.

username  user id  user type  full name                      email address
guest     1        1          Guest user                     guest@localhost
admin     2        1          Administrator                  admin@localhost
daemon    4        1          Mascot Daemon                  daemon@localhost
system    6        2          Mascot Integra system account  integra@localhost

MS_STATUSXML returns an XML formatted document equivalent to the main status page. The schema is html/xmlns/schema/msstatus_1/msstatus_1.xsd

Review

Mascot Review (x-cgi/ms-review.exe) provides similar functionality to Status, but takes its input from searches.log. The tabular display can be filtered and sorted to locate specific searches by title, user name, or any one of the following log fields:

1. Mascot job number
2. Process ID
3. Sequence Database
4. User name
5. User email address
6. Search title
7. Results file path
8. Start time and date
9. Duration in seconds
10. Completion Status
11. Job priority
12. Type of search (PMF, SQ, or MIS)
13. Enzyme  Eit
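The status request given as an example in this entry carries three query parameters: Show, DateDir and ResJob. A minimal sketch of assembling such a URL is shown below. The helper name result_file_url is hypothetical, and the path /mascot/x-cgi/ms-status.exe is taken from the example URL rather than from any other knowledge of a particular installation. No attempt is made to reproduce the server's own check for disallowed characters.

```python
from urllib.parse import urlencode


def result_file_url(server: str, date_dir: str, res_job: str) -> str:
    """Build a status URL that asks for a specific result file."""
    query = urlencode({
        "Show": "RESULTFILE",
        "DateDir": date_dir,  # date directory, e.g. 20031231
        "ResJob": res_job,    # result file name, e.g. F006983.dat
    })
    return f"http://{server}/mascot/x-cgi/ms-status.exe?{query}"


print(result_file_url("your_server", "20031231", "F006983.dat"))
```

Using urlencode keeps the separators and any percent-escaping consistent, rather than concatenating the parameters by hand.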
291. tware in a product, an acknowledgment in the product documentation would be appreciated but is not required.

3. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software.

4. The name of the author may not be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Julian Seward, Cambridge, UK.
jseward@acm.org
bzip2/libbzip2 version 1.0.3 of 15 February 2005

SWIG

Simplified Wrapper and Interface Generator (SWIG)

SWIG is distributed under the following terms:

This software includes contributions that are Copyright (c) 1998-2002 University of Chicago. All rights reserved.

Redistribution and use in
292. [Screenshot: Mascot database status page. For each database (for example IPI_bovine, file IPI_bovine_3.73.fasta) it lists name, family, filename, pathname, status, state time, statistics, number of threads, and the flags Mem mapped, Request to mem map, Request unmap, Mem locked and Current.]

By clicking on a database hypertext link, a page is displayed showing the activity on that particular database.

[Screenshot: Mascot search status page, "Mascot database status - SwissProt", with a Current jobs table (Job, PID, Start time, Status, UserID, Title) and a Completed jobs table (PID, Start time, Status, User, Title).]
293. [Figure residue: histogram with axes "umber of Hits" and "Probability Based Mowse Score".]

If the installation cannot proceed, a message box will be displayed. Typical problems include:

You do not have Administrator privileges. Log out and log in as a user with local Administrator rights.

Perl is not installed. Install Perl as described above.

Unsupported Windows platform. Refer to the system requirements at the beginning of this Chapter.

Any problem(s) must be fixed before the installer will proceed. Pressing Next displays the Mascot End User Licence Agreement.

[Screenshot: Mascot Server Setup, End User Licence Agreement page, showing the opening paragraphs of the Mascot Protein Identification System End User Licence Agreement.]
294. ups and has buttons for deleting, adding and editing users and groups. The global security options can also be modified from this page. On all the pages, there is a help window that gives details about specific options; just position the mouse over the relevant hyperlink to see the help.

[Screenshot: Mascot Security Administration Utility main page, logged in as Administrator, showing the Groups list (including PowerUsers, Daemons and MascotIntegraSystem), the Options table (Security enabled, Verify IP address, Session timeout, Logging level, Default password expiry, Use session cookies, and the Mascot Integra server, database and Oracle server settings) and the Help window.]

To add a new user, click on the Add button.

[Screenshot: Mascot Security Administration Utility, add user page.]
295. us screen and the database will not be available for searching. Error messages from Monitor are logged to errorlog.txt in the mascot/logs directory. Both this file and monitor.log can be viewed using links on the Mascot Status page.

The input file which defines a test search can be found in the mascot/data/test directory. The filename is constructed from the name of the database together with the extension .asc. For example, SwissProt.asc.

Note: Test files for new databases are generated by modifying do_not_delete.asc. Never delete this file.

The output of the test search may change slightly with each new update to a database. Sequences may be corrected or descriptions modified. Quite often, a new entry appears which is very homologous with one of the matched proteins so that it appears on the hit list.

Using SwissProt 2012_03, the report from running the standard test search is shown on the following pages.

[Figure: Mascot Search Results report for the test search. User: Monitor Test DB 0. Search title: MS/MS Test Search. MS data file: test_search.mgf. Database: SwissProt 2012_03 (535248 sequences; 189901164 residues). Timestamp: 14 Apr 2012 at 18:46:03. The protein hits listed begin with CH60_HUMAN and CH60_DROME, both 60 kDa heat shock protein, mitochondrial.]
296. used at the top of a search progress report. You can customise this by substituting the URL of your own logo. For optimum appearance, the image should be 88 pixels wide and 31 pixels high.

MailTempFile see EmailErrorsEnabled
MailTransport see EmailErrorsEnabled
MascotCmdLine see ErrorLogFile
MascotControlFile see ErrorLogFile
MascotJobIdFile see ErrorLogFile

MascotMessage

A text string to be displayed ahead of the progress reports when a search is run.

MassDecimalPlaces 2

Mascot calculates all masses to an accuracy of 1/65535 Daltons. The number of decimal places used to display peptide mass values in reports can be altered by changing this value.

MaxAccessionLen

Obsolete.

MaxConcurrentSearches 10

This parameter limits the maximum number of concurrent searches so as to avoid overloading the Mascot server. Default is 10.

MaxDatabases 64

The maximum number of concurrently active sequence databases. Increasing this value uses more RAM, so don't set it unnecessarily high. There is no upper limit to this value. You need to restart the Mascot service after changing this value.

MaxDescriptionLen 100

Description text parsed from the FASTA title line will be truncated at this number of characters. (Note: There is no need to recompress a database if this parameter is changed.)

MaxEtagMassDelta 1770
MinEtagMassDelta -130

In an error tolerant tag search with a fully specific enzy
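Two of the options in this entry only affect how values are presented: MassDecimalPlaces sets the number of decimal places shown for peptide masses, and MaxDescriptionLen sets the length at which FASTA description text is cut off. The sketch below illustrates that behaviour in Python. The function names are hypothetical, and the rounding shown is ordinary Python formatting, which is an assumption rather than a statement about how Mascot rounds.

```python
def format_mass(mass: float, mass_decimal_places: int = 2) -> str:
    """Format a mass for display with a fixed number of decimal places."""
    return f"{mass:.{mass_decimal_places}f}"


def truncate_description(text: str, max_description_len: int = 100) -> str:
    """Keep at most max_description_len characters of a description."""
    return text[:max_description_len]


# Internal precision is 1/65535 Da, far finer than two displayed decimals.
internal_step = 1 / 65535

print(format_mass(1234.56789))                 # two decimal places
print(format_mass(1234.56789, 4))              # four decimal places
print(len(truncate_description("x" * 150)))    # cut to the default length
print(internal_step < 10 ** -2)                # display is the coarser of the two
```

Because both settings act at display or parse time, changing them does not alter the stored mass values, which is consistent with the note that no recompression is needed after changing MaxDescriptionLen.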
297. which an arbitrary group of files can be defined for searching, either immediately or at some pre-set time.

2. Real-time monitor mode, in which new files on a pre-defined path are searched as they are created.

3. Data dependent follow-up tasks. For example, automatically repeating an unsuccessful search at a later date or against a different sequence database.

Tasks

The functional unit of Mascot Daemon is a task. A task can be created or modified in the Task Editor. A task is defined by:

1. The data source, e.g. a file list or a file path.

2. How the data are to be searched: an associated set of search parameters.

3. When the searches are to take place.

4. Any follow-up activities, such as conditional repeat searches.

Tasks can be in one of four states: running, paused, completed, or cancelled. A paused task can be resumed. A paused or completed task can be cancelled or deleted.

Data Files

Data files can be any of the peak list formats supported by Mascot. Other types of file, such as binary data, can be specified if an appropriate data import filter is available.

Flexibility

A wide range of native file formats can be processed using the Mascot Distiller library (requires an additional licence).

If AB SCIEX Analyst is installed on the same system as Mascot Daemon, Analyst WIFF data files can be processed using the mascot.dll script.

If AB SCIEX Data Explorer is installed o
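The task states named in this entry (running, paused, completed, cancelled) can be pictured as a small state machine. The sketch below is an illustration only. The transitions for resume and cancel come from the text; the pause and finish transitions out of the running state are assumptions, since the text implies them but does not state them. All names are hypothetical and this is not Mascot Daemon's code.

```python
# (current state, action) -> next state
TRANSITIONS = {
    ("paused", "resume"): "running",       # stated: a paused task can be resumed
    ("paused", "cancel"): "cancelled",     # stated: a paused task can be cancelled
    ("completed", "cancel"): "cancelled",  # stated: a completed task can be cancelled
    ("running", "pause"): "paused",        # assumption: implied, not stated
    ("running", "finish"): "completed",    # assumption: implied, not stated
}


def next_state(state: str, action: str) -> str:
    """Return the state after applying action, or raise if it is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a {state} task") from None


def can_delete(state: str) -> bool:
    """Stated in the text: a paused or completed task can be deleted."""
    return state in {"paused", "completed"}


print(next_state("paused", "resume"))
print(next_state("completed", "cancel"))
print(can_delete("running"))
```

Keeping the allowed transitions in a single table makes it easy to see which moves are taken from the manual and which are assumed.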
298. xample: 192.168.114.201,192.168.114.201/255.255.255.0

Now press the OK buttons in this and in the previous window in order to return to the Windows Firewall dialog, which should now look like this:

[Screenshot: Windows Firewall dialog, Exceptions tab, with MascotNodePort5001 TCP listed under Programs and Services alongside File and Printer Sharing, Remote Assistance, Remote Desktop and UPnP Framework.]

Press OK. Repeat this entire procedure on every search node.

Vista and Server 2008

On each search node, log in as a user with local administrator rights. Go to Control Panel, Network Status and ensure the network connection to the master node is described as Private. If it shows as Public, choose customise to change it. Under Sharing and Discovery, Enable File Sharing.

[Screenshot: Control Panel, Network and Sharing Center.]
299. [Screenshot: archive listing of the taxonomy files with columns Name, Size, Packed Size, Modified, Mode, User, Group and Link. The files listed include gencode.dmp, merged.dmp, names.dmp, nodes.dmp, owl.dmp and speclist.txt.]

3. Edit mascot.dat

The first time you use Database Manager, database related configuration information is moved to an XML file, and the database related sections of mascot.dat are re-written whenever changes are saved. So, either use Database Manager all the time or edit mascot.dat by hand all the time. You cannot swap between the two.

The format of mascot.dat is described in Chapter 6. For NCBInr, configuration information is already present in mascot.dat, but commented out to make the database "inactive". Once all the files are in place, all you need to do is check the path is correct, then remove the comment character (#) and any leading space at the start of the NCBInr line in the Databases s
300. y FormVersion, a warning will be included in the results file and in the master results report.

GetSeqJobIdFile see ErrorLogFile

ICATQuantitationMethod ICAT

For backward compatibility, if a search is submitted from an old client with ICAT=ON, then the specified quantitation method will be used.

IgnoreDupeAccessions EST others

A comma separated list of database names. For any database in this list, don't check for duplicate accession numbers when creating the compressed files. A database should only be added to this list if it has a very large number of sequences, which may cause the system to run out of memory when creating the compressed files.

IgnoreIonsScoreBelow 0.0

When a report is generated, any ions score lower than this value will be set to zero and ignored. The parameter is a floating point number, default 0.0. Values greater than 0 and less than 1 act as an expect value threshold, and the scores for any peptide matches with higher expect values are set to 0. This global default can be over-ridden on an individual report URL by appending &_ignoreionsscorebelow=X, where X is the cut-off value.

IntensitySigFigs 2

The precision of intensity values written to the result file.

InterFileBasePath see ErrorLogFile
InterFileRelPath see ErrorLogFile

IonsDecimalPlaces 2

Mascot calculates all masses to an accuracy of 1/65535 Daltons. The number of decimal places used to
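The IgnoreIonsScoreBelow option described in this entry has two modes, depending on the size of the value. A minimal sketch of that rule, as read from the description, is given below. The function name and its arguments are hypothetical, and this is an interpretation of the text rather than Mascot's own code.

```python
def filtered_ions_score(score: float, expect: float,
                        ignore_below: float = 0.0) -> float:
    """Apply the IgnoreIonsScoreBelow rule to one peptide match.

    A value between 0 and 1 (exclusive) is treated as an expect value
    threshold: matches with a higher expect value get a score of 0.
    Any other value is treated as a score threshold: scores lower than
    it are set to 0.
    """
    if 0.0 < ignore_below < 1.0:
        return 0.0 if expect > ignore_below else score
    return 0.0 if score < ignore_below else score


print(filtered_ions_score(25.0, 0.20, ignore_below=0.05))  # expect too high
print(filtered_ions_score(25.0, 0.01, ignore_below=0.05))  # kept
print(filtered_ions_score(12.0, 0.50, ignore_below=15.0))  # score too low
print(filtered_ions_score(12.0, 0.50))                     # default keeps all
```

With the default of 0.0 no non-negative score is lower than the threshold, so nothing is discarded.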
301. y purpose on any computer system, and to alter it and redistribute it, subject to the following restrictions:

1. The author is not responsible for the consequences of use of this software, no matter how awful, even if they arise from flaws in it.

2. The origin of this software must not be misrepresented, either by explicit claim or by omission. Since few users ever read sources, credits must appear in the documentation.

3. Altered versions must be plainly marked as such, and must not be misrepresented as being the original software. Since few users ever read sources, credits must appear in the documentation.

4. This notice may not be removed or altered.

Copyright (c) 1994 The Regents of the University of California. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

3. All advertising materials mentioning features or use of this software must display the following acknowledgement: This product includes software develop
    