Otterlace • Zmap • Blixem • Dotter user manual

Contents

1. [Screenshot: Genomic features window listing PolyA site, PolyA signal and Pseudo PolyA signal features, each with its coordinates, strand (Fwd or Rev) and a Delete button; the File menu includes Reload (Ctrl R) and Save (Ctrl S).] Once the coordinates have been entered, select Save from the main window to see the features in Zmap. [Screenshot: Tools menu with Launch ZMap, Launch In A ZMap, Genomic Features (Ctrl G), Dotter Zmap hit, On The Fly (OTF) Alignment (Ctrl X), Rename locus, Re-authorize (Ctrl Shift A) and Load column data.] Re-authorize allows you to re-establish the connection to the database if your login expires. This can occur if a session has been running for a few days. [Screenshot: Select column data to load dialog, listing each column by name and description: augustus (ab initio gene predictions from augustus); CCDS_Coding (CCDS transcripts mapped from the Ensembl database); CCDS_Gene (CCDS transcripts from the CCDS database); ChIP_PET_ditags (paired end tags); ComparaCons (regions conserved across 10 species, from the Ensembl Compara group); cpg (CpG islands); das_2waycons_Pseudogene (pseudogenes agreed between Yale and UCSC); das_Aspic (ASPIC
2. [Screenshot: clone list in the SequenceSet window. Each row gives the row number, accession and version, clone name, status, date, annotator and a note on the clone's contents. The rows shown, 202 to 219, are marked completed, with dates between 2008-01-30 and 2009-02-11 and notes ranging from "nothing" to lists of genes, pseudogenes and CpG islands. For example, the note for AC009228.4 (RP11-219F1) reads: Contains the 5' end of the ITSN2 gene for intersectin 2, a ribosomal protein L36a (RPL36A) pseudogene, a high mobility group nucleosomal binding domain 2 (HMGN2) pseudogene and a CpG island.]
3. Selecting coordinates: You can select a nucleotide/peptide by middle-clicking on it in the detail view. This selects the entire column at that index, and the coordinate number on the reference sequence is shown in the feedback box. The coordinate on the match sequence is also shown if a match sequence is selected. By default, the display will centre on the selected base when you middle-click. To select a base without scrolling, hold down Ctrl when you middle-click. For protein matches, when a peptide is selected, the three nucleotides for that peptide (for the active reading frame) are highlighted in the header in blue. The active reading frame is whichever alignment list currently has the focus; click in a different list to change the reading frame. Darker blue highlighting indicates the specific nucleotide that is currently selected, i.e. whose coordinate is displayed in the feedback box. [Screenshot: detail view header with the selected codon highlighted.] Figure 6: The 3 nucleotides for the currently selected amino acid in reading frame 3. Selected nucleotide 103596 is shaded in darker blue. You can move the selection to the previous/next index using the left and right arrow keys. In protein mode, you can move the selected nucleotide by a single base, rather than an entire codon, holding Shift whil
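The relationship between a selected peptide and its three highlighted nucleotides can be sketched as below. This is only an illustration: the function name, the 1-based peptide index and the convention that reading frame f starts at `ref_start + (f - 1)` on the forward strand are assumptions, not a description of Blixem's internals.

```python
def codon_coords(peptide_index, frame, ref_start=1):
    """Return the three reference coordinates that code for one peptide.

    Assumptions (hypothetical, for illustration only):
    - peptide_index is 1-based within the chosen reading frame;
    - frame is 1, 2 or 3 and frame f begins at ref_start + (f - 1);
    - coordinates increase along the forward strand.
    """
    if frame not in (1, 2, 3):
        raise ValueError("frame must be 1, 2 or 3")
    if peptide_index < 1:
        raise ValueError("peptide_index must be 1 or greater")
    first = ref_start + (frame - 1) + 3 * (peptide_index - 1)
    return (first, first + 1, first + 2)


# The second peptide in reading frame 3 covers bases 6, 7 and 8.
print(codon_coords(2, 3))
```

Under these assumptions, moving the selection by one index in protein mode shifts all three coordinates by 3, whereas moving by a single base shifts only the darker-blue nucleotide within the codon.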
4. b) Or select existing exon(s) and paste to create copies to edit. c) Paste over an existing coordinate to replace old with new. d) Or select a coordinate and use the up and down cursor keys to change its value. e) Or select a coordinate, delete the numbers with the backspace key and type in new numbers. Note: pasting is done by pressing the middle mouse button (often the scroll wheel). The next section describes these menu options: File, Exon, Tools, Attributes. [Screenshot: transcript editing window for AC008073.1-001, showing the exon coordinate list, the transcript name, type Known_CDS, start and end status, a transcript visible remark and annotation remark, and the locus symbol FKBP1B with full name FK506 binding protein 1B, 12.6 kDa, aliases, and locus visible and annotation remarks.] Callouts: Start gives the status of the translation start (CDS only); the number indicates the translation offset. Click on the red annotation button to make a comment private, so that it does not appear in the EMBL file. UTR incomplete is set if the transcript is cut off within the UTR, for example if not all of an mRNA used as evidence can be aligned due
5. Click the buttons with the left mouse button to operate the DNA and 3 Frame translation options. Right-click over the buttons for further options. To remove these displays from Zmap, click on the button again. [Screenshot: Zmap window with the DNA and 3 Frame Translation columns displayed next to transcript AC008073.1-001, and the right-click menus of the DNA and 3 Frame buttons offering Show All and Hide All.]
6. [Screenshot: Blixem in nucleotide mode, with the alignment list sorted by identity.] Figure 1: Nucleotide mode. There are two panes in the detail view, one for each strand. The active strand is shown at the top. The active strand can be changed by hitting the Toggle button or the t shortcut key. [Screenshot: Blixem in protein mode, with the big picture above the detail view and the alignment list sorted by identity.] Figure 2: Protein mode. There are three panes in the detail view, one for each reading frame of the active strand. The other strand can be activated by hitting the Toggle button or the t shortcut key. Active Strand: The act
7. [Screenshot: Edit menu with Edit (Ctrl E), Close all (F4), Copy (Ctrl C), Paste (Ctrl V), New (Ctrl N), Variant and Delete.] 3) Now either use the keystroke shortcut or click on Variant. You will see a new object appear in your main window. 4) The evidence is attached automatically to the new gene object. [Screenshot: clone window listing the transcript and evidence objects, and the transcript editing window for the new object AC008073.1-007 with its exon coordinates.]
8. Getting Started. Running Dotter: As a minimum, Dotter takes the following required arguments: dotter <horizontal_sequence> <vertical_sequence>, where <horizontal_sequence> and <vertical_sequence> are the path names of FASTA files containing the two input sequences. Dotter will assume that the sequences both start at coordinate 1, unless you use the -q and -s arguments to set an offset for the query (horizontal) and subject (vertical) sequences respectively. Run dotter without any arguments to see further usage information. Sequence versus itself: Dotter can be run on a sequence versus itself. This can be useful to analyse internal repeats. If you are comparing a sequence against itself, you will notice that the main diagonal scores maximally, since it is the 100% perfect self-match. Input files: The sequence input files are in FASTA format. Comparisons are allowed between two nucleotide sequences, two protein sequences, or one nucleotide and one protein sequence (note that when comparing a nucleotide and a protein sequence, the nucleotide sequence must be passed first, i.e. as the horizontal sequence). Additional features can be passed to Dotter in a GFF file given as an additional argument. Relevant features include alignments, which can be viewed using Dotter's HSP mode, and transcripts, which are shown at the bottom of the Dotter window. FASTA file: A FASTA file has a header line that starts with > a
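As a minimal sketch of the invocation described in item 8, the helpers below format a FASTA record and assemble a Dotter argument list. The function names are hypothetical, and the spelling of the offset options as `-q` and `-s`, placed before the two file names, is an assumption based on the q and s arguments mentioned above; run dotter without arguments to confirm the usage on your installation.

```python
def format_fasta(name, sequence, width=60):
    """Return one FASTA record: a '>' header line, then wrapped sequence lines."""
    lines = [">" + name]
    for i in range(0, len(sequence), width):
        lines.append(sequence[i:i + width])
    return "\n".join(lines) + "\n"


def build_dotter_command(horizontal_fasta, vertical_fasta,
                         q_offset=None, s_offset=None):
    """Assemble a Dotter command line as a list of arguments.

    horizontal_fasta / vertical_fasta: paths to the two FASTA files.
    q_offset / s_offset: optional coordinate offsets for the horizontal
    (query) and vertical (subject) sequences. The flag names are assumed.
    """
    cmd = ["dotter"]
    if q_offset is not None:
        cmd += ["-q", str(q_offset)]
    if s_offset is not None:
        cmd += ["-s", str(s_offset)]
    # For a nucleotide-versus-protein comparison the nucleotide file must
    # come first, i.e. be the horizontal sequence.
    cmd += [horizontal_fasta, vertical_fasta]
    return cmd


# Self-comparison of one sequence, e.g. to look for internal repeats.
print(" ".join(build_dotter_command("seq.fa", "seq.fa")))
```

The resulting list could be passed to `subprocess.run` on a machine where Dotter is installed; passing the same file twice gives the sequence-versus-itself plot.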
9. [Screenshot: Zmap menu with UnCompress Columns (Shift C), Export, Whole View, Marked, List All Column Features, List This Name Column Features, Feature Search Window, DNA Search Window and Peptide Search Window.] The Compress function removes excess white space by hiding columns that have no features in them, apart from those that have been set to Show in the Columns menu. [Screenshot: bump menu with the options Name Interleave and Name No Interleave.] The tick mark shows that the column is bumped; select the option again to unbump it. Note the red diamonds warning of missing sequence that cannot be aligned. Pfetch returns the EMBL flatfile for that sequence. [Screenshot: ZMap pfetch window showing the EMBL entry for Em:BX442352.2 (sequence version 2; linear mRNA; EST; HUM; 954 BP), a Genoscope human full-length cDNA 5-prime end of a fetal brain clone from Homo sapiens, submitted on 22 April 2003 to the EMBL/GenBank/DDBJ databases.]
10. [Screenshot: Zmap toolbar (Stop, Reload, H Split, V Split, Unsplit, Unlock, Revcomp, 3 Frame, DNA, Columns, Zoom In, Zoom Out, Back) and the information panels showing AC008073.1-001, 1795, 15762, 13968, 6031, 12894, 6864, Transcript, curated, known_cds.] When any of the features are clicked on, information about them is displayed in the panels along the top of the screen, e.g. the feature name or accession number, coordinates, length of match, identity, exon length etc. In the example, AC008073.1-001 is a curated transcript with type known_cds. Place the mouse over the buttons to get further information about their function, such as to reverse complement your sequence. Use the Back button to undo the last marking or zooming action. Some buttons have further options when you right-click over them; the Zoom menu, for example, offers Max (1 bp/line), 10 bp/line, 100 bp/line, 1000 bp/line, All DNA and Min (whole sequence). The DNA button will show the nucleotide sequence. If you click on an exon, the sequence is highlighted in red. You can select a DNA sequence by clicking with the left button and dragging a selection, which you can then paste with the middle mouse
11. [Screenshot callouts:] The 3 nucleotides for selected codon A in reading frame 2 are highlighted in blue. Darker blue indicates the nucleotide whose coordinate is shown in the feedback box. Select an alignment, or click anywhere in an alignment list, to set the active reading frame. Figure 9: Selected reading frame and codon. The toolbar: The detail view toolbar contains the following functions. Note that the Help and Settings buttons are included in the detail view toolbar even though they apply to Blixem as a whole. Figure 5: Detail view toolbar. The buttons are: Help, Sort by, Settings, Zoom in, Zoom out, Go to, First match, Previous match, Next match, Last match, Back one page, Back one index, Forward one index, Forward one page and Find. Help: show help about how to use Blixem. Sort by: select which column to sort the match sequences by. Settings: show the Settings dialog. Zoom in: increase the font size in the detail view. Zoom out: decrease the font size in the detail view. Go to: go to a particular coordinate. First match: go to the first coordinate of the first al
12. same as for multi-select on the Mac, Windows etc. This option will highlight a single exon at a time for each feature, but the accession numbers of each feature and the individual exon coordinates are held in the paste buffer. This is a particularly useful way of selecting Zmap hits to use in the OTF alignment tool, as all selected homologies will be held in the paste buffer and automatically pasted into the OTF accession window. Each of the exon coordinates can also be pasted into the transcript editing window in Otterlace. Once you have selected your HSPs, click on Fetch from clipboard in OTF to paste in the accession numbers. [Screenshot: On The Fly (OTF) Alignment window (brought to you by exonerate), with the query accessions Em:BC050998.1, Em:CR616275.1 and Em:CR626535.1, the Fetch from clipboard and Clear buttons, Fasta file and Fasta sequence inputs, and the parameters Clear existing alignments of same type, Number of transcript alignments to report (0 for all), Only search within marked region and Maximum intron length (200000), with Launch and Close buttons.] 3) You can remove selected features in Zmap by pressing Delete on the keyboard and restore them by pressing Shift Delete (note: on the Mac you need to press Fn Delete and Shift Fn Delete). This is a particularly useful way of removing evidence that you have already assigned to a transcript object. Rapid varia
13. [Screenshot: list of transcript and evidence objects in the clone window, with names beginning AC008073, AC009228, CCDS, ENST, ESTT, PF, genscan and augustus.]
14. Otterlace software and/or database problems are shown in the Error Log. [Screenshot: Error Log window for the human dataset, showing timestamped entries from 2009 such as GET requests to www.sanger.ac.uk, "Authorized OK", "200 OK", the number of bytes received from the server by get_datasets, get_otter_config and get_sequencesets, "Creating a pipeline DataFactory for human" and the Perl error "Can't use an undefined value as a HASH reference", with an Email anacode button (anacode@sanger.ac.uk) at the bottom.]
15. [Screenshot: end of the object list with the Find and Clear buttons, and the Tools menu with Launch ZMap, Launch In A ZMap, Genomic Features (Ctrl G), Dotter Zmap hit, On The Fly (OTF) Alignment (Ctrl X), Rename locus, Re-authorize (Ctrl Shift A) and Load column data.] Launch In A Zmap is used to annotate two open sequences concurrently, side by side. This is useful when looking at the same genomic region between two different strains or even species. See later section. Main Zmap interface: [Screenshot: main Zmap window for chr2-04_24270790-24608801.] This is the main Zmap interface, showing any analysis and annotation that may be present in your region of interest. There are various hidden options that you can reveal by dragging the dotted regions. This scroll bar allows you to move anywhere
16. [Screenshot: Zmap menu entries Blixem DNA Alignments and All Columns (Shift A), above a Feature Search results window. Every legible row ends with vertebrate_mrna vertebrate_mrna mrna_align; the other recoverable values are listed below, one row per line.]
216  Em:CR607749.1  1831  3387  1    1557  100.000000
238  Em:CR626535.1  1833  1967  1    135   100.000000
255  Em:BC002614.1  1839  1967  1    129   100.000000
297  Em:D38037.1    1860  1967  2    109   100.000000
306  Em:S69815.1    1860  1967  2    109   100.000000
310  Em:S69800.1    1860  1967  2    109   100.000000
312  Em:L37086.1    1865  1967  1    103   100.000000
320  Em:AF322070.1  1890  1967  1    78    100.000000
338  Em:AY159324.1  1931  1967  1    37    100.000000
533  Em:BC140549.1  5982  6030  114  162   93.900002
626  Em:BC050998.1  5983  6030  157  204   100.000000
630  Em:CR616275.1  5983  6030  156  203   100.000000
634  Em:CR859971.1  5983  6030  147  194   97.900002
636  Em:BC002614.1  5983  6030  130  177   100.000000
639  Em:D38037.1    5983  6030  110  157   100.000000
17. Group: alignments in a group with a lower order number will appear before those with a higher order number (or vice versa if the sort order is inverted). Alignments in a group will appear before alignments that are not in a group. To delete a group, click one of the following buttons. This will have an immediate effect, i.e. you do not have to click Apply. To delete a single group, click on the Delete button next to the group you wish to delete. To delete all groups, click on the Delete all groups button. Running Dotter: To start Dotter from within Blixem, or to edit the parameters for running Dotter, right-click and select Dotter, or use the Ctrl D keyboard shortcut. The Dotter dialog will pop up. [Screenshot: Blixem Dotter dialog for sequence Q15928.1, with Auto and Manual radio buttons, Start and End boxes, Call on self, Last saved, Full range, Big picture range, HSPs only, Cancel, Save and Execute.] Figure 12: Dotter dialog. Select the sequence you wish to run Dotter on, before or after opening the dialog. The selected sequence name will be shown at the top of the dialog. Alternatively, if you just wish to edit the settings, you do not need to select a sequence. To run Dotter with the default (automatic) parameters, just hit RETURN or click the Execute button. To enter custom parameters, select the Manual radio button and enter the values in the Start and End boxes. To save the p
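The group ordering rule at the start of item 17 can be sketched as a small sort function. This is an illustration only: the function and parameter names are hypothetical, and it assumes that inverting the sort order reverses the order numbers while grouped alignments still come before ungrouped ones, which the text does not state explicitly.

```python
def sort_by_group(alignments, group_order, inverted=False):
    """Order alignment names by group.

    alignments: list of alignment names.
    group_order: dict mapping a grouped alignment name to the order
        number of its group; names absent from the dict are ungrouped.
    inverted: if True, higher order numbers come first (assumed to
        affect grouped alignments only).
    """
    grouped = [a for a in alignments if a in group_order]
    ungrouped = [a for a in alignments if a not in group_order]
    # sorted() is stable, so alignments in the same group keep their
    # original relative order.
    grouped = sorted(grouped, key=lambda a: group_order[a], reverse=inverted)
    return grouped + ungrouped


names = ["a", "b", "c", "d"]
orders = {"b": 2, "d": 1}
print(sort_by_group(names, orders))
print(sort_by_group(names, orders, inverted=True))
```

Here "d" (order 1) is listed before "b" (order 2) in the normal sort, and the two swap when the order is inverted; "a" and "c" follow in both cases.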
18. Institute of Bioinformatics has many tools for analysing nucleotide and protein sequences (http://www.expasy.ch). UCSC genome browser (http://genome.ucsc.edu/cgi-bin/hgGateway). UniProt has protein sequence information (http://www.uniprot.org). Vertebrate Genome Annotation Browser for manual annotation (http://vega.sanger.ac.uk/index.html).
19. [Screenshot: Zmap information panels showing the alignment Em:BC050998.1 (6185 to 6273, match 205 to 293, 89 bp, ungapped alignment, 100.000000, vertebrate_mrna) in chr2-04_24270790-24608801.] Start/end coordinates (if a coding object) and transcript type are inherited from the parent, so these may not be relevant and may need to be changed. Note that the new object is coloured red due to a number of errors. The checking software will not recognise evidence until the object is saved. 5) Once the errors have been removed, save the object to see it appear on Zmap (the evidence used has been highlighted). Splitting windows in Zmap: Use the split window function to effectively reduce the size of the window when looking at homologies. This is of particular use when you have to deal with very large introns, because you can essentially reduce the introns to whatever size you wish, or when there are very many HSPs, because you can keep your gene object in view and static but still scroll across the evidence. The screen can be split horizontally or vertically, as shown, multiple times. Unsplit will remove the last split window. The windows will be locked together when you first open them. To scroll independently within each window, use the Unlock button. An active window must be selected for spl
20. [Screenshot: alignment list with protein matches.] Figure 9: Alignment list sorted by group. Fetching sequences (currently only available to authorised users at the Sanger Institute): Double-click a row to fetch a match sequence's EMBL file. Grouping sequences: Alignments can be grouped together so that they can be sorted, highlighted, hidden etc. Creating a group from a selection: Select the sequences you wish to include in the group by left-clicking their rows in the detail view. Multiple rows can be selected by holding the Ctrl or Shift keys while clicking. Right-click and select Create Group, or use the Shift Ctrl G shortcut key. (Note that Ctrl G will also shortcut to here if no groups currently exist.) Ensure that the From selection radio button is selected and click OK or Apply. If you click Apply, you will be shown the group you just created so that you can edit it. If you click OK, the group will be created with the default properties. [Screenshot: Blixem Groups dialog with the radio buttons From selection, From name (wildcards allowed) and From list, and the Cancel, Apply and OK buttons.] Figure 10: Groups dialog, create group. Creating a group from a sequence name: Right-click and select Create Group, or use the Shift Ctrl G shortcut key (or Ctrl G if no groups currently exist). Select the From name radio butt
21. alignment. If HSPs are missing either the first or last Blast alignments in the set, they are marked with a red diamond at their start or end respectively. This indicates that they do not start at the first base/amino acid and/or do not end with the last base/amino acid of the alignment sequence. The screenshot below shows what options you get when you right-click over a homology (note that you can also select an HSP and type o). You also get further options, such as retrieving the EMBL file for that homology using pfetch and starting Blixem, see later section (note: HSPs do not need to be bumped to use Blixem). [Screenshot: Zmap window with bumped homologies and the right-click menu for Em:BX442352.2 (est_human): Show Feature Details, Set Feature for Bump, Pfetch this feature, Blixem DNA (all matches for this feature), Blixem DNAs (all matches for this column), Show Feature DNA, Export Feature DNA, Column Bump, Column Hide, Column Configure, Column Bump More Opts, Unbump All Columns, Compress Columns.] Note the different coloured lines for bumped homologies. The colouring allows you to see all matches for a piece of evidence instantly, but also how good the alignment is for the feature you bumped. Right-click on the Blast match of interest (in this case an EST) for more menu features.
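The red-diamond rule in item 21 can be illustrated with a short check on the match coordinates of a set of HSPs. This is a sketch under stated assumptions: the function name is hypothetical, coordinates are taken to be 1-based positions on the match sequence, and the match length in the example is invented for illustration rather than taken from any database entry.

```python
def missing_ends(hsps, match_length):
    """Report whether a set of HSPs fails to cover the ends of its match.

    hsps: list of (match_start, match_end) pairs, 1-based and inclusive.
    match_length: total length of the match (evidence) sequence.
    Returns (missing_start, missing_end): True where a red diamond
    would be expected at that end under the rule described above.
    """
    if not hsps:
        raise ValueError("at least one HSP is required")
    first = min(start for start, _ in hsps)
    last = max(end for _, end in hsps)
    return (first > 1, last < match_length)


# Two HSPs covering match positions 2-109 and 110-157 of a
# (hypothetical) 157-base sequence: base 1 is not aligned.
print(missing_ends([(2, 109), (110, 157)], 157))
```

In this example only the start would be flagged, because the alignments begin at position 2 but reach the last base of the match.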
22. and click on Send. [Screenshot: Enter password dialog with the Send button.] 2) Select the species using left click and click on Open, or just double-click. This will open the DataSet window. 3) Select the dataset using left click, then click on Open. [Screenshot: DataSet chooser listing species including cat, chicken, chimp, cow, dog, gibbon, human, lemur, marmoset, medicago and mouse, and the DataSet human window listing sequence sets such as chr1-14 (chromosome 1, GRCh37) through chr21-04 (chromosome 21, GRCh37), mostly dated 2009-06-01.]
23. down (Mac users should use fn and the up/down arrows): up/down by half a page. Ctrl page up / Ctrl page down: up/down by a whole page. Home / End (Mac users should use fn and the left/right arrows): go to far left or right. Ctrl Home / Ctrl End (Mac users will have to configure their keyboards for this): go to top or bottom. Alphanumeric keys: Blixem, all sequences in column; Blixem, only the highlighted sequence in column; bump/unbump the current column within the limits of the mark if set, otherwise bump the whole column; compress/uncompress columns, which hides columns that have no features in them, either within the marked region or, if there is no marked region, within the range displayed on screen (note that columns set to Show will not be hidden); Compress/unCompress columns, which hides all columns that have no features in them within the range displayed on screen, regardless of any column, zoom, mark etc. settings; mark/unmark the features that are currently selected, for zooming/smart bumping; Mark/unMark the whole feature corresponding to the currently selected subpart (e.g. the whole transcript of an exon, or all HSPs of the same sequence as the highlighted one), for zooming/smart bumping; o or O: show menu options for the highlighted feature or column (use the cursor keys to move through the menu, press ESC to cancel it); reverse complement the current view; translate the highlighted item (T hides the translation); zoom out to show the whole sequence; Z: zoom to the extent of any selected feat
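The first compress behaviour in item 23 (hide columns with no features in the marked region or, failing that, in the displayed range, but never hide columns set to Show) can be sketched as a filter. The data layout and all names below are hypothetical and chosen only to illustrate the rule; they do not reflect how Zmap stores its columns.

```python
def columns_to_display(columns, displayed_range, marked_region=None, show=()):
    """Return the names of columns that stay visible after compressing.

    columns: dict mapping column name -> list of (start, end) features.
    displayed_range: (start, end) currently shown on screen.
    marked_region: optional (start, end); used instead of the displayed
        range when a mark is set.
    show: names of columns set to Show, which are never hidden.
    """
    lo, hi = marked_region if marked_region is not None else displayed_range
    visible = []
    for name, features in columns.items():
        # A feature counts if it overlaps the region at all.
        has_feature = any(start <= hi and end >= lo for start, end in features)
        if has_feature or name in show:
            visible.append(name)
    return visible


cols = {"EST": [(100, 200)], "cpg": [], "augustus": [(5000, 6000)]}
print(columns_to_display(cols, (1, 1000)))
print(columns_to_display(cols, (1, 1000), show=("cpg",)))
```

With a mark set, the same call restricted to the marked region would hide any column whose features all lie outside the mark.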
24. factor. The factor is an inverse: a zoom factor of 3 will zoom out by a factor of 3, i.e. the window will shrink to 1/3 of its full size. A zoom factor of 1 will show the window at full size. A factor of less than 1 (e.g. 0.5) can be set in order to zoom in, but this will result in a stretched dot plot, so is not recommended. Horizontal range: Set the range of the horizontal sequence. The maximum range possible is the range that was originally passed to Dotter; the range you enter will be trimmed if you enter out-of-range values. Note that this causes the matrix to be recalculated, so if it took a long time to calculate in the first place, stay away from this menu item. Vertical range: Set the range of the vertical sequence. The maximum range possible is the range that was originally passed to Dotter; the range you enter will be trimmed if you enter out-of-range values. Note that this causes the matrix to be recalculated, so if it took a long time to calculate in the first place, stay away from this menu item. Sliding window size: To make the score matrix more intelligible, the pairwise scores are averaged over a sliding window that runs diagonally. This option allows you to edit the size of the sliding window. There is normally no need to change this. Note that this causes the matrix to be recalculated, so if it took a long time to calculate in the first place, stay away from this menu item. Keyboard shortcuts: Left arrow R
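The three settings in item 24 can be illustrated with a few small functions. These are simplified sketches with hypothetical names, not Dotter's actual implementation: in particular, the diagonal averaging below only keeps cells where a full window fits, which is an assumption made to keep the example short.

```python
def zoomed_size(full_size, zoom_factor):
    """The zoom factor is an inverse: factor 3 gives 1/3 of the full size."""
    return full_size / zoom_factor


def trim_range(requested, allowed):
    """Trim a requested (start, end) range to the originally passed range.

    Assumes the requested range overlaps the allowed range.
    """
    lo, hi = sorted(requested)
    return (max(lo, allowed[0]), min(hi, allowed[1]))


def diagonal_window_average(scores, window):
    """Average pairwise scores over a window running down each diagonal.

    scores: rectangular list of lists; scores[i][j] is the pairwise score
        of vertical position i against horizontal position j.
    Returns a smaller matrix in which cell (i, j) is the mean of
    scores[i + k][j + k] for k in range(window).
    """
    rows, cols = len(scores), len(scores[0])
    out = []
    for i in range(rows - window + 1):
        row = []
        for j in range(cols - window + 1):
            total = sum(scores[i + k][j + k] for k in range(window))
            row.append(total / window)
        out.append(row)
    return out


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(diagonal_window_average(identity, 2))
```

A perfect self-match keeps its maximal score along the main diagonal after averaging, while isolated off-diagonal matches are diluted, which is why the averaged matrix is easier to read.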
25. is visible. Crosshair fullscreen: Toggle whether the crosshair is shown to its full extent or is clipped to just the dot-plot area. Pixelmap: Toggle visibility of the grey-scale dot-plot image. Gridlines: Toggle visibility of gridlines. HSPs off: Select this option to turn HSP (High-Scoring Pair) mode off. Draw HSPs (greyramp): Select this option to view HSPs in grey-scale mode. In this mode, the HSPs are drawn in a shade of grey that is determined by their score. The greyramp tool can be used to adjust the thresholds and contrast of the HSP image. This mode replaces the standard dot-plot image. Draw HSPs (red lines): Select this option to view all HSPs as red lines. This mode can be used in conjunction with the standard dot-plot image (HSPs are drawn over the top). Draw HSPs (color = f(score)): Select this option to view HSPs as solid lines whose colour depends on their score. This mode can be used in conjunction with the standard dot-plot image (HSPs are drawn over the top). Help menu. Help (Ctrl H): Show the Help dialog. About: Show the About dialog. Settings: The settings menu can be accessed by selecting the Settings option on the Edit menu or by pressing the Ctrl S shortcut key. [Screenshot: Dotter Settings dialog with the fields Zoom, Horizontal range, Vertical range and Sliding window size, and a Cancel button.] Figure 17: The Settings menu. Zoom: Specify the zoom
26. position displayed without re-centering. [Screenshot: Zmap window with the locus list.] The middle mouse button (scroll wheel) displays the coordinates, in bp, of your cursor as you move over Zmap. Double left-clicking on a locus will take you to that gene in Zmap, or, if you click with the right mouse button over the locus or on the white space, you will get further options to view Zmap features. Show Variants List shows the variants associated with the locus. Click on the column buttons to order the features by that classification. [Screenshot: right-click menu for locus FKBP1B (Show Variants List, Filter Loci, Show All Loci, Column Bump/UnBump, Column Hide) and the variants list for AC008073.1, with the columns Name, Start, End, Strand, Feature Set and Source, listing curated variants of types known_cds, nonsense_mediated_decay and retained_intron.]
27. [Screenshot: end of the clone list window, with the Note text box and the Set note, Prev Clone, Next Clone and Close buttons.]

6. The Select column data to load box appears next, which allows you to select the analysis and features you wish to see in Zmap. Fewer selected columns will mean a shorter period of time required to open your clones. Selected columns have a yellow box next to them.

[Screenshot: two copies of the Select column data to load window. The legible Name and Description entries are:]

augustus: Ab initio gene predictions from augustus
CCDS_Coding: CCDS transcripts mapped from the Ensembl database
CCDS_Gene: CCDS transcripts from the CCDS database
ChIP_PET_ditags: Paired end tags
ComparaCons: regions conserved across 10 species from the Ensembl Compara group
cpg: CpG islands
das_2waycons_Pseudogene: Pseudogenes agreed between Yale and UCSC
das_Aspic: ASPIC alternative isoforms (Pesole, Bari, Italy)
das_ChromSig: Chromatin Signatures indicating TSS (Ernst, MIT)
das_CONGO_Exons: prot cod comparative exon p
28. section Load column data.

Select Genomic Features to bring up the editing window.

[Screenshot: run dotter dialog with Match name, Reverse strand, Genomic sequence Name, Flank 50000 and Start fields, and Launch and Update buttons. Callout: Dotter alignment of any selected homology in paste buffer to object. See section on Dotter.]

From the Add feature menu select the type of feature you want to add (for example PolyA site, 2bp; Pseudo PolyA signal, 6bp; TATA box). This will then appear in the main box. For polyA features only one of the coordinates needs to be entered, as the other is calculated automatically. If necessary click to toggle the direction (Fwd/Rev). Select the strand before entering coordinates.

Some features are project specific and will be defined when working on that project.

Reload reverts features back to the last save.

[Screenshot: Genomic features for chr2 04_24270790 24608801, with File and Add feature menus and a Delete button on each row. Rows shown:]

PolyA site 152928 Fwd 152929
PolyA site 154945 Rev 154944
PolyA signal 154970 Rev 154965
PolyA site 172400 Rev 172399
PolyA site 191014 Rev 191013
PolyA signal 191033 Rev 191028
Pseudo PolyA signal 284410 Fwd 284415
Pseudo PolyA signal 287261 Fwd 287266
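The rule that only one polyA coordinate needs to be entered can be illustrated with a short sketch. This is not Otterlace's own code: the function name and the length table are hypothetical, and the lengths (2bp for a site, 6bp for a signal) are inferred from the menu labels and the example rows in the Genomic features window.

```python
# Assumed feature lengths in bp, inferred from the Add feature menu labels
# and the example rows; this table is illustrative, not taken from Otterlace.
FEATURE_LENGTH_BP = {
    "PolyA site": 2,
    "PolyA signal": 6,
    "Pseudo PolyA signal": 6,
}


def second_coordinate(feature_type, first, strand):
    """Derive the coordinate that is filled in automatically.

    first  -- the coordinate typed by the annotator
    strand -- "Fwd" or "Rev"
    """
    span = FEATURE_LENGTH_BP[feature_type] - 1
    if strand == "Fwd":
        return first + span
    if strand == "Rev":
        return first - span
    raise ValueError("strand must be 'Fwd' or 'Rev'")


if __name__ == "__main__":
    # Rows taken from the Genomic features window shown in the manual.
    print(second_coordinate("PolyA site", 152928, "Fwd"))
    print(second_coordinate("PolyA signal", 154970, "Rev"))
```

The sketch also shows why the strand should be selected first: the derived coordinate depends on the direction.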
29. shown and highlighted in the alignment lists, and polyA signals are highlighted in the reference sequence nucleotide header. If the sub-option Selected sequences only is enabled, polyA features will only be shown for the currently selected sequences.

Display options

Show Unaligned Sequence: When this option is enabled, any additional unaligned portions of the match sequences are displayed at the start and end of the alignments. If the Limit to sub-option is also enabled, you can specify the maximum number of additional bases to display. If the Selected sequences only sub-option is enabled, only the currently selected sequence(s) will display unaligned portions of sequence.

Show Splice Sites: When this option is enabled, splice sites are highlighted in the reference sequence nucleotide header for the currently selected sequence(s). The two bases from the adjacent introns are highlighted in green if they are canonical or red if they are non-canonical.

Highlight Differences: When this option is enabled, matching bases are blanked out and mismatches are highlighted, making it easier to see where alignments differ from the reference sequence.

Squash Matches: This groups multiple alignments from the same sequence together into the same row in the detail view, rather than showing them on separate rows.

Invert Sort Order: Reverse the default sort order. Note that some columns sort ascending by default (e.g. nam
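As an illustration of the Show Splice Sites colouring, the sketch below classifies the two intron bases at each end of an intron. It is only a sketch under stated assumptions: the manual does not define "canonical", so GT at the donor end and AG at the acceptor end is assumed here; the function name is hypothetical; and only the forward strand is handled.

```python
def splice_site_colours(genomic, intron_start, intron_end):
    """Return (donor_colour, acceptor_colour) for one forward-strand intron.

    Coordinates are 1-based and inclusive. Assumption: only GT...AG
    introns count as canonical.
    """
    donor = genomic[intron_start - 1:intron_start + 1].upper()
    acceptor = genomic[intron_end - 2:intron_end].upper()
    donor_colour = "green" if donor == "GT" else "red"
    acceptor_colour = "green" if acceptor == "AG" else "red"
    return donor_colour, acceptor_colour


if __name__ == "__main__":
    # Toy sequence: exon AAA, intron GTCCCCAG (positions 4 to 11), exon TTT.
    print(splice_site_colours("AAAGTCCCCAGTTT", 4, 11))
```

If an alignment is off by even one base, the acceptor or donor dinucleotide changes and is flagged in red.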
30. text is transcript annotation

[Screenshot: transcript editing window for AC008073 1 001 (Type Known_CDS, Start Found, End Found) with exon coordinates, transcript remarks, and the locus fields Symbol FKBP1B, Full name FK506 binding protein 1B 12.6 kDa, Alias(es) and Remarks. The example remarks read: this text is transcript visible remark; this text is transcript annotation remark; this text is locus visible remark; this text is locus annotation remark. Attributes menu entries visible: annotation completed for experimental confirmation, fragmented locus, orphan, overlapping locus.]

The attributes will appear in remarks windows highlighted in green.

Quality control

Otterlace has a built-in annotation checking system that checks all manual annotation as it is being created, as well as existing manual annotation, flagging up inconsistent gene objects in red. If you mouse over the offending gene object you will see a balloon appear explaining any errors that the checking software has found.

[Screenshot: Otterlace session window (menus File, SubSeq, Clone, Tools) listing gene objects such as genscan 1, ENST335934, ENST313482 and AC008073 2 001.]
31. to missing genomic sequence etc.

[Screenshot: transcript editing window with the callouts below.]

Transcript type menu, in the order shown (see Annotation Guidelines): Coding, Novel_CDS, Putative_CDS, Nonsense_mediated_decay, Transcript, Non_coding, Ambiguous_ORF, Retained_intron, Antisense, Disrupted_domain, IG_segment, IG_gene, Putative, Pseudogene, Processed_pseudogene, Unprocessed_pseudogene, Transcribed_processed_pseudogene, Transcribed_unprocessed_pseudogene, Unitary_pseudogene, Polymorphic_pseudogene, IG_pseudogene, Expressed_pseudogene, Transposon, Artifact, Predicted.

Transcript section: This section provides information relevant to the transcript, including the status of the translation start and stop (Found or Not Found; CDS only) and the transcript notes.

Locus section: This section provides information relevant to the locus (gene).

Click on the red annotation button to make a comment private so that it does not appear in the EMBL file.

File menu: Saving, closing, plus windows for showing translation (Show Peptide, Ctrl-Spacebar) and selecting supporting evidence (Select evidence, Ctrl-E). The other menus are Exon, Tools and Attributes.

Click on any MET with the right mouse button to check the Kozak consensus. The strongest Kozak sequence has either an A at position -3, or a G at -3 plus a G at +4. See the section on Kozak sequence in the annotation guidelines.

[Screenshot: evidence window Evi AC008073 1 001 listing Protein Sw P68106 2 and Tr Q80GU2.]
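The Kozak rule quoted above can be written as a small check. This is a minimal sketch, assuming the usual numbering in which the A of the ATG is position +1, so that -3 is three bases upstream and +4 is the base immediately after the ATG. The function name is hypothetical and the check is not Otterlace's own implementation.

```python
def has_strong_kozak(seq, atg_index):
    """Apply the rule: A at -3, or G at -3 together with G at +4.

    seq       -- transcript sequence (DNA letters)
    atg_index -- 0-based index of the A of the candidate ATG
    Returns None when the flanking bases are not available.
    """
    seq = seq.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    if atg_index < 3 or atg_index + 3 >= len(seq):
        return None  # not enough context to judge
    minus3 = seq[atg_index - 3]
    plus4 = seq[atg_index + 3]
    return minus3 == "A" or (minus3 == "G" and plus4 == "G")


if __name__ == "__main__":
    print(has_strong_kozak("GCCACCATGG", 6))  # A at -3
    print(has_strong_kozak("CCCTCCATGG", 6))  # T at -3
```

A result of None simply means the start codon is too close to the end of the sequence for the flanking positions to be read.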
32. to perform the search. By default Blixem will start the search at the beginning of the reference sequence range. To start the search from the current position, click the Forward or Back button instead of OK. This will start searching from the currently selected base if there is one selected; if not, it will start from the beginning of the current detail view display range when searching forwards, or from the end of the display range if searching backwards.

Repeat a Find

After clicking OK on the Find dialog, press F3 to repeat the search in a forwards direction or Shift-F3 to repeat in a backwards direction. Alternatively, if you had selected the Forward or Back button in the Find dialog, then click the Forward or Back buttons again to jump to the next result in that direction.

Copy and paste

When sequence(s) are selected, their names are copied to the selection buffer and can be pasted to another program by middle clicking in that program. Sequence names can be pasted from the selection buffer into Blixem by hitting the f keyboard shortcut. If the selection buffer contains valid sequence names, those sequences will be selected and the display will jump to the start of the selection. Sequence names can also be pasted from the selection buffer into text boxes in dialog boxes such as the Groups dialog or Find dialog. To copy sequence name(s) to the default clipboard, select the sequence(s) and hit Ctrl-C. Sequence names can then be pa
33. [Screenshot: Exons tab of the Feature Details window, listing Start, End and Stable ID for the exons, with the Details, Exons and Annotation tabs and the remarks "this text is locus visible remark" and "this text is locus annotation remark". Legible rows:]

0001082280
2 5983 6030 OTTHUME00001082283
3 12895 13007 OTTHUME00001082281
4 15145 15762 OTTHUME00001082282

4 Exporting features for gene objects

As described on the previous page, if you right click over any feature, or type o when a feature is highlighted, you get further information. These screenshots show how you can view and export an annotated sequence to your home directory in various different ways, such as dumping features directly.

In the main Zmap window right click on an annotated gene object. From the drop-down menu select Export Feature DNA and choose the sequence required from CDS, transcript, unspliced and with flanking sequence. Alternatively select Export Feature peptide and choose either CDS or transcript.

Here you can see how to Show Feature DNA for annotated gene object AC008073 1 001 in FASTA format: firstly the section of the transcript that corresponds to the CDS, and secondly the whole transcript including the untranslated region (UTR). Note that the shortcut keys are labeled on the right hand side of the panel.

[Screenshot: right-click menu for AC008073 1 001 with the entries Show Feature Details, Set Feature for Bump, Show Translation in ZMap, Show Feature DNA, Export Feature DNA and Show Feature peptide, and the sub-menu CDS, transcript, unspliced, with flanking sequence.]
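The difference between the export choices (CDS, transcript, unspliced, with flanking sequence) can be sketched in a few lines. This is an illustration only, not how Zmap builds its output: the function names are hypothetical, coordinates are 1-based and inclusive, and only a forward-strand object is handled.

```python
def transcript_dna(genomic, exons):
    """Spliced transcript: the exons joined together, introns removed."""
    return "".join(genomic[start - 1:end] for start, end in exons)


def unspliced_dna(genomic, exons, flank=0):
    """Genomic span from first exon start to last exon end, plus optional flank."""
    start = max(1, exons[0][0] - flank)
    end = min(len(genomic), exons[-1][1] + flank)
    return genomic[start - 1:end]


def cds_dna(genomic, exons, cds_start, cds_end):
    """Spliced sequence restricted to the CDS (genomic coordinates)."""
    parts = []
    for start, end in exons:
        lo, hi = max(start, cds_start), min(end, cds_end)
        if lo <= hi:
            parts.append(genomic[lo - 1:hi])
    return "".join(parts)


if __name__ == "__main__":
    # Toy example: two exons (upper case) separated by one intron (ggg).
    genomic = "aaCCCgggTTTaa"
    exons = [(3, 5), (9, 11)]
    print(transcript_dna(genomic, exons))
    print(unspliced_dna(genomic, exons, flank=2))
    print(cds_dna(genomic, exons, 4, 10))
```

The CDS output is always a subset of the transcript output, with the UTR bases at either end left out.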
34. [Screenshot: Choose Dataset window listing species and sequence sets, with Open, Search, Error Log and Close buttons.]

Recover sessions

This allows you to recover sessions that have crashed, or sessions where lace has been exited by pressing the Quit button in the Choose Dataset window. This window will appear automatically when opening a new otterlace session if previous sessions are still present.

[Screenshot: Recover sessions dialog. Its text explains that you will need to recover and exit these sessions, that there may be some work which has not been saved, and asks you to contact anacode if you still get an error when you attempt to exit the session or have information about the error which caused the session to be left. The example session shown is dated Aug 21 15 48 47 for human chr2 04, clones 217 and 218.]

[Screenshot: DataSet human window listing pre-analysis projects (bCX403H19 01, bPA115B9 01) and sequence sets, for example chr1 14 chromosome 1 GRCh37 2009 03 27; chr2 04 chromosome 2 GRCh37 2009 06 01; chr3 03 chromosome 3 GRCh37 2009 06 01; chr4 04 chromosome 4 GRCh37 2009 06 01; chr5 03 chromosome 5 GRCh37 2009 06 01.]
35. [GFF example, continued:]

04 210623 364887 EST Human nucleotide match 79195 79311 95.000000 Target DA692754 1 287 403 percentID 90.6
chr4 04 210623 364887 EST Human nucleotide match 79195 79323 121.000000 Target AI095103 1 326 454 percentID 96.9
##FASTA
>chr4 04 210623 364887
tcttgtttctgtaggagaggccatctccatcagctataaccaaaaaaaaa
acaaaaaactcctctttttgacaagtttgtaaagccctgtccatctgggtc
tataataatcctccaggccctatgccactcctctttattcagccagttca

Configuration file

Note that if the sequence data for the match sequences is not supplied via the sequence tag in the GFF file, then Blixem will try to fetch the data from a server using a program called pfetch. Currently this is only supported for internal users at the Sanger Institute. Details of the server are supplied via a .ini-style configuration file using the -c argument.

The Blixem Window

The Blixem window consists of two main sections: an overview section called the big picture, and a detail section showing the actual sequence data. These sections are separated by a splitter bar, so you can maximise the space for the area you are interested in. You can also hide sections of the window using the View menu. Blixem can show sequences in nucleotide or protein mode.

[Screenshot: Blixem window (Variations, DNA, fwd, offset, gff alignment) for chr4 04_210623 364887, with Zoom out and Whole buttons and a big picture scale running from 119500 to 122000.]
36. [Screenshot: Matches table of the Feature Details window for an HSP. Legible rows, given as Sequence Start, Sequence End, Match Start, Match End, Score:]

2 <NOT SET> 1865 1967 1 103 100.000000
3 <NOT SET> 15145 15760 265 880 100.000000
4 <NOT SET> 5983 6030 104 151 100.000000

Feature Details for an HSP will show alignment information, as well as any gene object it has been assigned to as evidence.

File menu of the Feature Details window: Preserve prevents the window from being reloaded; Close (Ctrl-W) closes it.

Left click once on a gene object and hit return to reveal the Feature Details interface, where you can see the stable IDs (also available by right clicking and selecting Show Feature Details from the popup menu).

[Screenshot: ZMap 0 1 65 Feature Show window for AC008073 1 001, Details tab, showing Start Not Found, End Not Found, the Transcript Stable ID, the Translation Stable ID, the remark "this text is transcript visible remark", the annotation remark "this text is transcript annotation remark" and EST evidence Em AA948705 1, Em AI347804 1 and Em BF969580 1.]

Select the Exon tab to see Stable IDs and coordinates.

[Screenshot: Exons tab showing Locus symbol, Full name FK506 binding protein 1B 12.6 kDa, the Locus Stable ID and the exon table with Start, End and Stable ID columns. First row: 1 1795 1967 OTTHUMEO]
37. [Screenshot: clone list window, continued. Legible rows:]

107P22 completed 2008 02 13 mpk: nothing
220 AC093798 3 RP11 306N14 completed 2008 02 13 mpk: Contains the 5' end of the NCOA1 gene for nuclear receptor coactivator 1 and two CpG islands
221 AC013459 9 RP11 169L20 completed 2008 02 18 mpk: Contains the 3' end of the NCOA1 gene for nuclear receptor coactivator 1, the 5' end of the CENPO gene for CENPO centromere protein O, a novel protein and one CpG island
222 AC012073 9 RP11 443B20 completed 2008 02 13 mpk: Contains the 3' end of the CENPO gene for CENPO centromere protein O, the ADCY3 gene for adenylate cyclase 3, the 3' end of the RBJ gene for rab and DnaJ domain containing, and one CpG island
223 AC013267 11 RP11 28408 completed 2008 02 13 mpk: Contains the 5' end of the RBJ gene for rab and DnaJ domain containing, a ribosomal protein S13 (RPS13) pseudogene, the 5' end of the EFR3B gene for EFR3 homolog B (S. cerevisiae), a succinate CoA ligase ADP forming beta subunit (SUCLA2) pseudogene, a novel gene and two CpG islands
224 AC012457 7 RP11 509E16 completed 2008 02 29 mpk: Contains the 3' end of the EFR3B gene for EFR3 homolog B (S. cerevisiae), the POMC gene for proopiomelanocortin (adrenocorticotropin, beta lipotropin, alpha melanocyte stimulating hormone, beta melanocyte stimulating hormone

[Controls at the bottom of the window: Note text box (containing AC008073), Set note, clear, write access, Show Range (F7), Hunt selection, Refresh Locks, Refresh Ana. Status, Open from chr coords, Run lace, Close.]

[Screenshot: Open Range dialog. Its text begins: It looks as though]
38. [Screenshot: clone list window for clones 204 to 218, the same list as shown earlier, with a callout describing its columns.]

Callout: the columns show, where appropriate, the pipeline status, the date that the clone entry in this window was updated, the annotator responsible for the update, and a free text field with notes about the clone content, entered using the Note text box.

[Screenshot: pipeline status of AC008073 4 RP11 507M3, showing SubmitContig completed 2009 06 01 18 21 28.]
39. [Screenshot: Otterlace session window listing the gene objects in the region, for example ESTT13541, AC008073 7 001, CCDS42659 1, genscan 3, augustus 3, ENST445614, AC009228 1 001, PF00018 and RP11 219F1 1 001, with Find and Clear controls at the bottom.]

Transcript editing window

[Screenshot: SubSeq menu for AC008073 1 001 with the entries Edit, Close all, Copy, Paste, New, Variant and Delete (deletes the selected transcript), each with a Ctrl keyboard shortcut.]

New objects or variants can be built using any existing object as a template, by highlighting it in Otterlace and selecting an option from the menu. You can also choose any object on Zmap as the basis for a new or variant object. See Zmap sec
40. [Screenshot: Otterlace session window listing gene objects such as AC008073 1 001, CCDS1711 2, genscan 4, ENST449230 and RP11 219F1 2 001, with the callouts below.]

Objects are presented in the order and cluster in which they appear on the genome. For example, genscan 1 and genscan 6 are the objects that appear at the top and bottom (5' and 3') of the positive strand of the Zmap screen respectively. Editable gene objects are in bold. Greyed out objects, such as AC104665 1 003, extend beyond the selected contig.

Use the Find option in Otterlace to search for text, object names etc.

The locus and its associated transcripts and exons are attributed stable versioned database IDs (e.g. OTTHUMG00000017411) generated
41. [Screenshot: Otterlace session window and transcript editing window for AC008073 1 007 (Type Nonsense_mediated_decay, Start Found, End Found, Locus Symbol FKBP1B, Full name FK506 binding protein 1B 12.6 kDa), with example quality control balloons: Stops found in translation; translation does not end with stop and End not found is not set; No evidence attached.]

4. The new object will inherit its structure from the HSP. However, you must always check the splice sites of your object in Blixem in case the alignment is incorrect.

[Screenshot: Zmap window toolbar (File, Edit, View, Raise ticket, Help; H Split, V Split).]
42. [GFF example, continued:]

Drs60725655 variant sequence AA
chr4 04 210623 364887 ensembl variation sequence alteration 80799 80799 Name rs57681246 url http 3A 2F 2Fwww ensembl org 2FHomo sapiens 2FVariation 2FSummary 3Fv 3Drs57681246 variant_sequence A C
chr4 04 210623 364887 ensembl_variation SNP 81040 81040 Name rs2352935 url http 3A 2F 2Fwww ensembl org 2FHomo sapiens 2FVariation 2FSummary 3Fv 3Drs2352935 variant_sequence T C
chr4 04 210623 364887 ensembl variation insertion 82229 82230 Name rs35105663 url http 3A 2F 2Fwww ensembl org 2FHomo sapiens 2FVariation 2FSummary 3Fv 3Drs35105663 variant_sequence G
chr4 04 210623 364887 Augustus mRNA 119534 119941 ID transcript21 Name AUGUSTUS00000051712
chr4 04 210623 364887 Augustus exon 119534 119941 Parent transcript21
chr4 04 210623 364887 Augustus CDS 119534 119941 0 Parent transcript21

FASTA file

A FASTA file has a header line that starts with > and contains the sequence name. The next line contains the start of the sequence data. The sequence data can be on a single line or separated by newlines (it is usually separated by newlines every 50 characters to aid readability).

>chr4 04 210623 364887
tcttgtttctgtaggagaggccatctccatcagctataaccaaaaaaaaa
acaaaaaactcctctttttgacaagtttgtaaagccctgtccatctgggtc
tataataatcctccaggccctatgccactcctctttattcagccagttca

Combined GFF and FASTA file

##gff-version 3
##sequence-region chr4 04 210623 364887 44144 154265
chr4
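The FASTA layout described above (a header line starting with >, followed by sequence data on one or more lines) is simple enough to read with a few lines of code. The sketch below is not part of Blixem; the function names are hypothetical, and the whole header after the > is treated as the sequence name, which is an assumption.

```python
def parse_fasta(text):
    """Return {name: sequence} for every record in a FASTA string."""
    records = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is None:
            raise ValueError("sequence data found before any header line")
        else:
            records[name].append(line)
    # Join the wrapped lines of each record into one sequence string.
    return {n: "".join(parts) for n, parts in records.items()}


def wrap_sequence(seq, width=50):
    """Split a sequence into lines of at most `width` characters."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


if __name__ == "__main__":
    example = ">example_seq\nacgtacgt\nttgg\n"  # made-up record
    print(parse_fasta(example))
    print(wrap_sequence("a" * 120))
```

Because the wrapped lines are simply concatenated, it makes no difference to the parser whether the file uses one long line or a newline every 50 characters.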
43. [Screenshot: two Zmap windows showing vertebrate_mRNA alignments side by side, with status bar text for the highlighted match Em AF161523 1. Captions: Human sequence and highlighted human cDNA AF161523; Mouse sequence and highlighted human cDNA AF161523.]

Zmap keyboard and mouse shortcuts

In general Zmap will be faster for zooming, bumping etc. if you make good use of the built-in shortcuts. These can often avoid the need for Zmap to redraw large amounts of data that you may not even be interested in. For example, click once (highlight) on a feature and a carriage return will bring up evidence. Another example is to press T for translation.

All windows: Short Cut

Zmap Window: Short Cut, Action

zoom in/out by 10: Cntl or Cntl
page up page
44. [Screenshot: Dotter alignment tool showing three-frame translations of chr4 04_210623 364887 aligned against protein Q8IYB9 1.]

Figure 15: Alignment tool (nucleotide -> protein mode)

Alignment tool menu

Right-clicking in the alignment tool brings up a context menu with the entries Close, Print, Print colors and Set alignment length. The Set alignment length option allows you to specify how long a portion of the sequences should be shown in the alignment tool.

Greyramp tool

This tool controls the threshold and contrast of the dot plot image. To improve visualization, little peaks (noise) can be nullified by a minimum cut-off. Similarly, significant peaks above a certain score can be saturated by a maximum cut-off. Drag the square handle and the arrows to change the threshold and contrast. The Swap button swaps the positions of the top and bottom arrows, inverting the colours. The Undo button undoes the effect of the last drag. If closed or hidden, the greyramp tool can be shown with the Ctrl-G shortcut or by selecting the Greyramp tool option under the View menu.

[Screenshot: Dotter Greyramp Tool with Close, Swap and Undo buttons.]

Figure 16 Gr
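The effect of the minimum and maximum cut-offs can be pictured as a clamped ramp from background to fully saturated. The sketch below is illustrative only: a linear ramp between the two cut-offs is an assumption (the manual does not state the exact mapping Dotter uses), and the function name is hypothetical.

```python
def grey_fraction(score, min_cutoff, max_cutoff, swap=False):
    """Map a dot plot score to a darkness value between 0.0 and 1.0.

    0.0 means background (noise at or below the minimum cut-off is
    nullified) and 1.0 means fully saturated (scores at or above the
    maximum cut-off). swap=True inverts the colours, as the Swap button does.
    """
    if max_cutoff <= min_cutoff:
        raise ValueError("max_cutoff must be greater than min_cutoff")
    if score <= min_cutoff:
        fraction = 0.0
    elif score >= max_cutoff:
        fraction = 1.0
    else:
        # Assumed linear ramp between the two cut-offs.
        fraction = (score - min_cutoff) / (max_cutoff - min_cutoff)
    return 1.0 - fraction if swap else fraction


if __name__ == "__main__":
    for s in (10, 50, 90):
        print(s, grey_fraction(s, 20, 80))
```

Moving the two cut-offs closer together steepens the ramp, which is what increases the contrast of the image.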
45. [Screenshot: database entry text for a selected feature, ending with the reference Full Length cDNA Libraries and normalization, Unpublished.]

Blixem (callout): Allows you to inspect the sequence of just the chosen feature, or all of the columns, aligned horizontally down to either the nucleotide or amino acid level against the genome. See the later section on Blixem.

Bump menu (callout): This menu allows you to change the way that bumping is displayed. There are multiple bump options but the default is the most useful.

Searching for a sequence in Zmap

DNA and peptide search windows are provided from within Zmap and can be accessed by right clicking on Zmap space and selecting the option at the bottom of the menu. Both search windows are shown below.

[Screenshot: Peptide Search window. Callouts: enter the peptide; set the Strand, Frame, Start and End coords of the search.]

[Screenshot: DNA Search window with the query atgggcgtggag entered and End set to 338012. Callouts: enter the query sequence; set the Strand, Start and End coords of the search; set the Maximum Acceptable Error Rates (Mismatches, N bases).]

[Screenshot: Matches for chr2 04_24270790 window.] The results of the search are displayed in a new box with the number of matches found, strand
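To show what the Mismatches and N bases limits in the DNA search window mean, here is a minimal sliding-window sketch. It is not Zmap's search code: the function name is hypothetical, only the forward strand is searched, and treating N in the genomic sequence as a wildcard counted against its own limit is an assumption.

```python
def find_dna_matches(genomic, query, max_mismatches=0, max_n_bases=0):
    """Return (start, end, mismatches) for each acceptable match.

    Coordinates are 1-based and inclusive, forward strand only.
    """
    genomic = genomic.lower()
    query = query.lower()
    hits = []
    for i in range(len(genomic) - len(query) + 1):
        window = genomic[i:i + len(query)]
        n_bases = window.count("n")
        # Count positions that differ, ignoring N (counted separately).
        mismatches = sum(
            1 for g, q in zip(window, query) if g != q and g != "n"
        )
        if mismatches <= max_mismatches and n_bases <= max_n_bases:
            hits.append((i + 1, i + len(query), mismatches))
    return hits


if __name__ == "__main__":
    # Query taken from the DNA Search screenshot; the genomic string is made up.
    print(find_dna_matches("ccatgggcgtggagtt", "atgggcgtggag"))
    print(find_dna_matches("ccatgggcgtcgagtt", "atgggcgtggag", max_mismatches=1))
```

With both limits left at zero only exact matches are reported; raising either limit can quickly increase the number of hits on a large region.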
46. [Screenshot: Otterlace session window listing gene objects such as AC008073 1 005, CCDS1709 1, genscan 4, ENST449230 and RP11 219F1 2 001, with Find and Clear controls, above a Zmap window (File, Edit, View, Raise ticket, Help; H Split, V Split, Revcomp, 3 Frame, DNA, Columns, Zoom in, Zoom out, Back; range 1 to 338012; Data loaded).]

2. Click on the HSP that will give its coordinates to the new variant object.

[Status bar for the selected HSP: Em BC050998 1, 1153, 6185 6273, 205 293, 89, UNGAPPED ALIGNMENT, 100.000000, Alignment, vertebrate_mrna, chr2 04_24270790 24608801.]
47. [Screenshot: Zmap DNA column with the three-frame translation displayed alongside.]

The 3 Frame button will show the amino acid sequence in each of the three reading frames.

Show feature details

Right click on a gene object (or press the o key when it is highlighted) to see information on otter IDs and Ensembl IDs. For BLAST hits, double click on the HSP to get the feature interface, where you will find details on the alignment and on what HAVANA object the HSP has been assigned to, if any.

[Screenshot: Zmap window with transcript AC008073 1 001 (1795 15762, known_cds) selected, and the ZMap 0 1 65 Feature Show window for Em L37086 1. Fields shown: Feature Name Em L37086 1; Feature Group; style_id mRNA_align; Taxon ID 9606; Description Homo sapiens FK 506 binding protein (fkbp12 6) gene, complete cds; Evidence for transcript AC008073 1 001; Align Type dna; Query length 880. The Matches table has the columns Sequence, Strand, Sequence Match, Sequence Start, Sequence End, Match Start, Match End and Score; its first row reads 1 <NOT SET> 12895 13007 152 264]
48. [Screenshot: Select column data to load window, continued. Legible Name and Description entries:]

alternative isoforms (Pesole, Bari, Italy)
das_ChromSig: Chromatin Signatures indicating TSS (Ernst, MIT)
das_CONGO_Exons: prot cod comparative exon predictions (Kellis, MIT)
das_CRG_U12: U12 genes (Guigo, CRG)
das_Evigan: Gene predictions by the program Evigan (Pereira, UPenn)
das_Exonify: evolutionarily conserved protein coding exons (UCSC)
das_GenTrack: Entries marked with important flags in AnnoTrack
das_peptide_atlas: Peptides from http://www.peptideatlas.org
das_Siepel_NovelLoci: Novel loci predictions (Siepel, UCSC)
das_transMap_MRna: TransMap cross species alignments (Diekhans, UCSC)
das_transMap_RefSeq: TransMap cross species alignments (Diekhans, UCSC)
das_transMap_SplicedEst: TransMap cross species alignments (Diekhans, UCSC)
das_transMap_UcscGenes: TransMap cross species alignments (Diekhans, UCSC)
das_UCSC_NcIntrons: Non coding introns (Diekhans, UCSC)
das_UCSC_RetroAli3: RetroFinder pseudogenes (Diekhans, UCSC)
das_UCSC_YtLuPseudogenes
das_Washu_human_PASA_ESTs: ESTs used for PASA (van Baren, WUSTL)
das_Washu_NSCAN1: NSCAN predictions (van Baren, WUSTL)

[Buttons: Default, Previous, All, None, Invert, Load. Status: columns loaded.]

Load column data gives you the option to load in further column data to a session that is already open.

On The Fly (OTF) alignment uses exonerate to align sequences to Zmap. These can be single sequences, multiple seq
49. and genomic coordinates.

[Screenshot: match list with the single result atgggcgtggag 1931 1942, above a Zmap window for chr2 04_24270790 24608801.]

The position of the matching sequence is shown by a red block. If you click on the red block while the genomic DNA sequence is displayed, your match will be highlighted in the DNA sequence column (not shown).

Searching for a feature in Zmap

This option allows you to list all the features contained in a column in one window. There are further options for you to search within these results to find a specific feature. The list of column features can be exported as a GFF file via the File menu.

[Screenshot: search dialog with Frame, Start, End, Locus and Style fields. Callout: Note the format needs to be correct.]

[Screenshot: feature list window for vertebrate_mrna (menus File, View, Operate).] This lists all the accession numbers and associated information for the column vertebrate_mrna. The results can be ordered using the buttons at the top.

Legible rows, given as Name, Start, End, Query Start, Query End, Score, Feature Set, Source, Style (the strand columns are not legible):

6 Em U61167 1, 191013, 194286, 746, 4018, 99.699997, vertebrate_mrna, vertebrate_mrna, mrna_align
1 Em U61167 1, 198209, 198400, 554, 745, 100.000000, vertebrate_mrna, vertebrate_mrna, mrna_align
4 Em U61167 1, 198877, 198973, 457, 553, 99.000000, vertebrate_mrna, vertebrate_mrna, mrna_align
5 Em U61167 1, 200709, 200830, 335, 456, 100.000000, vertebrate_mrna, vertebrat
50. and tracked within the Otter database. Whenever a gene locus is edited, the version number will increase and the date of the change will be saved, allowing the user to find out when the annotation was last updated. It should be noted that versioning occurs within the database and such changes are not externally visible. Clearly it is vital that current Otter IDs are not deleted, only modified, unless the object is no longer valid.

Subseq menu: Editing operations on the transcripts listed in the window.

To edit existing annotation, double click on the feature in Otterlace, or highlight your object and use the drop-down menu, or double click in Zmap.

[Screenshot: Otterlace session window (menus File, SubSeq, Clone, Tools; Read Only) listing gene objects such as genscan 1, AC008073 1 004, CCDS1706 1, augustus 5, ESTT13529, AC008073 1 001, AC008073 5 001 and ENST313482.]
51. [Screenshot: Zmap scroll navigator panel] The red box shows the extent of the sequence displayed in the main window. As you zoom in, the area within the red box gets smaller. To make the area larger, use the Zoom out button. The area within the red box (far left) is displayed in the main window, showing the analysis, any previously annotated loci or any imported genes that are present in the clone. This panel has a scroll bar to show you where you are within the chromosome. It will allow you to jump to different regions. It is generally only useful if you open up very large sections of a chromosome. Navigating in Zmap and zooming options: 1. Navigate by using the scroll bars or the middle mouse button. By clicking the middle mouse button anywhere in Zmap you will see a horizontal line. You can move this up and down, and the relative position in bp will be displayed along the line. When the button is released, the window will refresh, centring on the position of the line. You can also click in the window to make it active and use the scroll wheel to navigate up and down, or achieve the same result using the scroll bar on the right hand side of the window. If you release the mouse outside the Zmap window you can then check the sequence
52. ansMap_SplicedEst TransMap cross species alignments Diekhans UCSC F das_UCSC_NeIntrons_V3 Introns with non canonical splicing V3 Diekhans UCSC are Oa e su CCess U y das_transMap_UcscGenes TransMap cross species alignments Diekhans UCSC das_UCSC_RetroAli3 RetroFinder pseudogenes Diekhans UCSC F das_UCSC_NcIntrons Non coding introns Diekhans UCSC das_UCSC_YtLuPseudogenes das_UCSC_RetroAli3 RetroFinder pseudogenes Diekhans UCSC F das_Washu_human_PASA_ESTs ESTs used for PASA van Baren WUSTL la das LES E Vel UP esidogenem das_Washu_NSCAN1 NSCAN predictions van Baren WUSTL das_Washu_human_PASA_ESTs ESTs used for PASA van Baren WUSTL das_Washu_NSCANL NSCAN predictions van Baren WUSTL de Haan DEN MDNR wand Daa DACA faa Denne mern Default Previous All None Invert Reselect failed Load 0 columns loaded Gane Default Previous All None Invert Load Loading phastConsd4d 6 of 30 aa Failed columns turn red mouse over for detai Is 6 Cl ick button E eponine Clusters of transcription factor binding sites from Thomas Down s algo m EST_Human Human EST blast hits realigned using est2genome to retu rn status to M Problem with the web server 1 EST blast hits realigned using est2genome 1 blast hits from species other than human or mouse realigned using yellow to retry 7 Click load to run Otterlace and Zmap Further description information for the column data can be found here htt
53. arameters without running Dotter, click Save and then Cancel. To save the parameters and run Dotter, click Execute. To revert to the last saved manual parameters, click the Last saved button. To revert to automatic parameters, click the Auto radio button; the coordinates in the Start and End boxes will be recalculated for the currently selected sequence. Reference sequence versus itself: To run Dotter on the reference sequence versus itself, select the Call on self tick box in the Dotter dialog and then click Execute. This can be useful for analysing internal repeats; see the Dotter manual for more information. Dotter HSPs only: This starts Dotter in HSP (High Scoring Pair) mode. See the Dotter manual for more information. Settings: The settings menu can be accessed by right clicking and selecting Settings, or by the shortcut Ctrl S. Features. Highlight variations: When this option is enabled, bases in the reference sequence that have known variations (SNPs, insertions, deletions etc.) are highlighted in the reference sequence nucleotide header. If the Show variations track sub option is also enabled, an additional line is shown above the nucleotide header showing the alternative bases for each variation. Note that the Variations track can be quickly enabled or disabled by double clicking the nucleotide header. Show polyA tails: When this option is enabled polyA tails are
54. ce CDS start 1931 CDS 15273 a 1795 P 1967 gt A 5983 db 6030 gt Canonical splice ag 1289 13007 at sites are highlighted ag 15145 P 15762 7 in green EJ e Transcript 3 1 001 CDS line does not appear aie man in non coding transcripts Stary Foha End _ Found which is governed by the i eyfarks transcript type TT ME Locus Symbol FKBP16 y Known Full name FK506 binding protein 18 12 6 kDa Splice sites are Aliastes checked for the Remarks following sequences ag exon gt ag exon glgc Exon boundaries Orientation is shown by 900 X AC008073 1 001 either a pP Or between File Exon Tools Attributes rdin n n a d a os d Pr De 1931 CDS 15273 E change oldin 8 oe 5 1795 dh 1967 ot Non canonical control and clicking over gg 5984 Hso39 o ji t i splice sites are the or sign ag 12895 PM8008 ta ene ag 15145 db 15762 highlighted in El p red and need to Transcript be checked Name AC0O05073 1 001 Type Known_CDS Start Found End Found Remarks Annotation Locus Symbol FKBP1B x F Known Full name FK506 binding protein 1B 12 6 kDa Aliases Remarkst 10 Changing the coordinates can be done a number of ways a Copy coordinates from Blixem see section on Blixem or select a block of your choice exon homology in Zmap and paste coordinates in white space to create new exon s
55. choice in transcript editor to select both exon coordinates shift click for multiple selections Select eee Reverse Ltrl k Reverse all coordinates Trim CDS Ctrl T cntrl click to toggle Sort Ceri ae Trims peptide Merge tr sequence to first De ete Ctrl D i stop Bl codon This tool IS also available in the translation window Sort exons also orients JX XA 008073 1 001 File Exon Tools Attlributes coordinates correctly 15273 1931 A 15762 15145 ct ac 13007 12895 ct ac 6030 5983 ct a 7 __ lua Transcript Name ACOO0S073 1 001 Type Known_CDS Start Found a End Found Remarks this text is transcript visible remark this text is transcript annotation remark Merge overlapping exons Delete highlighted selected exon s Changing the orientation shows the splice sites as being incorrect because they are now on the opposite strand Aliastes yumbol FKBP1B y Known name FK506 binding protein 1B 12 6 kDa this text is locus visible remark Remarks ae this text is locus annotation 13 Tools menu Informative operations to run on the transcript Check annotation Ctrl C Hunt in map Dotter Ctrl H Er Search Par Eerl P Rename locus Renames all transcripts of a locus to a new locus name X rename locus to FKBP1B Rename locus FKBP1B Rename Ele Edt Yiew Higtory Bookmarks Joo
56. ct the clone that ACOOSOFS 4 ACOOSOFS 4 ACOOI226 4 ACOOFSSS 4 requires updating the SequenceSet You can also open this window by double clicking on the clone display in Zmap See Zmap section 5 X Clone AC008073 4 The DE line can be automatically Clone generated but must be edited Name AC008073 4 Accession Sv AC008073 4 further as it is unable to deal with ee cccco Mean o R 5 or 3 ends of genes See a e A Assembly annotation guidelines Start 1 End 166460 Strand Fwd Properties Keywords FKBP1B fo one per ITSN2 line gt PFN4 TP53I3 lescription Contains the 3 end of the FKBP1B gene for FK506 binding protein 1B 12 6 kDa the TP5313 gene for tumor protein p53 inducible j the PFN4 gene for profilin family member 4 two novel genes the 5 end of a novel gene the 3 end of the ITSN2 gene for intersectin 2 and five CpG islands Generate Click to generate DE Line Remarks Save Close y Private remarks can be added here and will not be seen in the EMBL header Click Save to add the DE line to the current session Tools menu Useful things to run on the genomic sequence being annotated File SubSeq Clone Use this to relaunch Zmap if it was accidentally Genomic Features Ctrl G Dotter Zmap hit By of pl On The Fly lt OTF Alignm Ctrl x oaii peL closed For Zmap options see Zmap
57. d codon; darker blue indicates the nucleotide whose coordinate is displayed in the feedback box. Alignment list header, pale red background: STOP codon (protein mode). Alignment list header, green background: MET codon (protein mode). Keyboard shortcuts: Ctrl Q: Quit. Ctrl H: Help. Ctrl P: Print. Ctrl S: Edit settings. V: Show/hide sections of the display. Shift Ctrl G: Create group. Ctrl G: Edit groups (or create a group if none currently exist). Ctrl A: Select all sequences in the current list. Shift Ctrl A: Deselect all sequences. Ctrl D: Dotter. Left arrow: Move coordinate selection one index to the left. Right arrow: Move coordinate selection one index to the right. Shift Left: Same as Left, but in protein mode it scrolls by a single nucleotide. Shift Right: Same as Right, but in protein mode it scrolls by a single nucleotide. Ctrl Left: Scroll to the start/end of the previous alignment. Ctrl Right: Scroll to the start/end of the next alignment. Up arrow: Move row selection up. Down arrow: Move row selection down. Home: Scroll to the start of the display. End: Scroll to the end of the display. Ctrl Home: Scroll to the start of the first alignment. Ctrl End: Scroll to the end of the last alignment. Further keys: Ctrl Ctrl Shift Ctrl WN oars Ctrl 1 Ctrl 2 Shift Ctrl 1 Shift Ctrl 2. Further actions: Zoom in detail view. Zoom out detail view. Zoom in big picture. Zoom out big picture. Zoom out big picture to view the whol
58. de Export Feature peptide gt CDS Column Bump B transcript Column Hide unspliced Column Configure gt with flanking sequence Column Bump More Opts b Unbump All Columns CDS Compress Columns E transcript UnCompress Columns Shift C i Export Whole View Configure This Column List All Column Features Configure All Columns List This Name Column Features Feature Search Window DNA Search Window Peptide Search Window Toggle Mark When exporting sequence you will get the first window when exporting a predefined feature and the second DNA Features Context one when you need to select a specific region 29 attggcaaacaggaagt gagccacggg X chr2 04_24270790 24608801 AC008073 1 001 gt chr2 04_24270790 end AC008073 1 at rtd A ica gtggagatcgagac ccggagacggaaggac aagaagggccaaacgtgtgt earn acaca ggaatgcetccaaaa to ggaagaagtttgatteatec i AES ee We gggtgcagcccagat gagc ttgggge agagggc tgac eh cctgatgtggcatatg ccacgggccacccc ggtgtcatccctc en catcttt gacgtggagcetgetcaacttagagtga X chr2 04_24270790 24608801 AC008073 1 001 ctccagccgcacctcctce ag cagggacc cgagacca cteccccg si caga artan aagaagggcc aatgc catc rik ents agaa cagga te a ctgca acgggccac gtcatccctcccaa gce catctttga ge g cactgcctcatggcatcatcca ae atts caca ri tta ccacacacacaaggtoc tc aoe ates cate cagagggacttgagccag tta ce tgtcactttctctcttata ett ctgttagctgctc ed aat gtcctctttgagaaaat gtaa ta aaggctctgtge
59. des the sequence name and optional data, such as organism and tissue type, that can be parsed from EMBL files (currently only available to authorised users). To load optional data, see the Settings dialog. Note that the optional data may be incomplete due to the inconsistent information available from the EMBL files. The main menu: Right click anywhere in the Blixem window to pop up the main menu. [Screenshot: Blixem main menu, which also includes Toggle match set group (g)] The options are: Quit (Ctrl Q): Close Blixem and any spawned processes. Help (Ctrl H): Display the user help. Print (Ctrl P): Printing options. Settings (Ctrl S): Edit settings. View (V): Show/hide parts of the display. Create Group (Shift Ctrl G): Create a group of sequences. Edit Groups (Ctrl G): Edit properties for groups. Deselect all (Shift Ctrl A): Deselect all sequences. Dotter (Ctrl D): Run Dotter on the currently selected sequence. Hiding sections of the window: Use the View dialog to show/hide sections of the window. 1. Right click and select the View option, or hit the V shortcut key. 2. Toggle check marks on or off to show/hide sections. Blixem View panes Big picture Sho
60. distance metrics See here for more details http sonnhammer sbc su se Belvu htm 14 Attributes menu Controlled annotation vocabulary for transcript and locus X AC008073 1 001 File Exon Tools Attrib Wes f 1931 Transcript F NMD exception 17 Locus alternative 5 UTR 5983 ob 6030 12895 J 13007 15145 dh 15762 ag ag ag non submitted evidence not ordanism supported Ki Transcript readthrdugh Name ACO08073 1 001 Type Known_CDS Start Found End Remarks this text is remark lremark Locus Symbol transcript visib transcript annotavion FKBP1B Y F Known Full name FK506 binding protein 1B 12 6 kDa Aliastes Remarks this text is locus visible remark 3 this text is locus annotation for experimental confirmation not bast in genome evidence File Exon Tools Attributes controlled vocabulary can be assigned to the gene object from the Attributes menu as well as being available as right click menus in the transcript and locus remark fields They are attached to either the transcript left or locus field right OOO x AC008073 1 001 File Ekon Tools Attributes 18 CDS 1795 dh 1967 gt A 5983 P 6030 gt a 12895 JB 13007 gt agNX 15145 15762 EL Transcript Name ACOOS073 1 001 Type Known_CDS Remarks Locus Symbol Full name Aliasies t Remarkst this text is tranzcrip remark this
61. e start end and some sort descending score and ID. This option reverses that sort order. General settings. Font: Allows you to change the font that is used to display alignments in the detail view. Note that you must select a monospace font, otherwise matches will not be shown aligned correctly. Blixem will warn you if the font you have selected is not monospace. Fetch mode: Allows you to change the program used to fetch sequence EMBL entries. Currently only available to authorised users within the Sanger Institute. Columns. Load optional data: Click this button to load optional data from EMBL entries (currently only applicable to authorised users within the Sanger Institute). Note that this operation can take a long time if there are many sequences. The button will be greyed out once optional data has been loaded. Column visibility: Tick/un-tick the check marks to show/hide individual columns. Adjust the column width by entering the new width (in pixels) in the text box. Note that if you enter a zero width, the column will be hidden regardless of whether the check mark is ticked or not. Greyed out columns are optional data columns and will only become available once optional data has been loaded. Grid properties. ID per cell: Use this to change the vertical scale of the grid; a smaller value means the grid will be more spaced out, a larger value means the grid will be more compact. Max ID: Defines the maximum cut o
62. [Contents, continued: The alignment tool; Annotation resources] Otterlace User Manual. Written by Charles Steward and Laurens Wilming (cas@sanger.ac.uk, lw2@sanger.ac.uk), Wellcome Trust Sanger Institute. Otterlace: Otterlace is an interactive graphical client which uses a local acedb database with Zmap and Perl/Tk tools to curate genomic annotation. Annotation is stored in an extended Ensembl schema, the otter database, which presents the annotator with contiguous regions of a chromosome. The acedb database provides local persistent storage, so that if the software or desktop machine crashes, reboots or is exited, the editing session can be recovered. Since all communication goes through the Sanger web server, annotators can work wherever there is a network connection. Starting an Otterlace Session: Type otterlace & in a terminal window. If you are using Mac OS X, double click on the otterlace icon. You will be required to authorise your session by entering your password. If you experience any problems, email anacode@sanger.ac.uk. [Screenshot: Choose DataSet window with Open, Recover sessions and Quit buttons] 1. Enter your password in the box
63. e are three sections in the detail view in protein mode, one for each of the three reading frames for the active strand. Only the active strand is shown. To view the other strand, toggle the display using the Toggle strand button or the t shortcut key. In protein mode, the yellow header bars show the translated reference sequence for that reading frame. STOP and MET codons in the reference sequence are highlighted in red and green respectively. There is also an additional header section at the top showing the nucleotide sequence. [Figure 8: Alignment lists (protein mode). The main header shows the nucleotide sequence; the list headers show the 3-frame translation, with separate alignment lists for Frame 1, Frame 2 and Frame 3.] In the nucleotide sequence header, codons are read from top to bottom and then left to right, starting at row 1 for frame 1, row 2 for frame 2, etc. Middle clicking on a coordinate will highlight the three nucleotides for the selected codon and the currently active reading frame (by default frame 1). Left clicking in an alignment list sets the active reading frame
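The three reading frames described above can be illustrated with a short Python sketch. It assumes the standard genetic code and is not Blixem's own implementation; the names `CODON_TABLE`, `translate_frame` and `three_frame_translation` are hypothetical. In the output, `M` marks a MET codon and `*` marks a STOP codon, which are the residues Blixem highlights in green and red.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(dna, frame):
    """Translate one reading frame (1, 2 or 3) of the forward strand."""
    seq = dna.upper()
    start = frame - 1  # frame 1 starts at the first base, frame 2 at the second, ...
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    # Unknown codons (e.g. containing N) are shown as X.
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna):
    """Return a dict mapping frame number to its translated peptide."""
    return {frame: translate_frame(dna, frame) for frame in (1, 2, 3)}


frames = three_frame_translation("atgggcgtggag")
for frame, peptide in frames.items():
    print(f"Frame {frame}: {peptide}")
# Frame 1: MGVE
# Frame 2: WAW
# Frame 3: GRG
```

Only the active (forward) strand is translated here; the other strand would need the reverse complement first.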
64. e reference sequence. Scroll left one coordinate. Scroll right one coordinate. Go to position. Toggle the active strand. Toggle the match set Group. Toggles visibility of the 1st alignment list. Toggles visibility of the 2nd alignment list. Toggles visibility of the 3rd alignment list (protein mode only). Toggles visibility of the 1st big picture grid. Toggles visibility of the 2nd big picture grid. Toggles visibility of the 1st exon view. Toggles visibility of the 2nd exon view. Footnotes: Only applicable if a coordinate is currently selected (middle click a coordinate to select it). 3: Limited to just the selected sequences if any are selected; otherwise acts on all sequences. Dotter User Manual. Written by Gemma Barson (gb10@sanger.ac.uk), Wellcome Trust Sanger Institute, 18 January 2011. Dotter: This manual explains how to configure, run and use Dotter. Dotter is a graphical dot plot program for detailed comparison of two sequences. Every residue in one sequence is compared to every residue in the other sequence. The first sequence runs along the x axis and the second sequence along the y axis. In regions where the two sequences are similar to each other, a row of high scores will run diagonally across the dot matrix. Dotter is maintained by the Wellcome Trust Sanger Institute and is available as part of the SeqTools package. The software can be downloaded from the Sanger Institute's website: http://www.sanger.ac.uk
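The dot plot idea can be sketched in a few lines of Python. This is a deliberately simplified illustration that scores each pair of residues as 1 for an exact match and 0 otherwise; it is not Dotter's actual scoring scheme, and the function names `dot_matrix` and `longest_diagonal` are hypothetical. The first sequence runs along the x axis (columns) and the second along the y axis (rows), so a shared region shows up as a diagonal run of matches.

```python
def dot_matrix(horizontal, vertical):
    """Compare every residue of one sequence with every residue of the other.

    Returns one row per residue of the vertical sequence; each row holds
    1 where the residues are identical and 0 where they differ.
    """
    return [[1 if h == v else 0 for h in horizontal] for v in vertical]


def longest_diagonal(matrix):
    """Length of the longest unbroken top-left to bottom-right diagonal of matches."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    run = [[0] * cols for _ in range(rows)]
    best = 0
    for y in range(rows):
        for x in range(cols):
            if matrix[y][x]:
                # Extend the run ending at the cell diagonally up and to the left.
                previous = run[y - 1][x - 1] if y > 0 and x > 0 else 0
                run[y][x] = previous + 1
                best = max(best, run[y][x])
    return best


matrix = dot_matrix("ACGTACGT", "TACG")
for row in matrix:
    print("".join("#" if cell else "." for cell in row))
print(longest_diagonal(matrix))  # 4: "TACG" occurs within "ACGTACGT"
```

A real dot plot program would typically smooth the scores over a sliding window and use a substitution matrix so that conserved, not just identical, residues contribute to the diagonal.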
65. e using the left and right arrow keys. You can move the selection to the start/end of the previous/next match by holding Ctrl while using the left and right arrow keys (limited to just the selected sequences if any are selected; includes all sequences otherwise). Finding sequences: The Find dialog allows the user to search for sequences by name. Press the Find button on the toolbar or hit the Ctrl F shortcut key to open the Find dialog. [Figure 7: Find dialog, with options for Sequence name search (wildcards * and ?), DNA search and Sequence name list search, and Back, Forward, Close and OK buttons] There are three search modes. Sequence name search: Search for match sequences by name. The wildcard * means any number (or zero) of any character, and ? means one character, which can be any character. Any sequences whose names match the search string will be selected and the display will scroll to the start of the selection. DNA search: This searches for a given sub-sequence of nucleotides in the reference sequence. If the sub-sequence is found, the display will scroll to the start of the sub-sequence and the first base in the sub-sequence will be selected. Sequence name list search: The same as Sequence name search, but for multiple sequences. Each sequence name should be on a separate line. Enter your search text in the appropriate box and click the OK button
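The two name-based search modes can be approximated with the following Python sketch. It assumes the two wildcards are the conventional `*` (any number of characters, including none) and `?` (exactly one character), and that matching is case sensitive; neither the function name `find_sequences` nor the example sequence names come from Blixem itself. Python's `fnmatch` module also treats `[` specially, which Blixem may not.

```python
from fnmatch import fnmatchcase


def find_sequences(names, search_text):
    """Return the names that match any pattern in search_text.

    search_text may hold a single pattern (sequence name search) or
    several patterns, one per line (sequence name list search).
    """
    patterns = [line.strip() for line in search_text.splitlines() if line.strip()]
    return [
        name for name in names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical sequence names, used only for illustration.
names = ["Em:U61167.1", "Em:CR859971.1", "Sw:Q16587.2"]

print(find_sequences(names, "*U61167*"))        # ['Em:U61167.1']
print(find_sequences(names, "Em:CR85997?.1"))   # ['Em:CR859971.1']
print(find_sequences(names, "Sw:*\nEm:U*"))     # ['Em:U61167.1', 'Sw:Q16587.2']
```

Wrapping an accession in `*` on both sides is a convenient way to search when the database prefix and version suffix are not known.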
66. e_mrna mrna_align for Zmap SO use as a wild card 7 Em U61167 1 200917 200962 289 334 100 000000 vertebrate_mrna vertebrate_mrna mrna_align i 8 Em U61167 1 204425 204591 122 288 100 000000 vertebrate_mrna vertebrate_mrna mrna_align For example accession numbers 2 Em U61167 1 206447 206511 57 121 100 000000 vertebrate_mrna vertebrate mma mrna_align 3 Em U61167 1 209968 210003 21 56 100 000000 vertebrate_mrna vertebrate_mrna mrna_align may have a database prefix and version suffix such as Em U61167 1 vertebrate_mma O e 2 right mouse to activate this e Column Hide menu Select Show feature List Column Configure gt Column Bump More Opts gt Unbump All Columns Compress Columns Z UnCompress Columns ift C Export Whole View gt X vertebrate_mrna Show Feature List oa File View Operate Help TU M 1 1 1 1 Feature Search Window Search Strand Query Start Query End Query Strand Score Feature Set Source Style DNA Search Window 1 199 86 90000 ertebrate ming vertebrate mens mens align i Preserve Peptide Search Window 3 at Bz Export gt Export results as GFF file Toggle Mark M Close Ctri W t Io a Show Style 141 Linenvaver sa wis agus 1 155 100 000000 vertebrate_mrna vertebrate_mrna mrna_align 169 Em CR859971 1 1822 1967 2 146 97 300003 vertebrate_mrna vertebrate_mrna mrna_align Blixem DNA Alignme ts A 206 Em AF060872 1 1829 1967 1 141 84 400002 vertebrate mrna vertebrate mma mrna_align
67. eature by pressing either z or Z, as described previously. Now when you bump an evidence column to look at matches that overlap the feature, you will find that bumping is much faster because only those matches that overlap the feature get bumped, and you also have fewer matches to look at. The quickest way to bump a column is: 1. Click on the column to select it. 2. Bump it by pressing b (if you press b again the column will be unbumped). If you have marked a feature then bumping is restricted to matches that overlap that feature; otherwise bumping is for the whole column. If you use the default bumping mode (i.e. you pressed b) then you will find all matches from the same piece of evidence are joined by coloured bars. The colours indicate the level of colinearity between the matches (see next screen shot): 1. Green: the matches at either end are perfectly contiguous, e.g. 100, 230 -> 231, 351. 2. Orange: the matches at either end are colinear but not perfect, e.g. 100, 230 -> 297, 351. Matches may also be this colour when there are extra bases in the alignment, e.g. around clone boundaries. 3. Red: the matches are not colinear, e.g. 100, 230 -> 141, 423. Alignment quality of the HSPs is depicted by the width of every alignment displayed, since the width is a measure of that HSP's score: the wider it is, the closer the score is to 100. The precise score is displayed in the Zmap details bar by clicking on the
68. [Screenshots: Show Feature List and Feature Search windows for the locus column. Callouts: List of all loci contained within the current Zmap session. Use drop down menus to refine feature search within Zmap.] 2. Zoom in by using the Zoom in/Zoom out buttons at the top, or by drawing a rectangle around the area of interest with the left mouse button. Use the z key on the keyboard to zoom to whatever feature is highlighted. Use the Z key to zoom to a whole transcript if you have an exon(s) highlighted, or to all HSPs if you have one HSP highlighted. HSPs are the blocks that you see in the homology columns such as ESTs and protei
69. ecify Search 641 Em S69815 1 5983 6030 110 157 100 000000 vertebrate_mrna vertebrate_mrna mrna_align 643 Em S69800 1 5983 6030 110 157 100 000000 vertebrate_mrna vertebrate_mrna mrna_align Align chr2 04_24270790 24608801 T h f 645 Em L37086 1 5983 6030 104 151 100 000000 vertebrate mima vertebrate mma mrna_align Block chr2 04 24270790 24608801 O searc or a 647 Em AF322070 1 5983 6030 79 126 100 000000 vertebrate_mrma vertebrate_mrna mrna_align f 649 Em AY159324 1 5983 6030 38 85 93 800003 vertebrate_mrna vertebrate_mrna mrna_align Column vertebrate_mma eatu re enter 653 Em CR626535 1 5983 6697 136 850 100 000000 vertebrate_mrna vertebrate_mrna mrna_align a 673 Em CR614690 1 5992 6030 1 39 00 000000 vertebrate_mrna vertebrate_mma mrna_align Feature Em U61167 1 our querv here 705 Em BC050998 1 6185 6273 205 293 00 000000 vertebrate_mma vertebrate_mrna mrna_align nd li k n 708 Em CR614690 1 6185 6273 40 128 1P0 000000 vertebrate_mrna vertebrate_mrna mrna_align Specify Filters a CHIC O 721 Em BC050998 1 6554 6621 294 361 140 000000 vertebrate_mrna vertebrate mma mrna_align Help so use the following format accession_number if you are not sure about the database and version The result lists all the exons and associated match information for query accession Em U61167 1 31 AAA X vertebrate_mrna Eile View Operate Herp Name Start 4 End Strand Query Star
70. ent tool: The alignment tool shows the portions of the two sequences at the current cross hair position. The sequences will move to remain centred on the cross hair coordinates when the cross hair is moved. The same shortcut keys for moving the cross hair can be used in this window. Aligning matches are highlighted and colour coded according to whether they are an exact or conserved match (cyan for exact, violet for conserved). In nucleotide->nucleotide mode, both strands of the horizontal sequence are shown in the alignment tool. In nucleotide->protein mode, all three reading frames of the horizontal sequence are shown, and the best match out of the three frames determines the highlight colour for the bases in the vertical sequence. If closed or hidden, the alignment tool can be shown with the Ctrl A shortcut or by selecting the Alignment tool option under the View menu. [Figure 14: Alignment tool (nucleotide->nucleotide mode)] X Dotter Alignment Tool 127065 EDNFNT FLIFLRA
71. ent transcript defined, and the Name tag should be set in the parent rather than the child exons. Note that Blixem will recognise exons that do not have a Parent tag if they have a Name tag instead, but they may not get grouped correctly with other exons from the same transcript. Typically one defines the parent transcript, the exons and the CDS regions. Blixem will then calculate the missing components (in this case the UTR regions and the introns). Blixem will recognise other combinations of inputs and will always calculate the missing components, as long as enough information is provided. Variations: SNPs, insertions and deletions are supported, as well as combined variations. One may use the generic sequence_alteration type for these, but it is good practice to use more specific types such as snp or deletion where applicable. Sample GFF file: A sample GFF file may look like this denotes that text has been omitted gff version 3 sequence region chr4 04 210623 364887 44144 154265 chr4 04 210623 364887 EST Human nucleotide match 79195 79311 95 000000 Target DA692754 1 287 403 percentID 90 6 sequence GATCTGGC chr4 04 210623 364887 EST Human nucleotide match 79195 79323 121 000000 Target AI095103 1 326 454 percentID 96 9 sequence TTTAAATT chr4 04 210623 364887 ensembl variation deletion 80798 80799 Name rs60725655 url http 3A 2F 2Fwww ensembl org 2FHomo sapiens 2FVariation 2FSumm ary 3Fv 3
72. er window contains the dot matrix plot. It also shows any exons for the sequences: along the bottom of the window for the horizontal sequence, or along the right hand side for the vertical sequence. [Figure 13: The main window, showing the dot plot of chr4 04_210623 364887 vs CD654571.1] Cross hair: The blue cross hair shows the coordinates at a particular position. It can be moved by clicking/dragging with the left mouse button or by using the following keyboard shortcuts. Left arrow, Right arrow: Move one dot left/right along the horizontal sequence. Shift Left, Shift Right: The same as Left/Right, but for protein sequences this moves by a single nucleotide coordinate rather than a whole dot (amino acid). Up arrow, Down arrow: Move one dot up/down along the vertical sequence. Shift Up, Shift Down: The same as Up/Down, but for protein sequences this moves by a single nucleotide coordinate rather than a whole dot (amino acid). Move diagonally up left or down right: useful for moving along an alignment. Move diagonally down left or up right: useful for moving along an alignment. Zoom in with a child Dotter: You can open a new child Dotter on a particular region from the current Dotter window. Middle click and drag the mouse to select the region to open the new Dotter on. The alignm
73. eyramp tool. Main menu: The main menu can be accessed via the menu bar at the top of the dot plot window or by right clicking in the dot plot window. [Screenshot: File menu with Save plot, Print (Ctrl P), Close (Ctrl W) and Quit (Ctrl Q)] File menu. Save plot: Save the current dot plot. It can be re-loaded by calling Dotter from the command line using the argument. Note that you will need to call Dotter with the same portion of each sequence that was originally passed to Dotter in order for the alignment tool to function correctly when you load the dot plot. Print: Print the current dot plot. Close: Close the current Dotter window. Also closes the associated alignment and greyramp tool, but does not close any other Dotter windows. Quit: Close the current Dotter window and all associated Dotters as well, including any child or parent Dotters. If you just wish to close the current Dotter then use the Close menu option instead. Edit menu. Settings (Ctrl S): Show the Settings dialog. View menu. [Screenshot: View menu with Greyramp tool (Ctrl G), Alignment tool (Ctrl A), Crosshair, Crosshair label, Crosshair fullscreen, Pixelmap, Gridlines, HSPs off, Draw HSPs (greyramp), Draw HSPs (red lines) and Draw HSPs (color = f(score))] Greyramp tool: Show the greyramp tool. Alignment tool: Show the alignment tool. Crosshair: Toggle visibility of the cross hair. Crosshair label: Toggle visibility of the cross hair label (only has an effect if the cross hair
74. ff value for the ID scale. Min ID: Defines the minimum cut off value for the ID scale. Appearance. Use print colours: Select this option to make Blixem use grey scale colours suitable for printing. Display colours: Change any of Blixem's custom display colours, such as the colour aligned bases are shown in or the colour stop codons are highlighted in. There are four colours for each item. Normal: this is the standard display colour. Normal (selected): this is the colour used when the item is selected, if applicable. Typically one would use a slightly darker or lighter shade of the Normal colour for this, so that the item does not look radically different when it is selected. Print: this is the standard colour used when the Use print colours option is enabled. Print (selected): this is the colour used when Use print colours is enabled and the item is selected. Key: In the detail view, the following colours and symbols have the following meanings. Alignment list header, yellow background: Reference sequence. Alignment list, cyan background: Identical residues. Alignment list, violet background: Conserved residues. Alignment list, grey background: Mismatch. Alignment list, with grey background: Deletion. Alignment list, yellow vertical line: Insertion. Alignment list, thin blue vertical line: Boundary of an exon. Nucleotide header (protein mode), sky blue background: The three nucleotides for the currently selecte
75. g features for gene

    Searching for
    Selecting single or multiple features and hiding/showing them
    Rapid variant construction
    Splitting windows in Zmap
    Zmap keyboard and mouse shortcuts
    Tips for a speedier Zmap
    Blixem
    Flame SECTIONS ol Me WIN O Wii
    Keyboard shortcuts
    Getting Started
    The Dotter Windows
76. he SeqTools package. The software can be downloaded from the Sanger Institute's website: http://www.sanger.ac.uk

    An aside about the name "Blixem": BLIXEM was originally an acronym for BLast matches In an X-windows Embedded Multiple alignment, although this is a bit of a misnomer now because Blixem can handle any kind of alignment, not just BLAST matches. We have dropped the acronym and the capital letters, so the correct name is just "Blixem".

    Getting Started

    Running Blixem

    As a minimum, Blixem takes the following required arguments:

        blixem --display-mode <N|P> <features_file>

    Where <features_file> is the path name of a GFF version 3 file containing the alignments and any other features. The display mode argument is the only mandatory option. It defines the display mode: N for nucleotide or P for protein. Run blixem without any arguments to see further usage information.

    Input files

    Blixem takes one or two files as input: a mandatory GFF version 3 file containing the features and, optionally, a separate file containing the reference sequence in FASTA format:

        blixem -m <N|P> <reference_sequence_file> <features_file>

    If the reference sequence file is not provided, the reference sequence must be supplied in FASTA format at the end of the GFF file, following a comment line that reads ##FASTA. Note that the reference sequence must always be a nucleot
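The single-file layout described above (features first, then a FASTA comment line, then the reference sequence) can be sketched in Python. This is only an illustration and not part of Blixem: the function name `embed_fasta` and the default wrap width of 50 characters are assumptions made for the example, and the `##FASTA` directive is taken from the GFF3 specification.

```python
def embed_fasta(gff_text: str, seq_name: str, sequence: str, width: int = 50) -> str:
    """Return GFF3 text with the reference sequence appended in a ##FASTA section.

    Hypothetical helper for illustration; `width` only affects readability.
    """
    lines = [gff_text.rstrip("\n"), "##FASTA", f">{seq_name}"]
    # Wrap the sequence data onto fixed-width lines.
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    combined = embed_fasta("##gff-version 3\n", "my_reference", "acgt" * 30)
    print(combined)
```

The resulting text could then be saved and passed to Blixem as the only input file, under the assumption that the feature lines in it refer to the same sequence name.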
77. hing to the right the positive strand. DNA matches (i.e. ESTs, mRNAs and RefSeq) and repeats are all displayed to the right of the center, although they may align to either strand. The thin bar to the right is the clone that the genomic sequence is made up from. Double click on this to access the DE editing window.

    2. Annotated transcripts: green is coding (CDS), red is non-coding (UTR and transcript variants) and purple shows the coding region of NMD variants. Grey transcripts (see dotted line) contain exons outside the sequence slice being viewed and should not be confused with Halfwise hits.
    3. Curated features, such as PolyA features, are seen as horizontal black lines.
    4. Phastcons44: conserved regions detected using multiple sequence alignments of 44 organisms.
    5. Imported annotation from CCDS (human and mouse only).
    6. Imported transcripts via DAS source. Here PASA_ESTs are shown.
    7. Predicted transcripts such as Genscan (pale blue), Augustus (gold) and Halfwise predictions of Pfam (grey).
    8. Imported annotation from Ensembl.
    9. gis_pet_ditags and chip_pet_ditags are indicators of transcript boundaries.
    10. Repeats (blue = Line, light green = Sine, gold = other); tandem repeats are red.
    11. CpG islands appear as yellow boxes.
    12. Protein matches are strand specific; SwissProt are light blue and Trembl pink.
    13. EST matches are displayed as purple blocks and are broken down into human ESTs, mouse ESTs and other ESTs from other orga
78. ide sequence, and match sequences must be the correct type for the mode, i.e. nucleotide sequences for nucleotide mode or protein sequences for protein mode.

    GFF file

    Blixem uses the GFF version 3 file format. In this section we give a very brief description of this file format; see http://www.sequenceontology.org/gff3.shtml for a full description.

    The GFF file should start with the following two comment lines. Additional comments can be included but may be ignored.

        ##gff-version 3
        ##sequence-region chr4-04_210623-364887 44144 154265

    Each subsequent line defines a feature. A feature line must have the following 8 tab-separated columns:

        reference sequence name, source, type, start, end, score, strand, phase

    An optional 9th column defines any tags, separated by semi-colons. Blixem supports the following GFF tags. Additional tags can be supplied but may be ignored.

        Target   required for alignments
        Gap      required for gapped alignments
        ID       required for parent features
        Name     required for transcripts and SNPs
        Parent   required for child features

    In addition, Blixem supports the following custom tags:

        percentID          only applicable to alignments; populates the ID column
        sequence           only applicable to alignments; supplies the sequence data
        variant_sequence   only applicable to variations; supplies the variation data
        url                only used by variations; GFF3 special characters must be escaped

    Transcripts

    Note that exons should have a Par
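The column layout described above (8 tab-separated columns plus an optional 9th column of semi-colon separated tags) can be illustrated with a small Python sketch. It is not Blixem's parser: the function name `parse_feature_line` and the dictionary keys are hypothetical, and the `key=value` tag syntax and percent-escaping are taken from the GFF3 specification rather than from this manual.

```python
from urllib.parse import unquote

# Names for the 8 mandatory columns (the first is the reference sequence name).
COLUMNS = ("seqid", "source", "type", "start", "end", "score", "strand", "phase")


def parse_feature_line(line: str) -> dict:
    """Split one GFF3 feature line into its columns and a dict of tags."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) not in (8, 9):
        raise ValueError(f"expected 8 or 9 tab-separated columns, got {len(fields)}")
    feature = dict(zip(COLUMNS, fields[:8]))
    feature["start"] = int(feature["start"])
    feature["end"] = int(feature["end"])
    tags = {}
    if len(fields) == 9 and fields[8] not in ("", "."):
        for item in fields[8].split(";"):
            if not item:
                continue
            key, _, value = item.partition("=")
            # GFF3 escapes special characters as %XX; undo that here.
            tags[unquote(key)] = unquote(value)
    feature["tags"] = tags
    return feature


if __name__ == "__main__":
    example = "chr4\tAugustus\texon\t119534\t119941\t.\t+\t.\tParent=transcript21"
    print(parse_feature_line(example))
```

A reader could use something like this to check that a file has the expected number of columns and the required tags before loading it.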
79. ight arrow: Move the cross-hair one dot left/right along the horizontal sequence.

    Shift-Left, Shift-Right: The same as Left/Right, but for protein sequences this moves by a single nucleotide coordinate rather than a whole dot (amino acid).
    Up arrow, Down arrow: Move the cross-hair one dot up/down along the vertical sequence.
    Shift-Up, Shift-Down: The same as Up/Down, but for protein sequences this moves by a single nucleotide coordinate rather than a whole dot (amino acid).
    Move diagonally up-left or down-right. Useful for moving along an alignment.
    Move diagonally down-left or up-right. Useful for moving along an alignment.
    Ctrl-W: Close the current window. If this is a dot plot window, it also closes the associated alignment and greyramp tool.
    Ctrl-Q: Quit Dotter. Also quits any associated Dotters, i.e. any child or parent Dotters.
    Ctrl-S: Open the Settings dialog.
    Ctrl-H: Open the Help dialog.
    Ctrl-A: Show the alignment tool.
    Ctrl-G: Show the greyramp tool.
    Ctrl-D: Show the main dot plot window.

    Annotation resources

    AspicDB (useful analysis of splice junctions): http://t.caspur.it/ASPicDB
    CCDS: http://www.ncbi.nlm.nih.gov/projects/CCDS/CcdsBrowse.cgi
    Ensembl genome browser: http://www.ensembl.org/index.html
    Entrez Gene (for nucleotide and protein sequence, cloning, gene information etc.): http://www.ncbi.nlm.nih.gov/sites/gquery
    HORDE database for olfactory receptors: http://genome.weizmann.ac.il/horde
    Swiss
80. ignment.

    Go to the start of the current alignment, or the end of the previous alignment.
    Go to the end of the current alignment, or the start of the next alignment.
    Go to the end of the last alignment.
    Scroll the detail-view range to the left by one page.
    Scroll the detail-view range to the left by one base.
    Scroll the detail-view range to the right by one base.
    Scroll the detail-view range to the right by one page.
    Scrolls to the start of the first alignment from that sequence, if any are found.
    Toggle strand: Toggle which strand is the active strand.

    Note: The alignment navigation operations act only on selected sequences if there is currently a selection; if no sequences are currently selected then the operation acts on all sequences.

    Feedback box

    The feedback box contains information about the currently selected sequence and/or coordinate, if either is selected. Click on a row in the detail view to select a sequence. Middle-click on a base in the detail view to select that coordinate. Text in the feedback box can be selected and copied.

    [Figure 11: Feedback box, with labels: reference sequence coordinate, match sequence coordinate, match sequence name, match sequence length]

    Moused-over item feedback area

    The area to the right of the toolbar contains information about the currently moused-over item, e.g. a match sequence in the alignment list or a variation in the variations track. For a match sequence this information inclu
81. iprot_raw

    [Screenshot: Otterlace clone list with pipeline status report and note history for the selected clones. Buttons shown: Set note, clear, write access, Show Range, Hunt selection, Refresh Locks, Refresh Ana Status, Open from chr coords, Run lace, Close. Callouts:]

    - Right click to show a report of pipeline status.
    - Double clicking on a clone shows the history of edits.
    - A clone can be selected using the left mouse button. Use the shift button to select multiple clones.
    - Write access can be turned off by clicking on the yellow button.
    - Selected clones become a salmon colour. Now click Run lace.
82. itting

    [Screenshot: Zmap toolbar with the buttons Stop, Reload, H-Split, V-Split, Unsplit, Unlock, Revcomp, 3 Frame, DNA, Columns, Zoom in, Zoom out and Back]

    Launching in a Zmap

    This function allows you to open two or more sequences alongside each other, such as a human region and the syntenic region in mouse, or two haplotypes, so that simultaneous investigation can be carried out. To do this you will need to open both sets of clones in the same Otterlace session. To open both Zmap windows in one window, as shown below, you need to select the Launch In A Zmap option in one clone set. These clones will open to the left of the already open Otterlace session.

    This screen shot shows human gene SF3B14 and the syntenic region in mouse. The gene copy and paste function referred to in the Otterlace section is of much use here, saving time when building gene objects. Human gene SF3B14 has already been manually annotated, and the similarity in the gene structures can be seen between the HAVANA gene object and the automated Ensembl object in mouse.

    [Screenshot labels: Mouse information bar; Human information bar]
83. ive reference sequence strand in Blixem controls the orientation of the display: coordinates are shown increasing from left to right for the forward strand and decreasing for the reverse strand. The active strand is always shown at the top, i.e. the top grid and top transcript view in the big picture and the top pane in the detail view. In protein mode, only the active strand is shown in the detail view. One must toggle the strand to view the other strand.

    Toggle which strand is active by:
    - pressing the Toggle button on the toolbar; or
    - pressing the t key.

    By default, Blixem assumes that the reference sequence passed to it is the forward strand, unless otherwise specified by the reverse-strand command-line argument.

    Big Picture

    The Big Picture section shows an overview of the reference sequence. The reference sequence coordinates are shown along the top. You can zoom in to view a shorter range by using the Zoom in button at the top left of the screen. Use Zoom out or Whole to zoom out; Whole zooms out to view the full length of the reference sequence.

    The big picture consists of two grids showing the alignments for each strand, and two sections between these grids showing the transcripts for each strand. The grids have a scale on the left-hand side showing the percent ID, and alignments are plotted against this scale. The scale and extents of the grids can both be edited; see the sec
84. ld exons. Note that Dotter will recognise exons that do not have a Parent tag if they have a Name tag instead, but they may not get grouped correctly with other exons from the same transcript.

    Typically, one defines the parent transcript, the exons and the CDS regions. Dotter will then calculate the missing components (in this case the UTR regions and the introns). Dotter will recognise other combinations of inputs and will always calculate the missing components as long as enough information is provided.

    Sample GFF file

    A sample GFF file may look like this ("..." denotes that text has been omitted):

        ##gff-version 3
        ##sequence-region chr4-04_210623-364887 44144 154265
        chr4-04_210623-364887 EST_Human nucleotide_match 79195 79311 95.000000 S Target=DA692754.1 287 403;percentID=90.6;sequence=GATCTGGC...
        chr4-04_210623-364887 EST_Human nucleotide_match 79195 79323 121.000000 s Target=AI095103.1 326 454;percentID=96.9;sequence=TTTAAATT...
        chr4-04_210623-364887 ensembl_variation deletion 80798 80799 Name=rs60725655;url=http%3A%2F%2Fwww.ensembl.org%2FHomo_sapiens%2FVariation%2FSummary%3Fv%3Drs60725655;variant_sequence=AA
        chr4-04_210623-364887 Augustus mRNA 119534 119941 ID=transcript21;Name=AUGUSTUS00000051712
        chr4-04_210623-364887 Augustus exon 119534 119941 Parent=transcript21
        chr4-04_210623-364887 Augustus CDS 119534 119941 Parent=transcript21

    The Dotter Windows

    The dot plot window

    The main Dott
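The idea of calculating the missing components from the exons and the CDS can be sketched as follows. This is a minimal illustration and not Dotter's actual implementation: the function names `introns` and `utrs` are hypothetical, coordinates are assumed to be 1-based and inclusive as in GFF3, and the CDS is assumed to be given as a single overall span.

```python
def introns(exons):
    """Return the gaps between consecutive exons as (start, end) tuples."""
    ordered = sorted(exons)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
        if next_start - prev_end > 1  # skip exons that touch or overlap
    ]


def utrs(exons, cds_start, cds_end):
    """Return the parts of each exon that lie outside the CDS span."""
    regions = []
    for start, end in sorted(exons):
        if start < cds_start:
            # Portion of the exon before the CDS begins.
            regions.append((start, min(end, cds_start - 1)))
        if end > cds_end:
            # Portion of the exon after the CDS ends.
            regions.append((max(start, cds_end + 1), end))
    return regions


if __name__ == "__main__":
    exon_list = [(100, 200), (300, 400), (500, 600)]
    print(introns(exon_list))
    print(utrs(exon_list, 150, 550))
```

Which UTR is 5' and which is 3' depends on the strand of the transcript, which this sketch deliberately leaves out.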
85. lming, cas@sanger.ac.uk, lw2@sanger.ac.uk, Wellcome Trust Sanger Institute

    Zmap

    Zmap is a software package that provides a visualisation tool for genomic features. The software is written in C, utilising the gnome toolkit GTK2 to draw features on a canvas. Zmap accepts input from multiple sources, in multiple formats, across multiple genomes, and is written so that adding further formats is as trivial as possible. Currently the list of formats includes GFF and DAS, which may reside in any one of: a file, an acedb instance, or an http server. Multiple genomes and their associated features can be displayed in a single view as aligned blocks, providing support for comparative annotation.

    Zmap does not include any utility for editing the features that it displays. It does, however, provide a powerful external interface with which to modify the features displayed on the canvas. Using this interface, Otterlace is used to annotate sequences present in the Otter database. This in turn updates the Vertebrate Genome Annotation (VEGA) website: http://vega.sanger.ac.uk/index.html

    Opening Zmap

    Zmap is opened via the Tools menu in the Otterlace menu bar (File, SubSeq, Clone, Tools). Click on Tools and select Launch Zmap.

    [Screenshot: Otterlace transcript chooser window (Read Only) listing transcript names]
86. ls

    [Screenshot: Tools menu of the transcript editor window (menu bar: File, Exon, Tools, Attributes). Callouts:]

    - Runs QC script.
    - Zooms to highlighted object in Zmap.
    - Dotter alignment of any selected homology in the paste buffer to the object. See the section on Dotter.
    - Searches the translation for Pfam domains.

    [Screenshot: Pfam search dialog and results window listing the significant Pfam-A match PF00254 (FKBP_C, Domain) at location 11-105, with buttons to open the Pfam page and to view the alignment in Belvu. Callout: This provides a link to the match in the Pfam database.]

    Belvu is a multiple sequence alignment viewer that uses an extensive set of modes to color residues, such as by conservation and by residue type (user-configurable). Other useful features include fetching of protein entries by double clicking and easy tracking of the position in the alignment. Belvu is also a phylogenetic tool that can be used to generate distance matrices between sequences under a selection of
87. map will zoom straight to it. Note: this may not work if you are searching for a feature outside of an area that is actively marked.

    [Screenshot: Zmap Feature Show window for Em:U61167.1 (Human SH3 domain-containing protein SH3P18 mRNA, complete cds; align type dna, query length 4053), with a table of eight matches giving sequence start, sequence end, match start, match end and score]

    Selecting single or multiple features and hiding/showing them

    1. If you left-click once on a feature in Zmap you will highlight all of its exons, the coordinates of which are now stored in the paste buffer and can be copied elsewhere, such as into the transcript editing window in Otterlace.
    2. You can select multiple features by holding the Shift key down and left clicking with mouse
88. n a marked region only. Bumping without marking is slow and removes the lines connecting Blast matches.

    4. When you have finished working within a marked region, unbump the evidence you have been working on (e.g. ESTs) and unmark that region before you go on to select the next region to mark and bump, or you could miss visualising the evidence in the new region.
    5. If you want to get rid of some white space, try the compress (c) function or, alternatively, toggle off some of the columns. Warning: this may hide features as well. If a column (e.g. ESTs) is bumped and you want to lose it temporarily, it is quicker to turn the column off (when you turn it on again it will still be bumped when it re-appears) than to unbump and then rebump it later.
    6. Jumping to genes/objects: if you expand the left-hand scroll navigator overview, you can jump directly to genes and objects by double clicking on them.

    Blixem User Manual

    Written by Gemma Barson, gb10@sanger.ac.uk, Wellcome Trust Sanger Institute, 17 January 2011

    Blixem

    This manual explains how to configure, run and use Blixem. Blixem is an interactive browser of pairwise matches displayed as multiple alignments. It is not strictly a multiple alignment tool; rather, it shows a one-to-many alignment. It is used to check the alignments of nucleotide and amino acid sequences against a reference sequence.

    Blixem is maintained by the Wellcome Trust Sanger Institute and is available as part of t
89. n hits. To mark the rectangle, click and hold the left mouse button at the top left of the area you want to outline, and then drag out the outline until it encloses the area you want to zoom to. When you release the button, Zmap zooms in to that rectangle.

    [Screenshot: Zmap main window with navigator and locus panels. Callouts:]

    - The red box is draggable. You can use the left mouse button to alter the bounds of the display in the main window, and the scrollbar to the right of the main window to scroll through the data quickly.
    - Use these buttons to Zoom in to a region or to Zoom out.
    - To save space when you are inspecting a region, you can drag the dotted lines back to their original position to remove the scroll bar and locus panel information. Note: it is not necessary to have any of these panels open while you work.

    The Focus Feature vs the Marked Feature

    If you click on a column background then that column becomes the focus column, and you can do various short-cut operations on it, such as pressing b to bump it. If you click on a feature then that feature becomes the focus feature, and similarly you can do various short c
90. navan human and vertebrate analysis and

    Otterlace • Zmap • Blixem • Dotter user manual

    Written by and contributions from: Charles Steward (cas@sanger.ac.uk), Gemma Barson, Laurens Wilming, Ed Griffiths, James Gilbert, Jennifer Harrow. 23 May 2011

    Contents

    Otterlace
    Transcript chooser section
        File menu: Manage the Otterlace editing session
        Subseq menu: Editing operations on the transcripts listed in the window
        Clone menu: Edit properties of each of the clones (one or many) opened in the Otterlace
        Tools menu: Useful things to run on the genomic sequence being annotated
    Transcript editor section
        File menu: Saving, closing, plus windows for showing translation and selecting
        Exon menu
        Tools menu: Informative operations to run on the transcript
        Attributes menu: Controlled annotation vocabulary for transcript and locus
    Navigating in Zmap and zooming options
    The Focus Feature vs the Marked Feature
    General Zmap display
    Functionality of the features at the top of the Zmap display
    Exportin
91. nd contains the sequence name. The next line contains the start of the sequence data. The sequence data can be on a single line or separated by newlines; it is usually separated by newlines every 50 characters to aid readability.

        >chr4-04_210623-364887
        tcttgtttctgtaggagaggccatctccatcagctataaccaaaaaaaaa
        acaaaaaactcctctttttgacaagtttgtaaagcctgtccatctgggtc
        tataataatcctccaggccctatgccactcctctttattcagccagttca

    GFF file

    Dotter uses the GFF version 3 file format. In this section we give a very brief description of this file format; see http://www.sequenceontology.org/gff3.shtml for a full description.

    The GFF file should start with the following two comment lines. Additional comments can be included but may be ignored.

        ##gff-version 3
        ##sequence-region chr4-04_210623-364887 44144 154265

    Each subsequent line defines a feature. A feature line must have the following 8 tab-separated columns:

        reference sequence name, source, type, start, end, score, strand, phase

    An optional 9th column defines any tags, separated by semi-colons. Dotter supports the following GFF tags. Additional tags can be supplied but may be ignored.

        Target   required for alignments
        Gap      required for gapped alignments
        ID       required for parent features
        Name     required for transcripts and SNPs
        Parent   required for child features

    Transcripts

    Note that exons should have a Parent transcript defined, and the Name tag should be set in the parent rather than the chi
92. nisms. 5' reads are on the left and 3' on the right.

    14. mRNA matches contain all species and are displayed as brown blocks.
    15. RefSeq matches are the orange blocks.
    16. Features and analysis available.

    [Screenshot: Columns window listing each column (for example genomic_canonical, curated, trembl, curated features, phastcons44, predicted transcripts, imported_transcripts, repeats, swissprot, predicted regulatory features, 3 Frame Translation, est_human, est_mouse, est_other, vertebrate_mrna, refseq, ens_cdna) with Show / Default / Hide options for the reverse and forward strands. Callout:]

    The Columns button brings up this window, allowing you to customize Zmap by turning features on and off.

    Select the features that you
93. nt construction

    Otterlace and Zmap can be used together to generate variant objects quickly. Existing transcript objects can be used as a template for a new object, while a Zmap HSP can be used to provide the coordinates for the new variant. The new object will take its transcript type from the parent.

    1. Select the object that will form the foundation of the new variant, either by highlighting the object in Otterlace or by clicking on the object in Zmap.

    [Screenshot: Otterlace transcript chooser alongside the Zmap window for human chr2-04 clones 217-218]
94. ntal scroll wheel

    Scroll up/down the currently moused-over alignment list.
    Scroll to the start/end of the previous/next match (limited to currently selected sequences if any are selected; includes all sequences otherwise).
    Scroll to the start/end of the display.
    Scroll to the start/end of the currently selected alignments (or to the first/last alignment if none are selected).
    Scroll the detail-view range one nucleotide to the left/right.
    Scroll the detail-view range one page to the left/right.
    Scroll to a specific coordinate position.
    Zoom in/out of the detail view.
    Zoom in/out of the big picture.
    Zoom the big picture out to view the full length of the reference sequence.

    Selections

    Selecting sequences

    - You can select a sequence by clicking on its row in the alignment list. Selected sequences are highlighted in cyan in the big picture.
    - You can select a sequence by clicking on it in the big picture.
    - The name of the sequence you selected is displayed in the feedback box on the toolbar. If there are multiple alignments for the same sequence, all of them will be selected.
    - You can select multiple sequences by holding down the Ctrl or Shift keys while selecting rows.
    - You can deselect a single sequence by Ctrl-clicking on its row.
    - You can deselect all sequences by right-clicking and selecting Deselect all, or with the Shift-Ctrl-A keyboard shortcut.
    - You can move the selection up/down a row using the up/down arrow keys.
95. on and enter the name of the sequence in the box below. You may use the following wildcards to search for sequences: for any number of characters; for a single character.
    - Click OK.

    Creating a group from sequence name(s)

    - Right-click and select Create Group, or use the Shift-Ctrl-G shortcut key (or Ctrl-G if no groups currently exist).
    - Select the From name(s) radio button.
    - Enter the sequence name(s) in the text box.
    - You may use the following wildcards in a sequence name: for any number of characters; for a single character.
    - You may search for multiple sequence names by separating them with the following delimiters: newline, comma or semi-colon.
    - You may paste sequence names directly from another compatible program, e.g. ZMap: click on the feature in ZMap and then middle-click in the text box on the Groups dialog. Grouping in Blixem works on the sequence name alone, so the feature coords output by ZMap will be ignored.
    - Click OK.

    Creating a temporary match-set group from the current selection

    You can quickly create a group from a current selection (e.g. selected features in ZMap, or just the current selection in Blixem) using the Toggle match set option.

    - To create a match-set group, select the required items and then select Toggle match set from the right-click menu in Blixem, or hit the g shortcut key.
    - To clear the match-set group, choose the Toggle match set option again, or hit the g sh
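The name-matching behaviour described under "Creating a group from sequence name(s)" (several names separated by newline, comma or semi-colon, each possibly containing wildcards) can be illustrated with a short Python sketch. It is not Blixem's code: the function names are hypothetical, and because the wildcard characters are not legible in this copy of the manual, the sketch assumes the shell-style `*` (any number of characters) and `?` (a single character) that Python's `fnmatch` module implements. Whether Blixem matches names case-sensitively is also an assumption here.

```python
import re
from fnmatch import fnmatchcase


def split_names(text: str):
    """Split user input on newline, comma or semi-colon, dropping blanks."""
    return [name.strip() for name in re.split(r"[\n,;]", text) if name.strip()]


def select_sequences(patterns_text: str, sequence_names):
    """Return the sequence names matching any of the entered patterns.

    Matching is case-sensitive; note that fnmatch also treats [...] as a
    character set, which may differ from what Blixem does.
    """
    patterns = split_names(patterns_text)
    return [
        name
        for name in sequence_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    loaded = ["BX503083.1", "DA689205.1", "AI095103.1", "BM720874.1"]
    print(select_sequences("BX5*, DA68920?.1;AI095103.1", loaded))
```

Only the sequence name takes part in the match, which mirrors the remark above that any feature coordinates pasted from ZMap are ignored.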
96. ortcut key again.
    - While it is enabled (i.e. toggled on), the match-set group can be edited like any other group, via the Edit Groups dialog. Any settings you change (e.g. highlight colour) will be saved even if the match-set group is toggled off and then on again.
    - If you delete the match-set group using the Edit Groups dialog, all of its settings will be lost; you will get the default settings again the next time you enable the match-set group. To avoid this, disable it by toggling it off, using the Toggle match set menu option or the g shortcut key, rather than by deleting it in the Groups dialog.

    Editing groups

    To edit a group, right-click and select Edit Groups, or use the Ctrl-G shortcut key.

    [Figure 11: Groups dialog (edit groups), showing the columns Group name, Hide, Highlight and Order, a Delete button for each group, and the buttons Delete all groups, Cancel, Apply and OK]

    You can change the following properties for a group. Click on Apply or OK to apply the changes.

    Name: You can specify a more meaningful name to help you identify the group.
    Hide: Tick this box to hide the alignments in the alignment lists.
    Highlight: Tick this box to highlight the alignments.
    Colour: The colour the group will be highlighted in, if Highlight is enabled. The default colour for all groups is orange, so you may wish to change this if you want different groups to be highlighted in different colours.
    Order: When sorting by
97. ot; Yellow bar

    [Figure 5: Alignment colour key, with labels: exact match, mismatch, conserved match, deletion, insertion]

    Alignment lists

    There are separate lists of alignments for each strand and reading frame of the reference sequence. Each list has a yellow header bar containing the reference sequence. At the left, the yellow bar shows the reference sequence name and which strand/frame it is, e.g. +1 means forward strand, reading frame 1; -2 means reverse strand, reading frame 2.

    [Figure 6: Alignment list details. Header labels: reference sequence name; strand and frame; first coord in displayed range; last coord in displayed range; sequence data. Row labels: match sequence name; alignment score and ID; start position of alignment on match sequence; end position of alignment on match sequence]

    Nucleotide mode

    There are two sections to the detail view in nucleotide mode, one for each strand. The active strand is shown at the top and defines the coordinate direction: increasing if the forward strand is active, decreasing if the reverse is active.

    [Figure 7: Alignment lists (nucleotide mode), showing the active-strand alignments above the other-strand alignments, with the columns Name, Score, %ID, Start, Sequence and End]

    Protein mode

    Ther
98. p scratchy internal sanger ac uk wiki index php Otterlace filter_descriptions

    Transcript chooser section

    The menu bars (File, SubSeq, Clone, Tools) provide different options for annotation, as explained in the next sections.

    File menu: Manage the Otterlace editing session

    [Screenshot: Otterlace transcript chooser window (Read Only) with the File menu open, showing Save, Resync (Ctrl-R) and Close. Callouts:]

    - When the write access button on the previous page is turned off, editing can still be carried out in a Read Only database, but such changes will not be saved back to the Otter database and are thus not permanent.
    - Use the Save option to save your work regularly to the master database. This will also fetch new otter IDs for new objects.
    - The Close option will quit the current Otterlace session.
    - Keystroke shortcuts are provided.
99. redictions Kellis MIT h Ch i t T MIT das_CRG_U12 U12 genes Guigo CRG das_ChromSig romatin Signatures indicating TSS Ernst MI IN gt i W das Esigan Gane predictions by the program Evigan Pereira UPenn 8 A ellow ro ress ba r das_CONGO_Exons prot cod comparative exon predictions Kellis MIT DAS_Exon das_CRG_U12 U12 genes Guigo CRG das_Exonify evolutionarily conserved protein coding exons UCSC mE das_Evigan Gene predictions by the program Evigan Pereira UPenn MF dea fart esk A ich area E shows the status o ata DAS_Exon ee meta alas PawelGeu Pran A ore das_Exonify evolutionarily conserved protein coding exons UCSC M das_S epel_NovelLoci Novel loci predictions Siepel UCSC d i Y ll b E das_GenTrack Entries marked with important flags in AnnoTrack das_transMap_MRna TransMap cross species alignments Diekhans UCSC Oa l n g e OW Oxes das_peptide_atlas Peptides from httpe was peptideatlas org das_transMap_RefSeq TransMap cross species alignments Diekhans UCSC F das_Siepel_NovelLoci Novel loci predictions Siepel UCSC das_transMap_SplicedEst TransMap cross species alignments Diekhans UCSC tu rn reen wh e n co U m n S das_transMap_MRna TransMap cross species alignments Diekhans UCSC das_transMap_UcscGenes TransMap cross species alignments Diekhans UCSC das_transMap_RefSeq TransMap cross species alignments Diekhans UCSC F das_UCSC_NcIntrons Introns with non canonical splicing Diekhans UCSC das_tr
100. Screenshot: dataset window listing GRCh37 sequence sets for chromosomes 6 to 22 and X (e.g. chr7-04, chrX-11), with Open, Search, Error Log and Close buttons. The Search feature allows you to search the Dataset for any feature, such as an Otter ID or gene name. Screenshot: the Find loci, stable_ids or clone dialog, which searches for locus names or synonyms; international or EMBL clone names; Otter Gene, Transcript, Translation or Exon (OTT) stable_IDs; EnsEMBL Gene, Transcript, Translation or Exon (ENS) stable_IDs; CCDS names; or Pipeline hit names. Its buttons are Clear, Search and Close.
101. sted into other applications using Ctrl-V. The default clipboard can be pasted into Blixem using Ctrl-V. If the clipboard contains valid sequence names, those sequences will be selected and the display will jump to the start of the selection. Note that text from the feedback box and some text labels (e.g. the reference sequence start/end coords) can be copied to the selection buffer by selecting the required text with the mouse, or copied to the default clipboard by selecting it and then hitting Ctrl-C. Text can be pasted from the default clipboard into text entry boxes on dialogs, such as the Groups or Find dialog, by using Ctrl-V. Sorting alignments. Alignments can be sorted by selecting the column you wish to sort by from the drop-down box on the toolbar. Figure 8: Sort by list (Name, Organism, Gene name, Tissue type, Strain, Group, Score, Identity, Position). The default sort order may be ascending or descending, depending on what makes most sense for the selected column, e.g. sorting by position is ascending by default, but sorting by score or ID is descending. To get the inverse of the default sort order, select the Invert sort order option in the Settings dialog. Alignments can also be sorted by group. Alignments that are part of a group will then be listed first, before any that are not in a group, and ordered according to the group's order number. See the Groups section for more details.
102. t Query End Query Strand Score Feature Set 6 Em U61167 1 191013 194286 746 4018 99 699997 vertebrate_mrna 1 Em U61167 1 198209 198400 554 745 9000000 vertebrate_mrna 4 Em U61167 1 198877 198973 457 553 99 000000 Vestebrate_mrna 5 Em U61167 1 200709 200830 335 456 100 000000 vertebrate_m 7 Em U61167 1 200917 200962 289 334 100 000000 vertebrate_mrma 8 Em U61167 1 204425 204591 122 288 100 000000 vertebrate_mrna 2 Em U61167 1 206447 206511 57 121 100 000000 vertebrate_mrna 3 Em U61167 1 209968 210003 21 56 100 000000 vertebrate_mrna AAA X ZMap 1 chr2 04_24270790 24608801 Elle Edit View Raise ticket k Revcomp 3 Frame DNA Columns Zoom In Zoom Out Back Em U61167 1 3273 i 191013 194286 lt 746 4018 3274 lt GAPS NOT SHOWN gt 99 699997 hr2 04_24270790 24608801 185 k a o 190 k E 19 k D D booooo 00 00 e ch Ls TT 7 N 200 k EEE RES o A further window will appear containing information about the feature Source vertebrate_mma vertebrate_mrna vertebrate_mma ertebrate_mma vertebraten a vertebrate_mrna vertebrate_mma vertebrate_mrna AL hon Style mrna_align mrna_align mrna_align mrna_align mrna_align mman mrna_align mrna_align Alignment 32 AAA If you now left double click on the match you want to inspect Z
103. t area is marked. 6. If no feature or area is selected, then the visible screen area minus a small top/bottom margin is marked. Mark an area. 1. Select an area by holding down the left mouse button and dragging out a box to focus on that area. 2. Press m to mark the area. Manual cropping of the marked borders. You can manually change the borders of the marked area by putting your cursor over this area and using the cropping tool: click and hold with the left mouse button and drag to make the area bigger or smaller. Unmark a feature. Press m or M again, i.e. the mark key toggles marking on and off. General Zmap display features. Different features are displayed in distinct columns, as follows (screenshot of the ZMap window for AC008073.4 on chr2-04). Note: you may see more or fewer features and columns depending on how your preferences are set up. For descriptions of all column types, such as DAS sources, visit this URL: http://scratchy.internal.sanger.ac.uk/wiki/index.php/Otterlace_filter_descriptions 1. The thick yellow line represents the genomic sequence; everything to the left represents the negative strand and everyt
104. t to CDNA Em D38037 1 y L37086 1 Zmap Note this 569815 1 44948705 does not save to the 1347804 BF969580 m r A BI546013 aster database 8154013 cae gt 4C008073 1 001 ar MGVETETISPGOGRTFPKKGQTCVVHYTGMLOQNGKKFDSSRDRNKPFKERTGKQEVIKGF EEGAAQMSLGORAKLTCTPDVAYGATGHPGYIPPNATLIFDVELLNLER 9 O IX Kozak Checker 5 acogcuaugg 3 olose 3 uggcgauacc 5 Y1 translation E Paste Delete Trim 3 Highlight hydrophobic Close Click box will turn yellow to highlight hydrophobic residues Select homology in Zmap or in Blixem Trims peptide and paste in here sequence to first using either middle stop codon mouse or paste Choose the start Irim F Highlight hydrophobic BEN button coordinate before using the trim function x AC008073 1 001 translation v gt AC008073 0 1 ial MGVETETISPGDGRTFPKKGQTCVVHY TGMLONGKKFDSSRORNKPFKERIGKQEVIKGF EEGAAQMSLGORAKLTCTPDVAYGATGHPGVIPPNATLIFDVELLNLE 7 X AC008073 1 001 translation AC008073 1 File MGVEIETISPGC Edit TCVVHYTOMLONGKKFDSSRDRNKPFKERIGKOEVIKGF EEGAAQMSLGOF Search ea ATLIFDVELLNLER View Fi Irim El Hi grrrgrre mge Find Next Find Previous Click and hold with right mouse to bring up search function to find an amino acid sequence not over MET 12 Exon menu Tools for editing the exons Tools Attributes Select all exons Click over exon of
105. ted 2009 02 11 jv2 Contains the 5 end of the ITSN2 gene for intersectin 2 a ribosomal protein L3ba Augustus completed 2007 11 28 20 54 08 0 RPL364 pseudogene a high mobility group nucleosomal binding domain 2 CpG completed 2007 11 28 16 40 19 0 HMGN2 pseudogene and a CpG island Eponine completed 2007 11 28 20 55 46 0 219 ACO78975 6 RP11 107P22 completed 2008 02 13 mpk nothing 4 Esteganoma HUN AS EZLN A CO J Est2genome_human_raw completed 2009 09 16 20 24 50 24 Aug 09 100 220 AC093798 3 RP11 306N14 Qmpleted 2008 02 13 mpk Contains the 5 end of the NCOA1 gene for nuclear receptor coactivator 1 and two Est2genome_mouse completed 2009 09 23 18 43 59 24 Aug 09 100 CpG islands Est2genome_mouse_raw completed 2009 09 16 20 24 50 24 Aug 09 100 y y compl 09 2 47 Aug 09 221 ACO13459 9 RP11 169L20 compl amp d 2008 02 18 mpk Contains the 3 end of the NCOA1 gene for nuclear receptor coactivator 1 the 5 eng of the CENPO gene for CENPO centromere protein O a novel protein and one Gp cici coup lated gt 200711620 208 82 28 8 island Hal fwise completed 2007 12 01 13 33 41 12 422 222 AC012073 9 RP11 443B20 completed 2088 02 13 mpk Contains the 3 end of the CENPO gene for CENPO centromere protein the RepeatMasker completed 2007 11 28 19 15 35 0 ADCY3 gene for adenylate cyclase 3 the 3 end of the RBJ gene fa 5 Uniprot_5w completed 2009 08 25 11 16 17 15 6 domain containing and one CpG island NMIRRAZTR _ Un
106. tion Copy and Paste: makes a copy of the selected transcript(s) and assigns unique transcript and locus IDs. Note: this can be used to copy objects from one data set to another if both data sets have been opened in the same Otterlace session. New: makes a copy of the selected transcript and assigns unique transcript and locus IDs, as well as naming the transcript and locus after the clone that the 3' end of the object is from. Each new locus will be incremented by 1. Change the locus name to a known symbol if necessary. Variant: makes a copy of the selected transcript and assigns a new variant number. Screenshot: transcript editor window for AC008073.1-001 (type Known_CDS, locus symbol FKBP1B, full name FK506 binding protein 1B, 12.6 kDa); see the Transcript editor section for details on this window and the options available. Clone menu. Edit properties of each of the clones (one or many) opened in the otterlace editing session. The Clone menu allows you to add the DE description line to a clone. The menu lists all the clones that make up the genomic slice you are looking at, in the order that they appear in Sele
107. tion in the Settings dialog. The active strand alignments and transcripts are shown at the top and the other strand at the bottom. The direction of the coordinates is determined by the active strand. The active strand can be toggled using the t shortcut key or the Toggle strand button on the toolbar. Figure 3: The Big Picture section (with Zoom in, Zoom out and Whole buttons). Bumping the transcript view. By default, exons and introns for the same strand are drawn overlapping each other. They can be expanded, or bumped, by pressing the b shortcut key or by enabling the relevant option in the View dialog. Figure 4: Expanded transcript view. Detail View. The Detail View shows the actual sequence data for the match sequences. Match sequences are lined up underneath the relevant section of reference sequence, and individual bases are highlighted in different colours to indicate how well they match. Match colours. Reference sequence I Cyan Grey Violet D
108. Screenshots: the DNA export window for AC008073.1-001 (952 bp) on chr2-04, the FASTA filename save dialog, and the Choose DNA Extent dialog, which sets the start and end coordinates for DNA export (selected feature AC008073.1-001, start 1795, end 15762, flanking bases 0). Bumping features. This section describes how to select a feature, mark it, zoom in to it and examine evidence that overlaps that feature. The default setting for Zmap is to show HSPs drawn on top of each other. This saves space on the canvas, making it easier to see the general features of the region of interest. The bump option allows you to see the HSPs as multiple alignments. 1. Click on the feature you are interested in, perhaps a transcript. 2. Mark it by pressing m. 3. Zoom in to the f
109. turated_swissprot O Show O Default Hide want to be visib le on Zmap curated features O Show e Default Hide saturated trembl O Show Default Hide saturated_est human O Show Default e Hide and cl ick on Apply saturated est mouse O Show Default e Hide saturated_est other O Show Default Hide Osh Ouro a saturated vertebrate mma Show Default e Hide curated ow efault ide DNA O Show Default e Hide A O A 7 Show All Default All Hide All Default All Hide All Revert Apply Revert sets the features to the default setting Functionality of the features at the top of the Zmap display D X ZMap Preferences for ZMap 1 fie This window sets the range for Cut Ctrl X Copy Ctri C Blixem Blixem The default setting is Paste Ctri V Redraw General Pfetch Socket Server Advanced 200 000 bp However you can Preferences Scope set it to a more appropriate range Set Developer status A New in a ateari ii for the clones you are annotating Save screenshot Ctri D Print screen shot Ctri P The range must be reset when Close Ctr W you start a new Zmap window Quit Ctri Q MX Cancel oP ok Statistics A General Help i CCESS Keyboard amp Mouse Session Details E to hel p Alignment Display 7 Release Notes See ZMap tickets Contact menu BOETA ZMap ticket Acedb ticket Helpdesk Anacode ticket File Edit View Raise ticket Reload
110. uences highlighted in Zmap, missing accession numbers or a fasta file of one or many sequences. Data can be entered in all three of the fields in the OTF window at the same time, to search on accession(s), from a file and a seqtext area. Results are dynamically loaded onto Zmap to the right or left of the clone lines (see later Zmap section), depending on orientation. Screenshot: the On The Fly (OTF) Alignment window (query sequences brought to you by exonerate), with Accessions, Fasta file and Fasta sequence fields, parameters (Clear existing alignments of same type; Number of transcript alignments to report, 0 for all; Only search within marked region; Maximum intron length, shown as 200000) and Launch and Close buttons. Only search within marked region: limit the search to the marked region in Zmap. Number of transcript alignments to report: set this box to 1 to search for the best match, or to 0 to search for all matches. Maximum intron length: use this to increase the window of search for genes with large introns. Browse: browse a local directory for sequence files (Fasta files: seq, pep, dna, fasta, fa). Rename locus: renames a locus to a new locus name. Transcript editor section. Screenshot: transcript editor window for AC008073.1-001 with File, Exon, Tools and Attributes menus. Coding sequen
111. ures, e.g. exons, introns, HSPs etc., or any rubberbanded area if there was one. Z: zoom to whole transcript or all HSPs of a selected feature. Zmap Mouse Usage. Single mouse button click: left button, highlight a feature or column; middle button, horizontal ruler with sequence position displayed, centring on the mouse position on button release (release the mouse outside the Zmap window to prevent re-centering); right button, show feature or column menu for options such as pfetch, show feature DNA, show peptide, export peptide. Single click plus drag: draw a rectangle around an object for zoom. Double mouse button click: left button, display details of selected feature (double click on an object to get the edit window); other buttons, same as single click. Highlighting a subpart of a feature, e.g. a single exon or alignment match, or multiple highlight: other buttons, same as single click. Tips for a speedier Zmap. 1. Specifically zoom and mark within Zmap early on after launching. Either select a gene object and press z to zoom, or select a rectangle to zoom in by dragging the left mouse button around it. Reverse complement now if necessary, then press m to mark the region. 2. The quickest way to zoom out of Zmap again is to right mouse click on the zoom out buttons at the top of Zmap and choose one of the options; this is much quicker than doing individual zoom outs with the left mouse button. Likewise for zooming in again, or use the keyboard equivalents. 3. Bump withi
112. Screenshot: transcript chooser list of subsequence names. This example shows that gene object AC008073.1-004 has no supporting evidence added to it. The gene object will turn black once the checking software finds no inconsistencies. The complete list of checks carried out is as follows. 1. No internal stop codons exist in coding object. 2. Transcript has start_not_found set if the translation doesn't begin with Methionine. 3. Transcript has end_not_found set if the translation doesn't end with a stop. 4. The correct selenocysteine remark and coordinates are automatically added if seleno appears in an annotation remark for the transcript. 5. Locus has a description (also known as full name). 6. Transcripts within each locus are all on the same strand. 7. Transcripts do not have a 5' UTR with start_not_found of 1, 2 or 3 (UTR start_not_found has been added as a menu option). 8. There is evidence attached to each transcript. 9. Nucleotide evidence is only used once in each locus. 10. The same locus name root is not used for transcript names in more than one locus. 11. All the transcript names in the same locus have the same locus name root. 12. Transcript names start with the locus name if the locus name ends with dot-number (which means the clonename in such circumstances). 13. Transcript names end dash-digit-digit-digit. Zmap User Manual. Written by Charles Steward and Laurens Wl
113. ut operations on it, such as zooming in to it. Note: when you select a feature, its column automatically becomes the focus column. While the focus facility is useful, the focus changes every time you click on a new feature. Sometimes you want to select a working feature or area more permanently. To do this you can mark the feature or area, and it will stay marked until you unmark it. Marking an area within Zmap to work on is essential, allowing you to work much faster. The marked area is left clear, while the unmarked area above and below is marked with a blue overlay (see screen shot below). Double left clicking on any gene object opens the coordinate editing interface. Screenshot: ZMap window with AC008073.1-001 (1795 to 15762, length 13968) selected, and the transcript editing window for AC008073.1-001 showing exon coordinates, transcript fields (type Known_CDS, a transcript visible remark and a transcript annotation remark) and locus fields (symbol FKBP1B, full name FK506 binding protein 1B, 12.6 kDa).
114. Screenshot: Figure 13, The View dialog, with options to show the big picture, the grid and exons for the active strand and for the other strand, the alignment lists for the active strand and other strand, and Bump exons; in the example the active strand grid is now hidden. Alternatively, use the following keyboard shortcuts to toggle visibility of a component. 1: hide top pane in detail view. 2: hide second pane in detail view. 3: hide third pane in detail view (protein mode only). Ctrl-1: hide top grid in big picture (active strand). Ctrl-2: hide bottom grid in big picture (other strand). Shift-Ctrl-1: hide top exon view (active strand). Shift-Ctrl-2: hide bottom exon view (other strand). Operation. Navigation. Scrolling: middle-click-drag in big picture; middle-click-drag in detail view; horizontal scrollbar; vertical scrollbars; horizontal mouse wheel; vertical mouse wheel; Ctrl-left, Ctrl-right; Home, End; Ctrl-Home, Ctrl-End; comma, full stop; Ctrl-comma, Ctrl-full stop; Go to button or p key. Zooming: keys and toolbar buttons, including Whole. Effects, in the same order: select a region to jump to; select and centre on a base; scroll the detail view range; scroll up/down an alignment list; scroll the detail view range if your mouse has a horizo
115. Screenshot (continued): locus fields with an annotation completed option, a locus visible remark and a locus annotation remark. The marked area is designated by the blue shading at the top and bottom of the screen shot. The boundaries can be manually changed (see next page on manual cropping). This screen shot shows a column that has been selected and then marked. Mark a feature. 1. Select a feature to make it the focus feature. 2. Press m to mark the feature; the feature will be highlighted with a blue overlay. Feature marking behaves differently according to the type of feature you highlighted prior to marking and according to whether you press m or M to do the marking. 1. If you press m, the mark is made around all features you have highlighted, e.g. a whole transcript, a single exon, several HSPs. 2. If you press M to do the marking around transcripts, the whole transcript becomes the marked feature and the marked area extends from the start to the end of the transcript. 3. If you press M to do the marking around alignments, all the HSPs for that alignment become the marked feature and the marked area extends from the start to the end of all the HSPs. 4. If you press M to do the marking around all other features, the feature becomes the marked feature and the marked area extends from the start to the end of the feature. 5. If no feature is selected but an area was selected using the left button rubberband, then tha
116. you are about to open a large sequence set. Would you like to restrict the number of clones visible in the ana_notes window? If so, please enter the number of the first and last clones you would like to see. (Dialog fields: First Clone, Last Clone; buttons: Open All, Open Range.) An option to email anacode with the errors is provided to facilitate a diagnosis (error log dialog: Description of error, Send this error log to). Always include a useful description in the email. 4. The SequenceSet window appears, also known as Ana_notes. It shows remarks, which can be added using the entry field at the bottom, to help track annotation progress. This window also allows you to either open the whole contig range in one scrollable window or open a selected range of your choice. Open a slice dialog: enter chromosome coordinates for the start and end of the slice to open the clones contained (example shown: slice start 10001, end 1010001; buttons: Run lace, Cancel). These options allow you to open specific regions and are designed to make opening clones quicker. The SequenceSet window columns show clone number, accession number, internal project
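
A few of the annotation checks listed in excerpt 112 (no internal stop codons, start_not_found and end_not_found flags, transcript names ending dash-digit-digit-digit, same strand and same name root within a locus) can be illustrated with a short Python sketch. This is not Otterlace's checking code: the Transcript record, the function names and the use of "*" to mark a stop in the translation are all assumptions made for illustration, and the example name AC008073.1-001 assumes the dot and dash layout implied by checks 12 and 13.

```python
import re
from dataclasses import dataclass

# Hypothetical record for illustration only; not Otterlace's data model.
@dataclass
class Transcript:
    name: str                     # e.g. "AC008073.1-001" (assumed layout)
    strand: int                   # +1 forward, -1 reverse
    translation: str = ""         # peptide; "*" marks a stop (assumption)
    start_not_found: bool = False
    end_not_found: bool = False

# Check 13: transcript names end dash-digit-digit-digit.
NAME_SUFFIX = re.compile(r"-\d{3}$")

def check_transcript(t):
    """Return a list of problems for one transcript (checks 1, 2, 3, 13)."""
    problems = []
    if not NAME_SUFFIX.search(t.name):
        problems.append("name does not end dash-digit-digit-digit")
    if t.translation:
        # Check 1: a stop anywhere except the last position is internal.
        if "*" in t.translation[:-1]:
            problems.append("internal stop codon")
        # Check 2: no leading Methionine requires start_not_found.
        if not t.translation.startswith("M") and not t.start_not_found:
            problems.append("start_not_found should be set")
        # Check 3: no terminal stop requires end_not_found.
        if not t.translation.endswith("*") and not t.end_not_found:
            problems.append("end_not_found should be set")
    return problems

def check_locus(transcripts):
    """Return a list of problems for one locus (checks 6 and 11)."""
    problems = []
    if len({t.strand for t in transcripts}) > 1:
        problems.append("transcripts are on different strands")
    roots = {NAME_SUFFIX.sub("", t.name) for t in transcripts}
    if len(roots) > 1:
        problems.append("transcript names have different locus name roots")
    return problems

if __name__ == "__main__":
    good = Transcript("AC008073.1-001", 1, "MGVE*")
    bad = Transcript("AC008073.1-1", 1, "GV*E")
    print(check_transcript(good))
    print(check_transcript(bad))
    print(check_locus([good, Transcript("AC008073.1-002", -1)]))
```

The checks that depend on attached evidence, remarks or locus descriptions (checks 4, 5, 8 and 9) are left out because they need data that this simplified record does not carry.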
