Copy Number Analysis in Partek® Genomics Suite™ 6.6
Contents
1. [Figure 12: screenshot of the segmentation spreadsheet (chromosome 1 regions for sample 151)]
Figure 12. Viewing the segmentation.txt spreadsheet. Each row of the spreadsheet represents one genomic region per sample. You may use Tools > Merge Adjacent Regions to combine similar regions, like the first two unchanged regions shown above.

Visualizing regions of interest

The regions identified by the segmentation algorithms may be visualized in the chromosome viewer, either at the level of a single region or of an entire chromosome.
• To visualize a region of interest, right-click on a row header and then choose Browse to Location.
• Alternatively, to get an overview of the results, select Visualization > Plot chromosome view and the Track Wizard will appea
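To illustrate what merging adjacent regions (excerpt 1) involves, the following Python sketch combines consecutive regions of one sample that share a chromosome and a copy number status and whose coordinates touch. The Region fields, the max_gap adjacency rule, and the coordinates are assumptions made for illustration; this is not Partek's implementation of Tools > Merge Adjacent Regions.

```python
from dataclasses import dataclass


@dataclass
class Region:
    chrom: str
    start: int
    stop: int
    sample: str
    status: str  # "Amplification", "Deletion", or "Unchanged"


def merge_adjacent(regions, max_gap=0):
    """Merge consecutive regions that share sample, chromosome and status.

    max_gap is a hypothetical parameter: the largest distance (in base
    pairs) between two regions that still counts as adjacent.
    """
    merged = []
    for r in sorted(regions, key=lambda x: (x.sample, x.chrom, x.start)):
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev.sample == r.sample
                and prev.chrom == r.chrom
                and prev.status == r.status
                and r.start - prev.stop <= max_gap):
            # Extend the previous region instead of starting a new one.
            prev.stop = max(prev.stop, r.stop)
        else:
            merged.append(Region(r.chrom, r.start, r.stop, r.sample, r.status))
    return merged


# Hypothetical coordinates: two touching unchanged regions, then a deletion.
example = [
    Region("chr1", 100, 200, "151", "Unchanged"),
    Region("chr1", 200, 350, "151", "Unchanged"),
    Region("chr1", 350, 400, "151", "Deletion"),
]
result = merge_adjacent(example)
```

The two unchanged regions collapse into one, while the deletion stays separate because its status differs.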
2. [Figure C2: diagram relating amplification, deletion, and LOH/AI; panel labels include simple amplification, simple deletion, and LOH/AI]
Figure C2. Integration of the copy number workflow with loss of heterozygosity (LOH) or allelic imbalance (AI, under the allele-specific copy number (AsCN) workflow) enables the identification of copy-neutral events.

A further consideration is that the correct interpretation of currently available LOH algorithms has proven complex and difficult, because cancer cells frequently deviate from the diploid state and tumor specimens often contain a significant proportion of normal cells. For instance, it has been shown that as the proportion of tumor cells in a sample decreases and approaches 50% or less, the capacity to detect LOH diminishes (Yamamoto et al., Am J Hum Genet 2007). Moreover, genotyping algorithms fail to call a heterozygous SNP correctly when only one of the two alleles is amplified (e.g., 3xA and 1xB); a false-positive LOH call can be the consequence.

AsCN analysis, on the other hand, is a method that enables reliable detection of allelic imbalance in tumor samples, even in the presence of large proportions of normal cells. Unlike LOH, it does not require a large set of normal reference samples. For a heterozygous SNP (only those are informative), a balance is expected between the two alleles (1xA and 1xB, or a 1:1 ratio). The AsCN algorithm provides an estimated number of copies of each allele and therefore ena
3. [Figure 27: screenshot of the gene list spreadsheet (chromosome 22 transcripts, including HMGXB4, TOM1, and MIR3909)]
Figure 27. Viewing the gene list spreadsheet, a result of overlapping genes with regions of copy number changes. Each row of the table represents one RefSeq transcript.

As previously mentioned, this type of regions-to-genes overlap is gene-centric and enables genomic integration. For instance, GO enrichment can be invoked directly on the gene list spreadsheet to detect the functional groups affected by copy number changes (to learn more, please see the GO Enrichment Tutorial, available at Help > On-line Tutorials > User Guides tab). The gene list spreadsheet can also be used to find possible fusion genes. To do that, you might use the interactive filter on column 8, Percent overlap with gene, to get the list of genes with overlap of up t
4. [Figure 4: PCA scatterplot ("PCA Mapping")]
Figure 4. Principal component analysis scatterplot showing the total allele intensities of normal (red dots) and cancer (blue dots) samples. Each dot represents a single sample. Note that the tumor samples (blue) are more dispersed than the normal samples (red).

Estimating copy number from marker intensities

The first step in the analysis of Affymetrix intensity data is to estimate the number of copies of each marker allele. The Create copy number from allele intensities only function can be accessed through the Import section of the Copy Number workflow and will estimate the copy number of each marker by comparing it to a reference. Depending on the design of the experiment, the user first has to choose between paired and unpaired samples (Figure 5). A paired design assumes that each sample has its own reference sample; PGS will generate a copy number spreadsheet. This is the typical and recommended situation for tumor-normal sample pairs for finding somatic cancer mutations. Unpaired samples, on the other hand, use a common reference (a single sample or a group of samples), and PGS will create an allele ratio spreadsheet, intended primarily for visualization purposes, in addition to the copy number spreadsheet. Unpaired analysis is typically used for studying inherited effects.

[Figure 5: screenshot of the Copy Number Creation dialog] Paired/unpaired copy number creation. Choose the method. Paired samples: e
5. correction (Affymetrix CEL files)

To normalize for GC content, use the custom import settings during the import procedure (Customize, Figure A1), and under the Algorithm tab of the Advanced Import dialog check the Adjust for GC content box (Figure A2).

[Figure A1: screenshot of the Import Affymetrix CEL Files dialog]
Figure A1. Accessing the custom import settings during the import of Affymetrix CEL files.

[Figure A2: screenshot of the Advanced Import dialog, Algorithm tab, with the normalization options Adjust for Fragment Length, Adjust for GC Content, Adjust for probe sequence, and Quantile Normalization]
Figure A2. Adjusting for GC content during the import of CEL files.

Appendix B: Filtering the FFPE samples by fragment length

Many samples used in medical research have been formalin-fixed and paraffin-embedded (FFPE). Unfortunately, storage and recovery of nucleic acids in this form can increase the rate of breaks in nucleic acids. The high density of Affymetrix arrays allows longer restriction fragments, which are therefore more likely to contain breaks in FFPE sample
6. not shown. Unfortunately, there are no universal answers on how to set the limits for the fragment filter. In general, you should monitor the distribution of the fragments on agarose gels prior to hybridization on the array and use the size information from the gels to optimize the limits used for analysis, as the loss of longer fragments may be different for each dataset. If no physical size data are available, removing fragments above 500 or 700 base pairs is a recommended starting point, as the nucleic acids in the sample above this size may have been damaged.

Appendix C: Integration of copy number with LOH and AsCN

Although copy number analysis is a powerful tool for studying genomic aberrations, it lacks the capacity to detect genotypic changes that are copy-neutral. If we consider loss of heterozygosity (LOH), this event may be caused by a hemizygous deletion, in which one allele is lost and the other allele remains present (Figure C1, middle panel). That type of LOH can be recognized not only by SNP genotyping but by copy number analysis as well. However, an allele may be lost initially, but the subsequent amplification of the remaining copy creates a copy-neutral LOH (Figure C1, right panel), also known as uniparental disomy. Different mechanisms have been described that create copy-neutral LOH in meiosis and mitosis; the common feature is that copy-neutral LOH can only be detected w
7. the mRNAs.

[Screenshot residue: annotation dialog (Manage available annotations, Configure result, Result file)]

The resulting spreadsheet (gene list) is shown in Figure 27. Each row corresponds to a transcript (RefSeq in this tutorial), and the columns are as follows:
1-3. Genomic coordinates of the transcript
4. Coding strand
5. Transcript ID
6. Gene symbol
7. Minimum distance of the region to the transcription start site (positive values indicate downstream, while negative values indicate upstream)
8. Percent overlap with gene: length of the gene-to-region overlap divided by the length of the gene
9. Percent overlap with region: length of the overlap divided by the length of the region
10 onward. Correspond to the columns of the segment analysis spreadsheet, starting with column 1

[Figure 27 screenshot begins: gene list spreadsheet rows for chromosome 22 (TTC28-AS1, TTC28, ISX, HMGXB4)]
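The two overlap columns described in excerpt 7 (columns 8 and 9) can be sketched in a few lines of Python. The function name and the treatment of a length as stop minus start are assumptions. The gene coordinates and region start below are read from the TTC28-AS1 row of the Figure 27 residue, while the region stop is a hypothetical value because it is cut off in the source. Note that the values visible in the figure appear to be fractions between 0 and 1 rather than percentages.

```python
def overlap_fractions(gene_start, gene_stop, region_start, region_stop):
    """Return (overlap / gene length, overlap / region length).

    Lengths are taken as stop - start (an assumption about the
    coordinate convention).
    """
    overlap = max(0, min(gene_stop, region_stop) - max(gene_start, region_start))
    gene_len = gene_stop - gene_start
    region_len = region_stop - region_start
    return overlap / gene_len, overlap / region_len


# TTC28-AS1 row: transcript 28315364-28398668, region starting at 28386005.
# The region stop (28941980) is hypothetical; the source truncates it.
frac_gene, frac_region = overlap_fractions(28315364, 28398668, 28386005, 28941980)
print(round(frac_gene, 5))
```

With these inputs the gene fraction comes out at about 0.152, which agrees with the value shown for that row in the spreadsheet residue.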
8. [Figure 9: screenshot of the chromosome view, with a genomic coordinate axis in MBps]
Figure 9. Chromosome view of the copy number spreadsheet. The default tracks, from the top, are: Genomic features from a selected annotation source (Ensembl Transcripts shown here), Heatmap track, Profile track, and Cytoband track. Chromosome 1 is shown by default. The Genomic label (genomic coordinates) is shown below the Cytoband track.

Detection of regions with copy number variation

Starting with copy number estimates for each marker, either taken directly from the vendor's input file or calculated previously, the goal is to derive a list of regions where adjacent markers share the same copy number. PGS offers two algorithms for region detection: Genomic segmentation and Hidden Markov Model (HMM). Basically, both algorithms examine trends across multiple adjacent markers. The genomic segmentation algorithm identifies breakpoints in the data, i.e., changes in copy number between two neighboring regions. The HMM algorithm looks for discrete changes of whole-number copy number states (e.g., 0, 1, 2, and so on, with no upper limit) and will find regions with those numbers of copies. Therefore, the HMM model performs better with homogeneous samples in which the copy numbers can be anticipated, such as clinical syndromes with underlying chromosome or germ-line gene aberrations. Genomic segmentation is preferred for heterogeneous samples with unpredictable copy numbers, such as cancer, because tissue biopsies often c
9. [Figure 18: histogram plot with a chromosome menu (CHR 2 to CHR 21 visible)]
Figure 18. Histogram plot providing an overview of amplified (red) and deleted (blue) regions shared across the samples, on either the entire genome (as shown) or on selected chromosomes. Both the menu and the display may be used to control which chromosomes are displayed: left-click in the menu to toggle on/off; right-click in the menu or graph to show only that chromosome.

For a sample-centric visualization, invoke the copy number classification plot.
• Close the histogram plot by selecting the red X in the upper right corner of the plot.
• Select Plot detected regions again. Choose the Plot copy number classification option and select OK.

The copy number classification view provides an overview of all the samples and the copy number regions on each chromosome (Figure 19). Each sample is drawn as a separate column next to the chromosome. Amplified regions are depicted in red, deleted regions in blue, and regions with no copy number change in white. Sample names are given across the top. As shown in the insert in Figure 19, displaying fewer chromosomes will show the detail for each sample more clearly.

[Figure 19: screenshot of the copy number classification view; legend: Amplification, Deletion, Unchanged]
10. Copy Number Analysis in Partek Genomics Suite 6.6

Introduction

At a simple level, copy number analysis can be thought of as a gene dosage question: where are the regions of the genome that have altered abundances (increased or decreased)? Furthermore, what genetic elements are in those regions, and how might a change in the abundance of the genes in those chromosomal regions affect phenotypes?

To answer those questions, the copy number analysis workflow in Partek Genomics Suite (PGS) uses a variety of commercially available SNP genotyping arrays with closely spaced genomic markers (Affymetrix and Illumina), as well as comparative genomic hybridization (CGH) arrays (Agilent, NimbleGen, or custom spotted arrays), to detect amplified or deleted regions within and shared across samples. The workflow (Figure 1) begins with either intensity data or copy number data, depending on the array platform; allows for organization of numerous samples into projects; provides multiple options for data analysis; and enables easy integration with other workflows.

This tutorial will illustrate how to:
• Import the data in Partek Genomics Suite
• Perform exploratory data analysis
• Estimate copy number for each marker
• Detect and analyze regions of copy number variation
• Find regions shared across samples
• Create a list of regions that meet certain criteria
• Find genes that overlap the regions of interest
• Learn about other annotations for regions
• V
11. [Figure 7: screenshot of the paired copy number spreadsheet; visible rows include tumor samples IC_580T, IC_594T_FF, and IC_957_FF]
Figure 7. Viewing the paired copy number spreadsheet. Each row represents one cancer sample; the marker columns show the copy number ratio of tumor to normal for each pair of samples.

Creating a reference baseline of normal samples

All copy number experiments give relative copy numbers (sample compared to a reference). Basically, PGS needs to be given a normal reference, or baseline, for the copy number calculation. It could be an existing reference file provided by Partek (Option 1 in Figure 8; see the Affymetrix and Illumina baseline files available at Help > On-line Tutorials > Baseline Files), a reference file previously created from a set of normal samples (Option 2), or a subset of samples within the current project (Option 3).

There are two different ways you may create a reference baseline (.cnmodel file): create a reference from a collection of normal samples not included in this experiment, or use samples from the current experiment as the reference baseline. The resulting dialog is shown in Figure 8. To generate a pooled reference file from a collection of normal samples, first import only the normal samples and then use Tools > Create Copy Number Bas
12. [Figure 11, continued: Region Report settings: Diploid copy number from 1.7 to 2.3; Result file: segmentation.txt; options Report unchanged regions and Report SNP and CNV counts]
Figure 11. Setting the key parameters of the genomic segmentation procedure.

The resulting spreadsheet (segmentation) shows one row per genomic region per sample (Figure 12). The columns provide the following information:
1-4. Genomic location of the region
5. Sample ID
6. Description of the copy number change (amplification, deletion, etc.)
7. The length of the region in base pairs
8. The number of markers in the region
9. Marker density in the region (region length divided by the number of markers)
10. Geometric mean of the copy number of all the markers in the region
11. Minimum p-value of the one-sided t-tests of the difference of the copy number in column 10 vs. the diploid range

[Figure 12 screenshot begins: segmentation spreadsheet rows for chromosome 1, sample 151]
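As a rough illustration of the per-region columns listed in excerpt 12, the Python sketch below computes the region length, marker count, marker density, and geometric mean copy number, and then labels the region by comparing the geometric mean with the diploid range of 1.7 to 2.3 shown in Figure 11. This is a deliberate simplification: PGS calls the status with one-sided t-tests, which are not reproduced here, and the function and key names are hypothetical.

```python
import math


def summarize_region(copy_numbers, start, stop, diploid_range=(1.7, 2.3)):
    """Summarize one region from its per-marker copy numbers (all > 0).

    The status rule (geometric mean vs. diploid range) is a simplified
    stand-in for the t-test based call, not the PGS method itself.
    """
    n = len(copy_numbers)
    geo_mean = math.exp(sum(math.log(c) for c in copy_numbers) / n)
    low, high = diploid_range
    if geo_mean > high:
        status = "Amplification"
    elif geo_mean < low:
        status = "Deletion"
    else:
        status = "Unchanged"
    length = stop - start
    return {
        "length": length,        # column 7: region length in base pairs
        "markers": n,            # column 8: number of markers
        "density": length / n,   # column 9: length divided by marker count
        "geo_mean": geo_mean,    # column 10: geometric mean copy number
        "status": status,        # column 6, simplified
    }


# Hypothetical marker values for a region spanning 8,000 base pairs.
summary = summarize_region([2.1, 1.9, 2.0, 2.2], 1000, 9000)
print(summary["status"], summary["density"])
```

The geometric mean is used because it matches column 10; an arithmetic mean would weight a few strongly amplified markers more heavily.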
13. ach sample has its own reference. Create a copy number spreadsheet. Unpaired samples: use a common reference. Create copy number and allele ratio spreadsheets.
Figure 5. Viewing the Copy Number Creation dialog.

In the current example, to estimate the copy number from allele intensities, PGS will compare a cancer sample (tumor) to the peripheral blood sample (normal) from the same individual, using the SubjectID column to identify the sample pairs. In other words, the allele intensities of each normal sample will serve as the baseline for the calculation; the algorithm assumes that normal samples have 2 copies of DNA at each locus. If the intensity of a probe is 2 times brighter than the baseline, it has twice as much DNA at the location on the genome which the probe targets. Conversely, half the normal intensity will point to a single copy of DNA at a given locus.
• In the Import section of the Copy Number workflow, select Create copy number from allele intensities only.
• To proceed with the analysis, select Paired samples in the Copy Number Creation dialog (Figure 5) and select OK.
• The Create Copy Number from Pairs dialog will appear (Figure 6). The baseline samples are defined by Column 3. Tumor, and the Baseline category should be N (normal).
• Match the sample pairs uses Column 4. SubjectID as the parameter that pairs the samples from each patient. Set these options acc
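The intensity reasoning in excerpt 13 (twice the baseline intensity means twice the DNA, with the normal sample assumed to carry 2 copies) amounts to a simple ratio. The Python sketch below shows that arithmetic for linear-scale intensities, plus a variant for log2-scale intensities. Both function names and the scale assumptions are illustrative; the actual PGS calculation includes normalization steps that are not described in this excerpt.

```python
def paired_copy_number(tumor, normal, baseline_copies=2.0):
    """Per-marker copy number from linear-scale intensities.

    Assumes the matched normal sample has baseline_copies at every marker.
    """
    return [baseline_copies * t / n for t, n in zip(tumor, normal)]


def paired_copy_number_log2(tumor_log2, normal_log2, baseline_copies=2.0):
    """Same idea when intensities are stored on a log2 scale (assumption)."""
    return [baseline_copies * 2.0 ** (t - n) for t, n in zip(tumor_log2, normal_log2)]


# Hypothetical intensities for three markers of one tumor-normal pair.
tumor = [200.0, 50.0, 100.0]
normal = [100.0, 100.0, 100.0]
print(paired_copy_number(tumor, normal))  # [4.0, 1.0, 2.0]
```

The first marker is twice as bright as the baseline (four copies), the second is half as bright (one copy), and the third matches the baseline (two copies).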
14. axis of each marker (grey dots) of the sample highlighted in the Heatmap track. In other words, grey dots are a visualization of copy numbers in the cells of the copy number spreadsheet. While the grey dots represent the raw copy numbers, the bold heavy dots represent the smoothed copy number (average of 30 adjacent markers by default). More about the chromosome view and different ways to customize it can be found in the Chromosome Viewer User Guide (Help > On-line Tutorials > User Guides).
• To examine a different sample, click on another row of the Heatmap track. Toggle through different samples and notice the different patterns in copy numbers across samples.
• To zoom in, use the magnifying glass icon at the top of the window.
• To view another chromosome, use the down arrow next to the chromosome location box at the top of the window.
• When you have finished visualizing the raw copy numbers, select the red X at the top right of the window.

[Figure 9: screenshot of the chromosome view for chr1 (0 to 249250621) with tracks Ensembl Transcripts (hg19), Heatmap, Profile (Copy Number), Cytoband (hg19), and Genomic Label]
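The smoothed profile mentioned in excerpt 14 (an average of 30 adjacent markers by default) can be approximated with a moving average. The Python sketch below is one plausible reading, assuming a roughly centered window that shrinks at the chromosome ends; how PGS positions the window and treats the edges is not stated in the source.

```python
def smooth(values, window=30):
    """Moving average over `window` adjacent markers.

    The window is roughly centered on each marker and truncated at the
    ends of the list (both choices are assumptions).
    """
    n = len(values)
    smoothed = []
    for i in range(n):
        lo_raw = i - window // 2
        hi = min(n, lo_raw + window)
        lo = max(0, lo_raw)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Small window so the effect is easy to follow by hand.
print(smooth([1, 2, 3, 4, 5], window=3))  # [1.5, 2.0, 3.0, 4.0, 4.5]
```

A larger window suppresses marker-level noise (the grey dots) at the cost of blurring short aberrations.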
15. bles the detection of allelic imbalance even in cases when alleles are amplified or deleted (e.g., 3xA and 1xB). Moreover, LOH can be considered a special case of AI (e.g., 1xA, B allele deleted; Figure C3). Therefore, due to its improved robustness, AsCN may be a preferred approach in tumor-focused applications.

[Figure C3: diagram with panels labeled normal, LOH, and allelic imbalance]
Figure C3. Loss of heterozygosity (LOH) as a special case of allelic imbalance. The situation on the left represents a normal heterozygous SNP with one copy of each allele.

References

Diskin SJ, Li M, Hou C, Yang S, Glessner J, Hakonarson H, Bucan M, Maris JM, Wang K. Adjustment of genomic waves in signal intensities from whole-genome SNP genotyping platforms. Nucleic Acids Res. 2008 Nov;36(19):e126.

Ramakrishna M, Williams LH, Boyle SE, Bearfoot JL, Sridhar A, Speed TP, Gorringe KL, Campbell IG. Identification of candidate growth promoting genes in ovarian cancer through integrated copy number and expression analysis. PLoS One. 2010 Apr 8;5(4):e9983.

Yamamoto G, Nannya Y, Kato M, Sanada M, Levine RL, Kawamata N, Hangaishi A, Kurokawa M, Chiba S, Gilliland DG, Koeffler HP, Ogawa S. Highly sensitive method for genomewide detection of allelic composition in nonpaired, primary tumor specimens by use of Affymetrix single-nucleotide-polymorphism genotyping microarrays. Am J Hum Genet. 2007 Jul;81(1):114-26.
16. e average intensities of regions, and if the corresponding cut-off p-value is below the P-value threshold. The genomic size of a region is defined by the number of genomic markers included in the region (Minimum genomic markers), while the magnitude of the significant difference between two regions is controlled by Signal to noise (simplified, it could be thought of as the difference in copy numbers between the regions). If the t-test is significant, it can be concluded that the region differs significantly from its nearest neighbors with respect to copy number. However, a second step is needed to identify the exact nature of the difference, i.e., whether the difference is an amplification or a deletion. In this stage, two one-sided t-tests are used to compare the mean copy number in the region with the expected normal (diploid) copy number.

For a detailed explanation of the genomic segmentation procedure, please consult the Understanding Genomic Segmentation white paper (Help > On-line Tutorials > White Papers). A reader interested in in-depth optimization of the segmentation procedure is encouraged to refer to the tutorial Optimizing Copy Number Segmentation in Partek (Help > On-line Tutorials > Copy Number).

[Figure 11: Genomic Copy Number Segmentation dialog; Segmentation Parameters: Minimum genomic markers 50, P-value threshold 0.001, Signal to noise 0.3]
17. eline with the intensity spreadsheet selected. Once you have created a .cnmodel file this way, use the top radio button (Option 2 in Figure 8).

The other way to generate a baseline for copy number calculations is to use some or all of the samples in the existing experiment (Option 3 in Figure 8). Using this option, copy number estimates will also be generated for each normal sample compared to the pool of normal samples. For more information about using unpaired samples in copy number calculations, please consult Understanding Copy Number Creation (Help > On-line Tutorials > White Papers).

[Figure 8: screenshot of the Unpaired Copy Number dialog for spreadsheet 1 (IC_Intensities_SNP6), showing the reference choices Use an existing reference file (.cnmodel) (Option 2), Use Partek distributed baseline (Option 1), and Use all samples or Use only reference samples from the spreadsheet (Option 3, with Reference column 3. Tumor), followed by the output file fields]
18. entation spreadsheet with a database of genomic abnormalities characteristic of particular diseases or syndromes, and quick identification of possible matches. The user can choose between the Partek distributed database (60 syndromes) or a custom one (requires the following information, organized by column: the name of the abnormality, chromosome number, start location, and stop location). The input to Test for known abnormalities should be a list of aberrations for each sample (do not include unchanged regions in the input, or every syndrome will be shown as positive). The power of this feature lies in the information contained in the database, which can be easily created or customized.

End of Tutorial

This is the end of the Copy Number in Partek Genomics Suite v6.6 tutorial. The tutorial described all the steps needed for copy number analysis: data import, detection of genomic regions with copy number changes, analysis of regions with copy number changes, and comparison of those regions with genomic databases. If you need additional assistance with this data set, please call our technical support staff at +1-314-878-2329 x350 or email support@partek.com.

Last revision: October 2014

Copyright 2014 by Partek Incorporated. All Rights Reserved. Reproduction of this material without expressed written consent from Partek Incorporated is strictly prohibited.

Appendix A: GC wave
19. [Figure 15: screenshot of the Analyze Segments dialog; Reporting options include Add columns with sample ID lists; Result File: segment analysis]
Figure 15. Viewing the Analyze Segments dialog.

The summary segment analysis spreadsheet (Figure 16) shows one region per row. The columns provide the following information:
1-4. Genomic locations of the region
5. Total number of samples
6, 7. Number of samples with amplifications and the average amplified copy number, respectively
8, 9. Number of samples with deletions and the average deleted copy number, respectively
10, 11. Number of samples with no change in copy number and the average copy number in those samples, respectively
12 onward. Number of markers in the region; length of the region in base pairs; total number of samples with copy number aberrations; and two columns per sample: the average copy number in each sample, as well as the copy number change status of the same sample (e.g., amplified, deleted, unchanged), depending on the copy number and the threshold for unchanged defined in the segmentation dialog

A [symbol not legible] indicates that a region with a particular characteristic does not exist or cannot be computed (e.g., if a region is not amplified in any of the samples, the average amplified copy number will be shown as [symbol not legible]).

This list may be filtered to contain only the region
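The per-region summary columns described in excerpt 19 can be pictured as a small aggregation over the per-sample calls for one region. The Python sketch below counts samples by status and averages their copy numbers. The input layout, the key names, and the use of None for an average that cannot be computed are all assumptions made for illustration.

```python
def summarize_across_samples(calls):
    """Summarize one region across samples.

    calls maps a sample ID to (status, mean copy number), a hypothetical
    layout. Averages are None when no sample has the given status.
    """
    summary = {"total_samples": len(calls)}
    for status, key in (("Amplification", "amplified"),
                        ("Deletion", "deleted"),
                        ("Unchanged", "unchanged")):
        values = [cn for s, cn in calls.values() if s == status]
        summary[f"n_{key}"] = len(values)
        summary[f"mean_{key}"] = sum(values) / len(values) if values else None
    summary["n_aberrant"] = summary["n_amplified"] + summary["n_deleted"]
    return summary


# Hypothetical calls for one region in three samples.
calls = {
    "151": ("Amplification", 3.0),
    "201": ("Amplification", 4.0),
    "22": ("Unchanged", 2.0),
}
summary = summarize_across_samples(calls)
print(summary["n_amplified"], summary["mean_amplified"], summary["mean_deleted"])
```

Here no sample carries a deletion, so the deleted average has no value, which mirrors the "cannot be computed" case in the column description.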
20. [Figure 1 flowchart, continued; recoverable box labels: Plot detected regions; Filters for shared segments; Integrate with LOH & AsCN (Affymetrix, Illumina; optional); Overlap annotations (dbSNP & known genes); Integrate with gene expression (optional)]
Figure 1. Overview of the copy number analysis in PGS. The analysis of Affymetrix CEL files starts with the import of allele intensities, and the copy number calculation is performed by PGS. For other vendors (including Affymetrix CHP files, Agilent, NimbleGen, and Illumina), PGS imports copy number or log ratio data as provided by the individual vendor software. Integration with loss of heterozygosity (LOH) or allele-specific copy number (AsCN) workflows requires SNP genotype data. Integration with the gene expression workflow requires gene expression data.

Importing the data

When starting a new experiment, the first step involves importing samples.
• Open the Copy Number workflow within PGS by selecting it from the Workflows drop-down list in the upper right corner and select Import Samples (Figure 2).

[Figure 2: screenshot of the Specify Data Type dialog] Import Allele Intensity: this option requires Create copy number from intensities (Import from Affymetrix CEL files). Import Copy Number or Log Ratio: these options do not need Create c
21. [Figure 8, continued: output fields of the Unpaired Copy Number dialog; Allele ratio result: IC_Intensities_SNP6_alleleratio.fmt]
Figure 8. Viewing the Unpaired Copy Number dialog. Three kinds of baseline references are possible: using a baseline file distributed by Partek (Option 1), using a previously created .cnmodel reference file (Option 2), or using some or all of the samples in the current experiment as the reference (Option 3).

Visualization of copy number of allele intensity data

The result of copy number estimation per marker can be visualized by invoking the chromosome view from the Copy Number workflow. At this stage of the workflow, you can visualize only the raw copy number estimates of each marker (Figure 9).
• Select Plot chromosome view from either QA/QC or Visualization (Figure 9). If prompted to Select an annotation source, select Ensembl Transcripts.
• The Track Wizard may prompt for which spreadsheet should be used for visualization; either the intensity or the copy number spreadsheet may be chosen. At this step, select the Copy Number (2) spreadsheet and Create.

Each of the ten cancer samples is represented by a single row in the Heatmap track (below Ensembl Transcripts), with the color pattern ranging from blue (no copies) to red (four or more copies of a marker). The name of the selected sample can be found to the left of the selected row in the Heatmap track. The Profile track is located below the Heatmap track and shows the copy number value y
22. hen copy number is studied in combination with SNP genotype. Please note that, irrespective of the preservation of the total number of copies, the biological effect is still important, as recessive mutations would no longer be masked by their dominant normal counterparts.

[Figure C1: diagram with three panels]
Figure C1. Possible mechanisms of LOH and their impact on copy number. Left panel: heterozygous SNP (numbers indicate the number of copies of each allele; normal or most common allele: green, mutant: red). Middle panel: hemizygous deletion leading to the loss of the normal allele. Right panel: duplication of the mutant allele. The situation in the middle panel changes the gene copy number, while the situation in the right panel is copy-number-neutral.

A solution for the detection of copy-neutral events is the integration of the copy number workflow with LOH, or the use of allelic imbalance (AI) under the Allele-Specific Copy Number (AsCN) workflow (advantages of AI over LOH are discussed below). With this approach, the copy number data are supplemented with SNP genotyping data (currently available with Affymetrix and Illumina) to label the genomic regions in the following fashion: amplification without LOH/AI, amplification with LOH/AI, deletion without LOH/AI, deletion with LOH/AI, and copy-neutral LOH/AI (Figure C2). The last category, copy-neutral LOH/AI, is the added value of the workflow integration.
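The five region labels listed in excerpt 22 follow from combining two pieces of information per region: the copy number status and whether LOH or AI was detected. The Python sketch below spells out that mapping. The function name, the status strings, and the extra "unchanged" label for regions with neither a copy number change nor LOH/AI are assumptions; only the five LOH/AI categories come from the text.

```python
def label_region(cn_status, has_loh_or_ai):
    """Combine a copy number call with an LOH/AI flag into one label."""
    suffix = "with LOH/AI" if has_loh_or_ai else "without LOH/AI"
    if cn_status == "Amplification":
        return f"amplification {suffix}"
    if cn_status == "Deletion":
        return f"deletion {suffix}"
    if cn_status == "Unchanged":
        # Copy number alone cannot see this case; the genotype data can.
        return "copy-neutral LOH/AI" if has_loh_or_ai else "unchanged"
    raise ValueError(f"unknown copy number status: {cn_status!r}")


# Hypothetical calls for three regions.
for status, flag in [("Amplification", False), ("Deletion", True), ("Unchanged", True)]:
    print(label_region(status, flag))
```

The third call shows the added value of the integration: a region with a normal total copy number is still flagged because of its genotype.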
23. ing the experiment, such as file names, Subject ID, Gender, etc. The rest of the columns are individual markers from the microarray, with the log-normalized intensities associated with each marker (marker labels are the column headers). Importing IC_Intensities_SNP6.fmt is the equivalent of importing the 20 sample files and adding sample attributes.

[Figure 3: screenshot of the IC_Intensities_SNP6 spreadsheet; visible columns include Gender, Scan Date, and marker intensities; rows alternate normal (N) and tumor (T) samples per subject]
Figure 3. Viewing the spreadsheet with the intensity data for the copy number tutorial. Each row represents one sample.

Setting the SampleID

If you plan to do any kind of genomic integration with other studies (e.g., gene expression), then the results of both studies must be tagged with the same sample identifiers. The sample identifier must be specified before analysis results are created for copy number experiments.
• Under the Import section of the Copy Number workflow, select Choose sample ID column. Select 4. SubjectID for Sample ID Column.

Exploratory analysis

As an introduction to the copy number analysis, you might want to explore the intensity data f
24. isualize the data at any of the steps. Please be aware of an inherent limitation of the analysis of copy number regions: the inability to detect copy-neutral events, i.e., copy-neutral loss of heterozygosity (LOH) or copy-neutral allelic imbalance. One way to deal with these issues is to supplement the copy number analysis with SNP genotyping data, currently available with Affymetrix or Illumina arrays (Figure 1). For more information on the integration of copy number with LOH or Allele-Specific Copy Number (AsCN), please consult Appendix C.
Note: The following tutorial was prepared using PGS version 6.6. As PGS is a rapidly evolving software application, future versions of PGS may show different screenshots from those displayed in this tutorial. To ensure that you are using the most current version of Partek Genomics Suite, please use Help > Check for Updates from within PGS. There may be slight differences between the screenshots shown in this tutorial and what you observe, due to different operating systems and different versions of PGS.
Figure 1 (workflow diagram; only fragments of its labels survive): Illumina, Agilent, NimbleGen, Other; Import allele intensity data; Import copy number or log ratio data; Create copy numbers; Copy number estimates; Map breakpoints > Detect amplifications & deletions; Detect changes; Detect changes in multiple sampl
25. ive filter, please click on the interactive filter icon in the main window with the Amplified spreadsheet selected.
• Set the Column drop-down list to 8. Total Deletions.
• Type 0 in the Max box and press Enter (Figure 22).
Figure 22. Configuring the interactive filter tool.
A filtered list of 60 regions will appear. Please note that the deletions are no longer present in the list. The yellow and black bar on the right-hand side of the spreadsheet (not shown) indicates that the spreadsheet has been filtered; the height of the bar depicts the proportion of the filtered entries with respect to the number of entries in the original spreadsheet.
• To save the filtered list, right-click on the title of the Amplified spreadsheet in the list of spreadsheets (left pane of the main window) and select Clone.
• Set the Name of resulting copy to Amplifiedonly and set the drop-down list of Create as a child of spreadsheet to 2 segmentation segmentation.txt (Figure 23). Select OK.
• Note that the cloned spreadsheet has ptmp after its name in the spreadsheet navigator. This indicates that the spreadsheet has changed but has not been saved.
• Save the new spreadsheet (amplifiedonly) by selecting the Save icon in the main window. For File name, type in Amplifiedonly and Save. Notice that both the ptmp and no longer appear.
27. Figure 13. Chromosome view of the segmentation spreadsheet. Tracks may be reordered by dragging track names in the Tracks navigator on the left.
Analysis of shared regions with copy number variation
Once the regions with amplifications and deletions have been detected, the next step is to compare the regions across the samples and detect those which are shared by multiple samples (Figure 14). For instance, cancer samples such as the ones used in this tutorial are characterized by genomic instability and multiple mutations. Therefore, it is most interesting to pick up only those copy number aberrations that appear in multiple or even all of the samples, as those might be good candidates for diagnostic or prognostic markers, or might involve key genes responsible for cancer pathogenesis. The Analyze detected segments function will identify genomic segments across all samples. Please note that, since aberrations may be broken into smaller regions, the common regions may contain fewer than the minimum number of markers specified in
28. o, for example, 50 (not shown). If only a part of a gene is amplified or deleted, it is possible that a translocation event took place that split the gene into two parts.
Further annotation options
PGS offers even more options for annotating the list of shared regions. For instance, the Find overlapping genes tool enables annotating the regions with annotations from the Database of Genomic Variants. As this database is an overview of known structural alterations of the human genome spanning more than one thousand base pairs, it can be used as a starting point for the detection of previously described copy number alterations or the detection of novel copy number alterations in these samples.
Overlap with known SNPs, under the Copy Number Analysis stage of the workflow, invokes a dialog quite similar to the one shown in Figure 26, but with an additional option to annotate the regions with information from the dbSNP database. Two additional columns will be added to the right of the initial spreadsheet: the list of SNPs described in each region (rs numbers) and the total number of SNPs in the region. If the list of SNPs is very long, you may need to right-click on the row header and use the Create list of dbSNP command.
Another functionality based on annotation is Test for known abnormalities (also under Copy Number Analysis in the workflow), which enables the comparison of the regions listed in the segm
29. ontain contaminating healthy tissue and cancer cells are quite heterogeneous with respect to multiple chromosome aberrations. The number of copies of each marker created in the previous step will now be used to detect the genomic regions with copy number variation, i.e., to identify amplifications and deletions across the genome.
• Open the Copy Number Analysis section of the Copy Number workflow.
• Select Detect amplifications and deletions. The dialog in Figure 10 will appear.
• Select Genomic Segmentation and then select OK.
• Specify the Minimum genomic markers as 50 and leave the remaining options at their default settings, shown in Figure 11. To start the segmentation, select OK.
Figure 10. Viewing the Detect Amplifications and Deletions dialog. The dialog text reads: Select a method for detecting amplifications and deletions. The segmentation algorithm finds break points in the data. The Hidden Markov Model (HMM) finds regions from a specified list of states. The two options are Genomic Segmentation and HMM Region Detection.
Genomic segmentation itself is divided into two steps. In the first step, each region is compared to an adjacent region in order to tell whether both have the same average copy number and if a breakpoint can be inserted, by using a two-sided t-test (the t-test actually compares th
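The idea of comparing two adjacent regions with a t-test can be sketched as follows. This is a minimal illustration using Welch's t statistic on two windows of per-marker copy numbers; it is an assumption made for the example, since the tutorial text does not spell out which t-test variant, window size, or p-value threshold PGS uses. The copy number values are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(left, right):
    """Welch's t statistic and approximate degrees of freedom for two windows."""
    n1, n2 = len(left), len(right)
    v1 = variance(left) / n1   # squared standard error of the left mean
    v2 = variance(right) / n2  # squared standard error of the right mean
    t = (mean(left) - mean(right)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical copy numbers for markers on either side of a candidate breakpoint.
left_window = [2.0, 2.1, 1.9, 2.05, 1.95]
right_window = [3.0, 3.1, 2.9, 3.05, 2.95]
t_stat, dof = welch_t(left_window, right_window)
```

A large absolute t value, judged against a t distribution with the returned degrees of freedom, would support inserting a breakpoint between the two windows; a value near zero would suggest they belong to the same segment.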
30. opy number from intensities; Import from Agilent data; Load a project following Illumina GenomeStudio export; Import Illumina Final Report Text File; Import from NimbleGen Pair or CGH Data Summary files; Open existing file; Open existing project.
Figure 2. Viewing the Import Copy Number Samples dialog.
For Affymetrix arrays, PGS can import CEL files containing allele intensities, and copy number estimates will be calculated from those intensities during the subsequent steps. If you are using Agilent, Illumina, NimbleGen, or Affymetrix CHP files, PGS can import files containing calculated copy numbers or log ratios. To learn more about particular import procedures, please consult the respective vendor-specific tutorials, available under Help > On-line tutorials, or contact our technical support team at support@partek.com.
A common artifact affecting all types of arrays is GC waves, which have been shown to cause false-positive copy number variations. The exact cause of GC waves is not fully understood, but they have been associated with low-quality sample DNA and/or DNA degradation. If your Affymetrix data are affected by GC waves, PGS enables you to implement a GC wave correction procedure when importing Affymetrix CEL files, following the protocol by Diskin SJ et al. (Nucl Acids Res 2008). For details, please consult Appendix A of this t
31. or insight into possible groupings within the dataset, or to detect any outliers or sample swaps. One method is the principal component analysis (PCA) scatterplot.
• In the QA/QC section of the workflow, select Principal component analysis (PCA).
The rotated PCA plot is shown in Figure 4. Each dot on the plot corresponds to a single sample (i.e., one row of the spreadsheet) and can be thought of as a summary of all normalized marker intensities in the sample. PGS automatically uses the first categorical column (in this example, column 3. Tumor) to color the dots. Depressing the mouse wheel while moving the mouse, or using the Rotate mode, will rotate the plot to find the optimal angle of view. As seen in the figure, the peripheral blood samples (i.e., normals) cluster together, whereas the cancer tissue samples are more dispersed and show considerable variability with respect to their total allele intensity profiles. That finding corresponds very well to the underlying biology and the genomic variability of cancer cells. To learn more about customizing the PCA plot, please consult the Partek On-line Documentation (Help > User's Manual).
Note: the PCA plot can be invoked on allele intensities, as shown here, but also on the copy number spreadsheet generated in the next step. Therefore, PCA is also available to users analyzing data from other vendors.
32. ordingly and select OK.
• The IC_Intensities_SNP6_pairedcopynumber spreadsheet is created (Figure 7).
Figure 6. Viewing the Create Copy Number from Pairs dialog. The settings shown are: Identify the baseline samples (Column: 3. Tumor; Baseline category: N), Match the sample pairs (Column: 4. SubjectID), and Result file (IC_Intensities_SNP6_pairedcopynumber.fmt).
Alternatively, if paired samples are not available, the Unpaired samples option should be selected in the Copy Number Creation dialog shown in Figure 5. Both paired and unpaired copy number procedures produce a copy number spreadsheet (in this tutorial, IC_Intensities_SNP6_pairedcopynumber; Figure 7). Each row represents one of the tumor samples. Columns from 7 onward contain the copy numbers of each marker; columns 1-6 are identical to those in the IC_Intensities_SNP6 spreadsheet.
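The pairing logic configured in the dialog above (baseline category N in the Tumor column, pairs matched on SubjectID) can be sketched as below. The sketch is hedged in two ways: the record layout and function name are hypothetical, and the formula 2 × tumor / normal is an assumption (diploid baseline, linear-scale intensities), because this tutorial does not state how PGS derives the copy number value.

```python
def paired_copy_number(samples, baseline_category="N"):
    """Return {subject_id: [copy number per marker]} for each tumor sample.

    Assumes copy number = 2 * tumor / matched normal (an illustrative
    formula, not taken from the tutorial).
    """
    normals = {s["subject_id"]: s["intensities"]
               for s in samples if s["tumor"] == baseline_category}
    result = {}
    for s in samples:
        if s["tumor"] == baseline_category:
            continue  # baseline samples do not get their own output row
        baseline = normals[s["subject_id"]]  # match the pair on SubjectID
        result[s["subject_id"]] = [2.0 * t / n
                                   for t, n in zip(s["intensities"], baseline)]
    return result


# Hypothetical two-marker example for one patient.
example = [
    {"subject_id": "151", "tumor": "N", "intensities": [1.0, 2.0]},
    {"subject_id": "151", "tumor": "T", "intensities": [1.5, 1.0]},
]
copy_numbers = paired_copy_number(example)
```

As in the tutorial spreadsheet, the output holds one row per tumor sample: the normal sample is consumed as the reference and does not appear in the result.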
33. oss the genome shared by at least eight samples, make the following changes in the Configure Criteria dialog (Figure 20):
• Set the Name as Amplified.
• Set the Spreadsheet to 2 segmentation summary segment analysis.
• Set the Column drop-down list to 6. Total Amplifications.
• Set the box Include values greater than or equal to to 8.
• Uncheck the box Include values less than or equal to.
• Select OK.
Figure 20. Viewing the Configure Criteria dialog.
• Select Save to save the list of 86 amplified regions, confirm the name of the list in the following window (OK), and select Close to exit the List Creator dialog.
However, please note that although the list does contain the regions amplified in eight or more samples, some samples may also contain deletions in the same regions (Column 8 in Figure 21). For the downstream analyses, we may want to filter out those regions, i.e., to have a final list of regions that are only amplified in 8 or more samples. There are various options to perform the filtering in PGS; for this tutorial, we will use the interactive filter.
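The two-stage selection described above (amplified in at least eight samples, then no deletions at all) can be sketched in a few lines. The field names are hypothetical and the rows are illustrative values, so the counts below do not correspond to the 86 and 60 regions of the tutorial data.

```python
# Illustrative rows of a summary spreadsheet (field names are hypothetical).
regions = [
    {"chromosome": "20", "start": 29490217, "stop": 29835260,
     "total_amplifications": 9, "total_deletions": 1},
    {"chromosome": "20", "start": 37215840, "stop": 37273736,
     "total_amplifications": 8, "total_deletions": 2},
    {"chromosome": "3", "start": 169755852, "stop": 170996560,
     "total_amplifications": 9, "total_deletions": 0},
    {"chromosome": "4", "start": 1000000, "stop": 2000000,
     "total_amplifications": 5, "total_deletions": 0},
]

# Stage 1: the List Creator criterion "Total Amplifications >= 8".
amplified = [r for r in regions if r["total_amplifications"] >= 8]

# Stage 2: the interactive filter "Total Deletions, Max = 0".
amplified_only = [r for r in amplified if r["total_deletions"] == 0]
```

Keeping the stages separate mirrors the tutorial: the first list is saved as Amplified, and the second is a filtered view of it that is later cloned under its own name.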
34. ows tab. Select the source and destination spreadsheets as shown in Figure 24. Select OK. This will append the rows of the deletedonly spreadsheet to the amplifiedonly spreadsheet. The amplifiedonly spreadsheet now contains 154 rows: 60 amplifications and 94 deletions.
• To save the new joined list as a separate spreadsheet, right-click on the amplifiedonly spreadsheet in the left pane and select Clone. For Name of resulting copy, enter an appropriate file name, such as amplified_or_deleted. For Create as a child of spreadsheet, select 2 segmentation segmentation.txt. Select OK.
• The Merge Spreadsheets command appended the deleted regions onto the amplified list. Now the amplifiedonly list contains both the regions amplified in 8 or more samples and the regions deleted in 8 or more samples. To return the amplifiedonly spreadsheet to its original state, where it only contains the regions amplified in 8 or more samples, right-click on the amplifiedonly spreadsheet in the spreadsheet view on the left and select Revert to Last Saved State. The amplifiedonly spreadsheet will now contain only the 60 regions.
The joined list (amplified_or_deleted) will be the starting point for the following steps of the tutorial and may be used for LOH and AsCN workflows for integration with copy number data (please see the respective tutorials, as well as Appendix C of this tutorial).
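The append, clone, and revert sequence above is easy to get wrong because the append modifies the destination spreadsheet in place. A minimal sketch with plain Python lists standing in for spreadsheets (the row contents are placeholders) shows why the clone has to be made before reverting:

```python
# Placeholder rows standing in for the two saved spreadsheets.
amplified_only = [("amplified", i) for i in range(60)]
deleted_only = [("deleted", i) for i in range(94)]

saved_state = list(amplified_only)            # what is on disk for amplifiedonly

amplified_only.extend(deleted_only)           # Append Rows: destination is modified
rows_after_merge = len(amplified_only)        # 60 + 94

amplified_or_deleted = list(amplified_only)   # Clone: keep the joined list separately

amplified_only = list(saved_state)            # Revert to Last Saved State
```

After these steps the joined list holds all 154 rows, while the original list is back to its 60 amplified regions.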
35. r.
• Please select the Genomic Segmentation 2 segmentation Segmentation.txt when asked to Choose the spreadsheets that you would like to add to the plot. This adds the Regions track, which is the primary visualization of the segmentation results, to the plot.
• For a more comprehensive visualization, also add the Copy Number 2 IC_Intensities_SNP6_pairedcopynumber track in the Track Wizard and select Create (Figure 13).
The Regions track (entitled Genomic segmentation) depicts the segmentation results: each line represents a single sample (sample names can be found on the left side of the selected line), while the amplified, deleted, and unchanged regions are shown as red, blue, and white, respectively. The Heatmap and Profile tracks have already been discussed. However, please note that the Profile track (depicting the sample selected in either the Heatmap or the Regions track) now also includes the average aberrations for each region.
36. Figure 24. Configuring the Merge Spreadsheets dialog (source spreadsheet: deletionsonly; destination spreadsheet: amplifiedonly).
Find overlapping genes
While the list of regions containing aberrations is interesting, the underlying biology is best investigated by identifying the genes contained in the aberrations. One way to annotate interesting regions in PGS is to generate a list of genes which overlap the regions. The Find overlapping genes function allows two options (Figure 25), the main difference being the focus of the output. For a region-centered view, which may be more appropriate for cytogenetic studies, you may add a new column with the genes contained in the regions. The gene-centered view, suitable for genomics integration, is available by creating a new spreadsheet with genes that overlap with the regions.
Figure 25. Options in the Find Overlapping Genes dialog.
• Make sure the amplified_or_deleted spreadsheet is selected in the spreadsheet list.
• Select Find overlapping genes in the Copy Number Analysis section of the workflow.
• As this tutorial will explore genomic integration, select Create a new spreadsheet with genes tha
37. roup of samples over the entire genome and to decide on the parameters for the next step, List Creation.
To obtain the histogram plot, select Plot Histogram from the Plot Detected Regions dialog. Ensure the proper spreadsheet is selected and select OK.
Figure 17. Viewing the Plot Detected Regions dialog. The dialog offers two options: Plot Histogram and Plot Copy Number Classification.
The Karyogram View (Figure 18) shows an overview of the shared regions across the genome, with amplified regions coded in red and deleted regions coded in blue. The histogram heights reflect the number of samples that share that kind of aberration at a particular location. For example, the long arms of chromosomes 3 and 7 seem to be amplified in the majority of tutorial samples, and most samples share the deletion in the long arm of chromosome 4.
• Use the mouse-over function to get the information on cytobands, as well as on the exact number of shared regions at each position and the number of samples sharing that type of aberration.
38. s PGS can preferentially remove probesets on long fragments from the analysis to increase the signal-to-noise ratio within the sample, which can result in cleaner data.
Figure B1. Probesets that hybridize to longer fragments have lower signal-to-noise ratios in FFPE samples. The top track shows the inclusion of all probesets; areas of copy number deletions can be observed. The middle track shows the inclusion of probesets with fragment lengths (FL) less than 750 bp, and the same copy number deletions can be detected. The bottom track shows the data for the probesets that were filtered out; note that aberrations are smaller and harder to differentiate from the background noise. Removing such probesets decreases noise and improves statistical power.
To change the distribution of restriction fragments used in
39. s that meet user-specified criteria (next section).
Figure 16. Viewing the Summary segment analysis spreadsheet. Each row of the spreadsheet represents one genomic region shared across multiple samples.
Visualization of shared regions
To visualize the regions shared across the samples, PGS offers two plots: histogram and copy number classification (Figure 17). Both are intended to give an overview of the common aberrations in the g
40. Figure 23. Configuring the Clone Spreadsheet dialog.
• Examine the Amplifiedonly spreadsheet and notice that it contains 60 rows. Select the Amplified spreadsheet and notice that it also contains 60 rows, although it contained 86 rows when it was created. The reduced number of rows is due to the filter that was just applied. The yellow and black bar next to the scroll bar on the right of the spreadsheet indicates that a filter has been applied.
• Right-click on the yellow and black bar and select Clear Filter. The number of rows should return to 86.
For this exercise, please perform the same sequence of steps to create a list of deleted regions shared across eight or more samples and remove the amplifications (make sure that you have the summary segment analysis spreadsheet selected when you invoke the List Creator). You should have a list (e.g., deletedonly) of 94 regions.
Now you will create a single list containing all of the amplifications-only regions that occurred in 8 or more samples and all of the deletions-only regions that occurred in 8 or more samples.
• To merge both lists (amplified and deleted regions), please select the amplifiedonly spreadsheet in the spreadsheet pane on the left.
• Use File > Merge Spreadsheets and select the Append R
41. Figure 21. Viewing the initial list of regions amplified in 8 or more samples. The spreadsheet columns are: Chromosome, Start, Stop, cytoband, Total Number of Samples, Total Amplifications, Amplification Average Copy Number, Total Deletions, Deletion Average Copy Number, and Total Aberrations.
• To invoke the interact
42. t overlap with the regions and select OK.
• The database that associates gene annotations with cytogenomic locations must be specified in the next dialog (Figure 26). PGS offers a number of possibilities, such as RefSeq and Ensembl, or custom annotations (for the latter option, please use Manage available annotations). If a database file is outdated or not present, the user will be prompted to download the updated version before the analysis. Select RefSeq Transcripts and OK to proceed.
The dialog (Figure 26) lists RefSeq Transcripts, AceView Transcripts, and Ensembl Transcripts (release 62), each with a short description and, where needed, a prompt to download the file.
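The gene-centered output described above boils down to an interval-overlap test between each region and each annotated transcript. The sketch below is an illustration only: the gene names and coordinates are invented, a single chromosome and half-open coordinates are assumed, and the percent-overlap figure is computed here as the share of the gene covered by the region, which may differ from how PGS reports it.

```python
def overlapping_genes(region, genes):
    """Return (gene name, percent of the gene covered) for genes hitting a region.

    region is (start, stop); genes is a list of (name, start, stop).
    Coordinates are treated as half-open intervals on one chromosome.
    """
    r_start, r_stop = region
    hits = []
    for name, g_start, g_stop in genes:
        overlap = min(r_stop, g_stop) - max(r_start, g_start)
        if overlap > 0:
            hits.append((name, 100.0 * overlap / (g_stop - g_start)))
    return hits


# Hypothetical annotation and one aberrant region.
annotation = [("GENE_A", 150, 250), ("GENE_B", 300, 400), ("GENE_C", 120, 180)]
hits = overlapping_genes((100, 200), annotation)
```

A gene covered only in part, like GENE_A here, is the situation in which a breakpoint falls inside the gene.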
43. the analysis, from the Filter menu select Filter Column > Filter by Fragment Length (Figure B2).
Figure B2. Accessing the Filter by Fragment Length dialog.
PGS might prompt for a file containing a list of expected fragment sizes. If this file is not found, the software will download it from the Partek servers (Figure B3) when you select Download.
Figure B3. Download of the fragment size filter from the Partek server.
Once the filter file is downloaded, you will be prompted to set the minimum and maximum fragment length (in base pairs) that you would like to use for the analysis. A histogram of the expected overall distribution is displayed to aid in your selection. You can set the Min and Max by either typing in values or using the slider buttons (Figure B4). After setting the filter, select OK and note that a horizontal yellow bar has been introduced at the bottom of the spreadsheet, indicating that the spreadsheet has been filtered on columns.
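The Min/Max fragment length filter amounts to keeping only the marker columns whose expected restriction fragment length lies inside the chosen window. A minimal sketch, assuming a hypothetical mapping from probeset name to fragment length (the names and lengths are invented, and inclusive bounds are an assumption):

```python
def filter_by_fragment_length(fragment_lengths, min_bp, max_bp):
    """Return the probesets whose fragment length is within [min_bp, max_bp]."""
    return [probeset for probeset, length in fragment_lengths.items()
            if min_bp <= length <= max_bp]


# Hypothetical probesets with expected fragment lengths in base pairs.
expected_lengths = {
    "probeset_1": 320,
    "probeset_2": 749,
    "probeset_3": 1200,
    "probeset_4": 980,
}
kept = filter_by_fragment_length(expected_lengths, min_bp=0, max_bp=749)
```

With an upper bound just under 750 bp, as in the FFPE example of Appendix B, the two long-fragment probesets are dropped and the remaining columns are used for the analysis.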
44. the segmentation parameters.
Figure 14. The Analyze Detected Segments function detects and reports the regions across all samples. The original detected region for each sample is outlined in gold, with the number of markers contained in the region shown to the right. Regions overlapping multiple segments are drawn in color, with the number of markers shown below each region.
• Select the Analyze detected segments function from the Copy Number Analysis section of the Copy Number workflow.
As shown in Figure 15, the Analyze Segments dialog can test for associations between copy number variations and sample categories by using the χ² test. For example: is there a difference in aberrations between cancer grades? Since in this paired analysis all pairs (ovarian cancer versus their normal) share the same phenotype, associations are not tested with this paired tutorial data. Hence, leave all the boxes unchecked and select OK to proceed.
Figure 15 (Analyze Segments dialog). The dialog text reads: The total number of amplifications and deletions will be reported for each region with the same aberration state across all samples. One or more phenotypes may be selected. Each selected phenotype will be tested independent of other phenotypes for association with the copy number status. The optional phenotypes offered are 3. Tumor, 4. SubjectID, and 5. Gend
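How per-sample segments break into smaller shared regions can be sketched as follows. This is an illustrative reconstruction, not the PGS algorithm: it cuts one chromosome at every segment boundary found in any sample and counts, for each resulting interval, how many samples carry the requested state. Sample names, coordinates, and the state strings are invented.

```python
def shared_regions(segments_by_sample, state):
    """Split at every breakpoint and count the samples that are in `state`.

    segments_by_sample maps sample name -> list of (start, stop, state)
    on a single chromosome. Returns (start, stop, n_samples) tuples for
    intervals where at least one sample has the requested state.
    """
    breakpoints = sorted({pos
                          for segs in segments_by_sample.values()
                          for start, stop, _ in segs
                          for pos in (start, stop)})
    result = []
    for start, stop in zip(breakpoints, breakpoints[1:]):
        n = sum(
            any(s <= start and stop <= e and st == state for s, e, st in segs)
            for segs in segments_by_sample.values()
        )
        if n:
            result.append((start, stop, n))
    return result


segments = {
    "sample_A": [(0, 100, "amplification"), (100, 200, "unchanged")],
    "sample_B": [(50, 150, "amplification"), (150, 200, "unchanged")],
}
shared = shared_regions(segments, "amplification")
```

The interval shared by both samples (50 to 100) is shorter than either original segment, which is why a common region can contain fewer markers than the minimum set in the segmentation parameters.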
45. Figure 19. Copy number classification plot, providing an overview of amplified (red) and deleted (blue) regions and the regions with no copy number change (white) in each sample. The small inset on the bottom left shows the increased detail if just one chromosome is visualized.
Creation of a list of regions
Creating lists of shared regions can be described as the core step of the copy number analysis in a project with multiple samples or patients. For instance, one can select all the deleted regions on chromosome 4, find all the regions spanning more than 50000 bp, or pick up all the regions containing a certain number of markers. In this exercise, all the samples have the same underlying phenotype and two lists will be created: deleted and amplified regions across the genome shared by 8 or more samples. Those lists can later be used for integration with the loss of heterozygosity and allele-specific copy number tutorials (refer to Appendix C for more information).
• To invoke the List Creator, please go to Create region list in the Copy Number Analysis section of the Copy Number workflow.
• Start defining a new criterion by selecting Specify New Criteria. To filter out all the amplified regions acr
46. utorial.
• As this tutorial will not read raw data files but will start from the data already imported into PGS, please close the import dialog (Cancel) and proceed with the next section to learn how to calculate copy number estimates for Affymetrix CEL data. The analysis of copy number or log ratios provided by Agilent, Illumina, NimbleGen, or Affymetrix CHP platforms is described in the section Detection of Regions with Copy Number Variations.
Tutorial data set
This example data set consists of 20 paired samples from an ovarian cancer study in which a fresh-frozen tumor sample and peripheral blood were obtained from each of 10 female patients (Ramakrishna et al., PLoS One 2010; GSE19539). All 20 samples were analyzed using the Affymetrix Genome-Wide Human SNP Array 6.0. For a discussion of formalin-fixed, paraffin-embedded (FFPE) tissue samples, please refer to Appendix B of this tutorial.
• Select Help > On-line Tutorials from the PGS main menu and download the data for the Overlapping Copy Number with LOH project in zip format.
• Unzip the files into a directory of your choosing.
• Select File > Open, browse to the folder containing the unzipped tutorial data, select the file IC_Intensities_SNP6.fmt, and select Open.
The datasheet will now open in the PGS main window (Figure 3). The spreadsheet was generated from the import of SNP6 CEL files and shows all 20 samples on rows. Columns 1-6 contain sample information describ