MethLAB – A GUI for Analyzing DNA Methylation Data: USER'S MANUAL
Contents
1. a. System Requirements
R must be installed. The program has been tested on systems with > 2 GB of RAM and a > 1.5 GHz processor. However, the analysis time will vary depending on the analysis selected and the processing speed of the system you use. For example, a standard analysis of 27,000 CpG sites and 300 subjects that included fixed effects for chip ID took less than a minute on a machine with 4 GB of RAM and a 1.5 GHz dual-core processor. With random instead of fixed effects, the analysis took 40 minutes on the same machine. A 450K dataset with 200 subjects was also analyzed on the same machine, and a fixed effects analysis took less than 3 minutes to finish. However, we strongly recommend using a system with more RAM to run large 450K datasets.

b. Installing R
MethLAB operates as a package within R and requires R version 2.11.1 or higher. R can be downloaded and installed from the following links:
For Windows OS: http://www.biometrics.mtu.edu/CRAN/bin/windows/base
For Mac OS: http://www.biometrics.mtu.edu/CRAN/bin/macosx

c. Installing the tcltk library
MethLAB uses the tcltk library to provide a graphical user interface. This comes pre-installed with R for Windows. Mac users who have not already installed the tcltk library may download and install the tcl library from http://www.biometrics.mtu.edu/CRAN/bin/macosx/tools. Mac users will also need to follow an additional step:
Installing BWidget: Download the BWidget tool using th
2. Open Phenotype File to bring up a dialog box. Browse and select your file.
b. A preview file should be displayed. Check the output window to see the dimensions of the phenotype file.
c. Under Files, select Open Methylation File, then browse and select your methylation file. A progress bar will appear. Note that on Macs, the progress bar will not depict loading progress accurately. Large files (e.g., 450K data) may take 5-30 minutes to load the first time. Another preview file will appear. Check the preview file and output window to verify that the file loaded properly.
[Screenshot: File Preview window showing the first rows of the methylation file, with CpG site labels (e.g., cg00000292) as rows and sample1 to sample6 as columns of beta values.]
To avoid having to reload the file in subseque
3. from the operator menu and type 3 in the Inclusion Criteria box.
b. A similar strategy can be used to specify exclusions. Note that MethLAB is currently enabled to perform inclusions/exclusions based on numeric variables only.
c. More complex data manipulations can be accomplished by specifying combinations. Two inclusions and/or two exclusions, connected by boolean operators, can be specified. For example, to limit by race and gender, input the necessary criteria in the two inclusion criteria and use the boolean operator AND. To avoid errors, it is imperative that you specify a boolean operator if you have two or more inclusion/exclusion criteria.
[Screenshot: subject selection form with Inclusion1, Inclusion2, Exclusion1, and Exclusion2 rows, each with an Operator menu and a Criteria box, joined by boolean operator menus.]

4.2.2 CpG Selection
[Example subset file: a single column with the header TargetID, followed by CpG site names such as cg01009664, cg01013324, cg01015871, cg01015873, and cg01017147.]
a. By default, MethLAB will analyze all CpG sites in the methylation file. If you would like to analyze only a subset of CpG sites, include the names of the CpG sites you would
4. .csv file, where each column represents a sample and each row represents a CpG site. The first column of the methylation file should contain unique CpG site labels. The methylation file cannot include any other column with text; text annotation can be included in the annotation file. A sample methylation file is available at http://genetics.emory.edu/conneely/MethLAB
[Screenshot: example methylation file opened in a spreadsheet, with CpG site labels in the first column and one column of beta values (e.g., 0.807509, 0.797921) per sample.]
c. The annotation file can contain essentially any information, but the fi
5. like to analyze in a .txt or .csv file. The file should contain a column header. Note that if fewer than 100 CpG sites are included in the analysis, Q-Q plots and Manhattan plots will not be produced.
b. Using the CpG Selection option under the Data Manipulation menu, load the .txt or .csv file into MethLAB.

4.3 Data Analysis
1. Under the Analysis menu, select Linear Regression to analyze data.
2. A list of phenotype variables from the selected phenotype file will appear.
[Screenshot: Analysis window showing the Variables Selection list (1 chip, 2 gender, 3 trait) and the Beta Model, Linear Model, Covariate Type, Random Effects Covariate, FDR Method, FDR Criterion, and Methylation Dataset settings.]
a. Beta Model (Default: Untransformed
6. 4.2.2). Unless a subset file is selected, MethLAB defaults to the Complete CpG Dataset option, which analyzes all CpG sites in the specified methylation file. If a subset is selected, MethLAB will perform the analysis and adjust for multiple testing with only that subset. If Global Analysis is selected, MethLAB will instead evaluate the association between the phenotype and average beta values across all available CpG sites. This analysis seeks to identify global methylation patterns by fitting a linear model based on average beta values rather than individual CpG sites. For simplicity, the input and output formats are similar for complete, subset, and global analyses.
3. Click the Analyze button to start an analysis under the specified model; a progress bar should appear. Note: the progress bar may not accurately represent progress during a fixed effects analysis.

5 Output Files
5.1 Text File
MethLAB outputs a text file that contains the t-statistics, p-values, and flags indicating the Bonferroni and Holm significance and the FDR significance of each CpG site for a given model. Additional fields containing CpG annotation information will be included if available (see 4.0d and 4.1d).
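The three kinds of significance reported in the text file can be illustrated with a small sketch. This is written in Python purely for illustration and is not MethLAB's own code (MethLAB runs inside R). The function names and the default level of 0.05 are assumptions, and the procedures shown are the textbook Bonferroni, Holm step-down, and Benjamini-Hochberg adjustments.

```python
def bonferroni_flags(pvalues, alpha=0.05):
    """Flag p-values that pass the Bonferroni threshold alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def holm_flags(pvalues, alpha=0.05):
    """Holm step-down: compare the k-th smallest p-value with alpha / (m - k)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    flags = [False] * m
    for rank, i in enumerate(order):
        if pvalues[i] <= alpha / (m - rank):
            flags[i] = True
        else:
            break  # once one test fails, all larger p-values fail too
    return flags


def bh_adjusted(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is monotone.
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        running_min = min(running_min, pvalues[i] * m / (rank + 1))
        adjusted[i] = running_min
    return adjusted
```

A CpG site would then be called FDR significant when its adjusted value is at or below the chosen FDR criterion. Holm never flags fewer sites than Bonferroni at the same level, which is why the two columns often agree.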
7. 9459  cg09523691  4.578764618  6.836E-06  FALSE  FALSE  0.01051291  ATG12

5.2 Q-Q Plots
For every analysis with > 100 CpG sites, MethLAB produces both classic and modified quantile-quantile (Q-Q) plots for the log p-values and for the t-statistics, with confidence intervals.
[Figure: four panels, two titled "Quantile-Quantile plot" and two titled "Quantile-Quantile plot for t-statistics".]

5.3 Manhattan Plot
For analyses of > 100 CpG sites, MethLAB automatically outputs a Manhattan plot if an annotation file is specified with the chromosome and position information for each CpG site. The column header for the column containing the chromosome information should be CHR, and the column header for the column containing the position information should be MAPINFO.
[Figure: "Manhattan Plot for association between methylation and trait", with Chromosome on the horizontal axis.]

5.4 Plots of Individual CpG sites
After the analysis is finished, the number of CpG sites significant at the specified FDR will be indicated. Alternatively, if < 100 CpG sites are included in the analysis, a Holm cutoff will be used. MethLAB automates plots of the beta values against the phenotype for the top CpG sites, with regression lines based on the specified model. Enter the number of plots to be generated in the
There are 9 s
8. Beta Values). Select either untransformed or logit-transformed beta values as the dependent variable for your linear model. The logit transform, log(beta / (1 - beta)), is equivalent to the M-value, or the log signal ratio commonly analyzed in the gene expression literature; in this case, it is the log ratio of methylated to unmethylated signal.

b. Covariate Type (Default: Continuous Covariate). Because MethLAB allows both continuous and categorical covariates, the covariate type must be specified in the Covariate Type box before selecting the variable. By default, variables are continuous. If a Class Covariate is selected, the variable will be represented as a factor (categorical variable) in the Linear Model text box.

c. Linear Model. Double-click variables from the Variable Selection box to select the independent variable and covariates. The model will appear in the Linear Model box in the form x1 + x2 + x3 + x4, where x1 is the independent variable to be tested for association and x2 to x4 are additional continuous or categorical covariates.

d. Random Effects Covariate. Users may choose to adjust for technical variation by modeling batch or chip ID as fixed or random effects. To include it as a fixed effect, simply enter batch or chip ID into the linear model as a categorical covariate. To include it as a random effect, select batch or chip ID as a Random Effects Covariate.
• Note that MethLAB has been optimized to perform fixed effects analyses extremely rapidl
9. MethLAB: A GUI for Analyzing DNA Methylation Data
USER'S MANUAL

OVERVIEW OF METHLAB WORKFLOW
Open Files: Select phenotype, methylation, and annotation (optional) files (.txt or .csv formats). Select output directory.
Data Manipulation: Select subject inclusion/exclusion criteria. Specify CpG subsets, if applicable (.txt or .csv formats).
Analysis: Enter FDR method. Specify linear model by double-clicking variable names and type (continuous or class) from the menu. Enter the FDR criterion.
Output: Text files (analysis results and log files). Graphs (Q-Q and t-statistic plots, Manhattan and individual CpG site plots).

Table of Contents
1 Introduction
2 Statistical Analysis
3 Getting Started
   3a System Requirements
   3b Installing R
   3c Installing the tcltk library
   3d Installing MethLAB
   3e Launching the MethLAB GUI
4 Running MethLAB
   4.0 Formatting Files
   4.1 Loading Files
   4.2 Data Manipulation
      4.2.1 Subject Selection
      4.2.2 CpG Selection
   4.3 Data Analysis
5 Output Files
   5.1 Text File
   5.2 Q-Q Plots
   5.3 Manhattan Plot
   5.4 Plots of Individual CpG sites
6 Errors

1 Introduction
DNA methylation is a type of epigenetic modification that has been associated with numerous complex traits and diseases. MethLAB pr
10. analyzed, but a subset of subjects or CpG sites can be selected for analysis by following the steps below. Users who intend to analyze all subjects and CpG sites in the input datasets can skip to section 4.3.
a. Select subjects to analyze through the option Inclusions & Exclusions under the Data Manipulation pulldown menu (see 4.2.1).
b. Select CpGs that you would like to analyze using the option CpG Selection under the Data Manipulation menu (see 4.2.2).
If no selections are made, all available subjects and CpG sites will be analyzed.

4.2.1 Subject Selection
Upon opening the subject selection screen, select subjects based on inclusion or exclusion criteria.
[Screenshot: subject selection form with Inclusion1, Inclusion2, Exclusion1, and Exclusion2 rows, each with an Operator menu and a Criteria box.]
a. Specify a phenotype or trait and an operator (< or >) from the Inclusion and Operator drop-down menus, and input numeric criteria in the Inclusion Criteria box. For example, if the dataset includes subjects from 4 racial groups (denoted 1 to 4) and the user wants to evaluate only subjects from groups 1 and 2, then select Race from the Inclusion menu, <
11. [Screenshot: example results file opened in a spreadsheet.]

Index  TargetID    t-statistic   p-value   Bonferroni Significant  Holm Significant  FDR significant  Gene
11477  cg11540997  6.361824551   7.43E-10  TRUE                    TRUE              2.05E-05         DUOX2
5312   cg05294455  6.173084092   2.17E-09  TRUE                    TRUE              2.99E-05         MYL4
16609  cg16536918  5.997199699   5.76E-09  TRUE                    TRUE              5.30E-05         AVP
6026   cg06051311  5.67117571    3.34E-08  TRUE                    TRUE              0.00019765       TRIM15
21823  cg21842274  5.6575587793  3.58E-08  TRUE                    TRUE              0.006019765      CRHBP
25554  cg25551168  5.473304638   9.34E-08  TRUE                    TRUE              0.00042915       AVP
16623  cg16545105  5.295361634   2.30E-07  TRUE                    TRUE              0.000906414      CRHBP
3115   cg03098721  5.206266968   3.58E-07  TRUE                    TRUE              0.001235089      TILL7
566    cg00546897  4.996352495   9.94E-07  TRUE                    TRUE              0.003046193      LOC284837
16174  cg16098726  4.958338743   1.19E-06  TRUE                    TRUE              0.00328657       GPS
20255  cg20291222  4.8557380582  1.93E-06  FALSE                   FALSE             0.00484669       CAPS2
26367  cg26385222  4.304003209   2.46E-06  FALSE                   FALSE             0.005654793      HCA112
27150  cg27210390  4.732705488   3.42E-06  FALSE                   FALSE             0.00685418       TOMIL1
25357  cg25374813  4.728849207   3.48E-06  FALSE                   FALSE             0.00685418       SLC23A1
20952  cg20994801  4.678060918   4.39E-06  FALSE                   FALSE             0.008004406      PIK3CD
9950   cg10037005  4.660754957   4.75E-06  FALSE                   FALSE             0.008004406      CD37
478    cg00466249  4.652127321   4.93E-06  FALSE                   FALSE             0.008004406      MGC15523
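A results file with the columns shown in the table above could be filtered outside a spreadsheet as well. The following Python sketch is only an illustration under stated assumptions: the tab delimiter, the exact header spellings, and the function name fdr_significant_rows are hypothetical, and the "FDR significant" column is assumed to hold an adjusted value that is compared with the FDR criterion. The three sample rows are copied from the table.

```python
import csv
import io


def fdr_significant_rows(handle, fdr_criterion=0.05, delimiter="\t"):
    """Return (TargetID, Gene, p-value) for rows at or below the FDR criterion.

    Assumes a header row with the column names used in the table above;
    the delimiter is a guess, so pass "," for a comma-separated file.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    hits = []
    for row in reader:
        if float(row["FDR significant"]) <= fdr_criterion:
            hits.append((row["TargetID"], row["Gene"], float(row["p-value"])))
    return hits


# Three rows taken from the example table, joined with the assumed tab delimiter.
SAMPLE = "\n".join([
    "\t".join(["Index", "TargetID", "t-statistic", "p-value",
               "Bonferroni Significant", "Holm Significant",
               "FDR significant", "Gene"]),
    "\t".join(["11477", "cg11540997", "6.361824551", "7.43E-10",
               "TRUE", "TRUE", "2.05E-05", "DUOX2"]),
    "\t".join(["5312", "cg05294455", "6.173084092", "2.17E-09",
               "TRUE", "TRUE", "2.99E-05", "MYL4"]),
    "\t".join(["25357", "cg25374813", "4.728849207", "3.48E-06",
               "FALSE", "FALSE", "0.00685418", "SLC23A1"]),
]) + "\n"

strict_hits = fdr_significant_rows(io.StringIO(SAMPLE), fdr_criterion=0.001)
```

With a real file, the same function can be called on an open file handle instead of the in-memory sample, for example inside a `with open(path, newline="") as handle:` block.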
12. e link here. Double-click it to unpack it and create the folder BWidget-1.8.0. Open your terminal and type the following commands to finish the installation process:
• cd Desktop (or the folder name that contains the unpacked file)
• sudo mv BWidget-1.8.0 /usr/local/lib

d. Installing MethLAB
MethLAB can be downloaded from the URL http://genetics.emory.edu/conneely/MethLAB. It is in the form of a zipped file. Once the file is downloaded, follow the instructions below.
For Windows OS:
• Click on the Packages button, which is on the top menu of the R GUI.
[Screenshot: RGui window with the Packages menu open, showing the entry "Install package(s) from local zip files".]
• Select Install package(s) from local zip files from the Packages menu.
• Follow the prompts as they show up on the screen.
For Mac OS:
• Click on the Packages & Data option on the main menu bar.
• Select the Package Installer option from the dropdown menu.
• In the Installer window, choose Local Source Package from the dropdown menu. Click the Install button and browse to the location where the MethLAB tar file has b
13. een downloaded. Click Open to install.

e. Launching MethLAB
• When installation is complete, call the program by typing the following commands at the R command line:
> library(MethLAB)
> MethLAB()
• The MethLAB GUI will appear within 45 seconds.

4 Running MethLAB
To begin an analysis, use the dropdown Files menu to open the phenotype and methylation files. Please note that all other features are disabled until both a phenotype and a methylation file have been specified, unless a methylation file has been opened previously (see section 4.1b).

4.0 Formatting Input Files
a. The phenotype file can be either a .txt file or a .csv file, with each column representing a sample and each row representing a phenotype. A sample phenotype file is available at http://genetics.emory.edu/conneely/MethLAB

        sample1  sample2  sample3  sample4  sample5  sample6  sample7  sample8  sample9
chip    1        1        1        1        1        2        2        2        2
gender  male     male     male     male     male     male     male     male     male
trait   4.58     7.06     4.55     9.29     8.55     7.17     4.71     7.31     9.08

b. The methylation file can be formatted as a .txt or
14. nt analyses, the methylation dataset is saved as a database file on your local hard drive. For subsequent analyses, MethLAB will use the most recently opened methylation file for the new session without having to reload the entire file, unless the user selects a new methylation file. Note that for files > 6 GB, MethLAB will not save the file to the local drive. If a methylation file has been opened previously, the Log Window will display a notice that the most recently opened methylation file will be used by default unless a new file is selected.
d. The annotation file is optional and can be loaded by selecting Open Annotation File from the Files menu. Check the output window to confirm that the file has loaded. Note: an annotation file with the fields CHR and MAPINFO is required to produce a Manhattan plot (see 4.0d).
e. Selecting Output Directory. MethLAB outputs a number of files as part of each analysis. Select the location for these files to be saved by clicking the Select Output Directory option of the Files menu.

4.2 Data Manipulation
MethLAB allows users to specify selection criteria within a large dataset. By default, all available subjects and CpG sites will be
15. ovides a graphical user interface (GUI) to facilitate analysis of DNA methylation microarray data, allowing users with no experience using statistical software to implement flexible and powerful analyses of array-based DNA methylation data.

2 Statistical Analyses
Microarrays such as the Illumina GoldenGate and Infinium platforms typically interrogate DNA methylation of an individual sample across the genome and output beta values that represent the proportion of DNA methylated at an individual CpG site. MethLAB evaluates the association between beta values and a designated continuous or categorical phenotype by fitting a separate linear fixed or mixed effects model for each CpG site. This package can incorporate both continuous and categorical covariates, as well as fixed or random batch or chip effects. The package produces quantile-quantile (Q-Q) plots with confidence intervals to allow users to visually assess whether there is an excess of associated CpG sites, and can also produce Manhattan plots and CpG-specific scatterplots and boxplots. It accounts for multiple tests by controlling the false discovery rate (FDR) at a user-specified level using one of several optional FDR methods, and automates plotting of the beta values against the phenotype for top CpG sites. Bonferroni adjustments are also provided. Results for all CpG sites analyzed are output in a manageable .txt file format that can be opened with standard spreadsheet software.

3 Getting Started
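To make the per-site modeling described in Section 2 (Statistical Analyses) concrete, here is a minimal sketch of the simplest case: one continuous phenotype, no covariates, and no batch effects. It is written in Python for illustration only and is not MethLAB's implementation, which runs in R and supports covariates and random effects. The function names slope_t_statistic and scan_cpg_sites are hypothetical.

```python
import math


def slope_t_statistic(phenotype, betas):
    """t-statistic for the slope when beta values are regressed on a phenotype."""
    n = len(phenotype)
    if n != len(betas) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(phenotype) / n
    mean_y = sum(betas) / n
    sxx = sum((x - mean_x) ** 2 for x in phenotype)
    if sxx == 0:
        raise ValueError("phenotype has no variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(phenotype, betas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(phenotype, betas))
    if rss == 0:
        raise ValueError("perfect fit: the t-statistic is undefined")
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    return slope / standard_error


def scan_cpg_sites(phenotype, beta_by_site):
    """Fit one model per CpG site and return {site label: t-statistic}."""
    return {site: slope_t_statistic(phenotype, betas)
            for site, betas in beta_by_site.items()}
```

Each site is fitted independently, so the scan is a plain loop over rows of the methylation file. The resulting t-statistics would still need p-values and a multiple-testing adjustment before any site is called significant.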
16. rst column of the annotation file should contain unique CpG site labels consistent with the methylation file. MethLAB uses the CpG site labels to match the annotation file and the methylation file. However, it is not necessary for the CpG sites in the annotation file to be in the same order as those in the methylation file, and it is fine if your annotation file has a different number of rows than the methylation file. If a Manhattan plot is desired, the file should include columns with the headers CHR and MAPINFO to indicate chromosome (1-22, X, Y) and position (must be numeric). A sample annotation file is available at http://genetics.emory.edu/conneely/MethLAB
[Screenshot: example annotation file in a spreadsheet, with CpG site label, chromosome, position, and gene in each row, e.g., cg20969242, 1, 154093668, GON4 and cg08223235, 18, 59054814, BCL2.]

4.1 Loading Files
Note: for your first time running MethLAB, you may want to do a trial run using the sample phenotype and methylation files available at http://genetics.emory.edu/conneely/MethLAB
[Screenshot: Files menu with the entries Open Methylation File, Select Output Directory, and Select Annotation File.]
a. In the main menu, click the Files button and select the option
17. st procedure. Scand J Stat 6: 65-70.
18. tatistically significant CpG sites and 23998 CpG sites
dialogue box and press OK. For example, if 5 is entered, plots will be generated for the 5 most significant CpG sites. Scatterplots will be made for continuous outcomes. Box plots will be made for categorical outcomes if the independent variables are coded as factors.
[Screenshot: dialog asking "How many CpGs would you like to plot?"]
[Figure: example plots for individual CpG sites, titled "cg00822607 in ZNF688" and "cg0 229960 in EMD".]

5.5 Log File
A log file containing helpful information about the analysis can be found in the results folder. This file is designed to provide a record of the analysis performed and includes the name of the phenotype file used, the name of the methylation file used, the linear model, the FDR method, and several summary statistics.

6 Errors
Any errors generated during your MethLAB analysis will be displayed in the Log window.

We hope you find MethLAB useful and easy to use. Please contact vkilaru@emory.edu with any questions or comments.

References
1. Benjamini and Hochberg (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Statist Soc B 57: 289-300.
2. Benjamini and Yekutieli (2001). The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29: 1165-1188.
3. Storey (2002). A direct approach to false discovery rates. J R Statist Soc B 64: 479-498.
4. Holm S (1979). A simple sequentially rejective multiple te
19. y. Due to their speed, fixed effects analyses are the best choice for the initial analysis of a dataset; this is particularly true for large datasets (e.g., Illumina 450K).
• Inclusion of random effects (implemented through the nlme package) slows the analysis considerably but may increase power. In general, random effects are appropriate when the number of chips is large (i.e., > 10) and the number of samples per chip is not too small (i.e., not fewer than 5). For analyses with small samples or sample exclusions, this condition may not be met, and random effects analyses are likely to crash. In these cases, fixed effects analyses are a better choice.

e. FDR Method (Default: BH method). Multiple testing is controlled via a user-defined FDR method. The user may choose from three FDR methods: the Benjamini-Hochberg (BH) method [1], the Benjamini-Yekutieli (BY) method [2], and the qvalue function by Storey et al. [3].

f. FDR Criterion (Default: 0.05). Multiple testing is controlled via FDR. To specify the FDR cutoff, enter a number between 0 and 1. In addition, a step-down version of Bonferroni significance (Holm significance [4]) is calculated for each of the CpG sites.

g. Methylation Dataset (Default: Complete CpG Dataset; CpG data subset if the user selects a subset file; or Global Analysis to perform a global methylation analysis). Users may select a smaller number of CpG sites to be analyzed by selecting the CpG data subset option (see