        WGCNA user manual
         Contents
1. The user should first specify the location of the "Expression Data" file. It is a comma- or tab-delimited file in which rows are genes (probe sets) and columns correspond to microarray samples. The first column should contain gene identifiers.

[Table: Example expression data. The header row holds "GeneName" followed by one column per sample (Person...); each subsequent row holds a gene identifier (G1, G2, ...) followed by that gene's expression values. The numeric values are not legible in this copy.]

In order to screen for genes or modules that are biologically significant, WGCNA needs to define a gene significance measure.

Option 1: based on a correlation with a microarray sample trait T that corresponds to a column in the trait file (loaded as the "Sample Trait Data" file). Then $\mathrm{GeneSignificance}(i) = |\mathrm{cor}(\mathrm{Gene}(i), T)|$. The trait could be a binary outcome (case-control status) or a quantitative outcome (body weight).

Option 2: based on a pre-defined gene significance measure that corresponds to a column in the gene information file. It must contain the
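To make Option 1 above concrete, the following minimal Python sketch computes the absolute Pearson correlation between each gene's expression profile and a sample trait. It is an illustration only and not the code used by WGCNA; the function names (`pearson`, `gene_significance`) and the toy data are assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gene_significance(expression, trait):
    """Map each gene id to |cor(gene profile, trait)| (Option 1)."""
    return {gene: abs(pearson(profile, trait))
            for gene, profile in expression.items()}


# Toy data (made up): three genes measured on four samples,
# and a binary case-control trait for the same four samples.
expression = {
    "G1": [1.0, 2.0, 3.0, 4.0],
    "G2": [4.0, 3.0, 2.0, 1.0],
    "G3": [1.0, -1.0, 1.0, -1.0],
}
trait = [0.0, 0.0, 1.0, 1.0]

gs = gene_significance(expression, trait)
for gene, value in gs.items():
    print(gene, round(value, 3))
```

Because the absolute value is taken, G1 and G2 (perfectly anti-correlated with each other) receive the same significance, which matches the unsigned definition given above.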
2. [Screenshot: the "4. Module Detection" tab. Controls shown: List Modules; Static Height Branch Cutting (Static Height Cutoff); Dynamic Height Branch Cutting (Max Height, 0 to 1; Deep Split Level, choose 0, 1, 2 or 3); Dynamic Hybrid Branch Cutting (Max PAM Distance, negative to disable); Module Merge; Min Module Size; Plot Cluster Tree; Plot ME Pairs. The display panel shows a module significance bar plot for the grey, brown, yellow, turquoise and blue modules, and the log reports "mergeCloseModules: Merging modules whose distance is less than 0.1".]

Once modules are defined, the user can generate a module significance plot by clicking "Plot Module Significance". The "Plot Cluster Tree" button offers an easy way to redraw the hierarchical tree and module definitions without conducting module detection again.

[Screenshot: "Relationship between module eigengenes" plot.]
3. the slope of the regression line between log(p(k)) and log(k) should be negative (typically smaller than -2). In practice, we find that the relationship between $R^2$ and $\beta$ is characterized by a saturation curve. In most applications, we use the lowest power $\beta$ at which saturation is reached.

As a caveat, we mention that sometimes scale-free topology cannot be reached for reasonably low values of $\beta$ (say, smaller than 20). For example, severe array outliers or globally distinct groups of arrays may lead to strong correlations between the expression profiles (and very large co-expression modules). In this case, we simply recommend going with the default choice of $\beta$: $\beta = 6$ for an unsigned network and $\beta = 12$ for a signed network.

[Figure: two panels, (a) and (b), each plotted against the soft threshold (power) on the x-axis.]

Figure. (a) Scale-free topology fit $R^2$ (y-axis) as a function of different powers. In many (but not all) applications, one observes a saturation-type curve. Here we would choose a power of 6, since the saturation level is reached at this point. The analysis is highly robust with respect to the choice of the power. (b) Mean connectivity (y-axis) versus the power. The higher the power, the lower the mean connectivity.

Module detection

We use average linkage hierarchical clustering coupled
4. [Screenshot: the "5. Gene Selection" tab. Controls shown: Manual Gene Selection (Choose the Module; Plot Gene Significance vs. Intramodular K; Choose the Genes, with thresholds GS > 0.2 and K/Kmax > 0.5; Select Genes; Plot Module Heatmap; Output Gene List) and Automatic Gene Selection (Auto Gene Selection). The display panel shows a scatter plot for the blue module.]

The user can save any plot in the display panel to a local file using the "Save Image" function under the "File" menu. The image format is EMF. EMF (Enhanced Metafile) is a vector-based image format designed for and popularized by Microsoft Windows, and it can easily be converted to other formats.

To convert an EMF file into PDF format, the user can insert the EMF file into Word and then print it using "Acrobat PDFWriter" to create a PDF file. If the user has Adobe Illustrator, the user can then edit the PDF file and save it in EPS format.

How to access help

[Screenshot: the "Help" menu, including the "Get Latest Version" entry.]
5. Short glossary of network concepts

Terms covered: Coexpression network; Module; Connectivity; Intramodular connectivity (kIN); Module eigengene; Eigengene significance; Module membership, also known as eigengene-based connectivity (kME).

Coexpression network: We define coexpression networks as undirected, weighted gene networks. The nodes of such a network correspond to gene expression profiles, and edges between genes are determined by the pairwise Pearson correlations between gene expression profiles. By raising the absolute value of the Pearson correlation to a power $\beta > 1$ (soft thresholding), the weighted gene coexpression network construction emphasizes large correlations at the expense of low correlations. Specifically, $a_{ij} = |\mathrm{cor}(x_i, x_j)|^{\beta}$ represents the adjacency of an unsigned network. Optionally, the user can also specify a signed co-expression network, where the adjacency is defined as $a_{ij} = |0.5 + 0.5\,\mathrm{cor}(x_i, x_j)|^{\beta}$.

Module: Modules are clusters of highly interconnected genes. In an unsigned coexpression network, modules correspond to clusters of genes with high absolute correlations. In a signed network, modules correspond to positively correlated genes.

Connectivity: For each gene, the connectivity (also known as degree) is defined as the sum of connection strengths with the other network genes: $k_i = \sum_{u \neq i} a_{iu}$. In coexpression networks, the connectivity measures how correlated a gene is with all other network genes.

Intramodular con
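The adjacency and connectivity definitions in the glossary above can be sketched in a few lines of Python. This is a hedged illustration rather than the WGCNA implementation: the function names (`pearson`, `adjacency`, `connectivity`), the default power of 6 and the toy profiles are assumptions chosen for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(profiles, beta=6, signed=False):
    """Soft-thresholded adjacency matrix with a zero diagonal.

    unsigned: a_ij = |cor(x_i, x_j)| ** beta
    signed:   a_ij = |0.5 + 0.5 * cor(x_i, x_j)| ** beta
    """
    n = len(profiles)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            r = pearson(profiles[i], profiles[j])
            base = abs(0.5 + 0.5 * r) if signed else abs(r)
            a[i][j] = a[j][i] = base ** beta
    return a


def connectivity(a):
    """k_i = sum of a_iu over all other genes u (diagonal is zero)."""
    return [sum(row) for row in a]


# Toy profiles (made up): genes 0 and 1 are perfectly correlated,
# gene 2 is perfectly anti-correlated with both.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
k_unsigned = connectivity(adjacency(profiles, beta=6, signed=False))
k_signed = connectivity(adjacency(profiles, beta=6, signed=True))
print(k_unsigned, k_signed)
```

In the unsigned network the anti-correlated gene is as strongly connected as the others, whereas in the signed network its connectivity drops to zero, which illustrates why signed modules contain only positively correlated genes.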
6. also provides the file "MEResults", which reports the values of the module eigengenes (columns) for the different arrays (rows). The top two rows of the file MEResults report the eigengene significance of each module.

(2) Analysis using gene significance data

The output consists of two files:
1. "LazyGenelist.csv", which contains results for each gene (rows).
2. "MEResults.csv", which contains results for each array sample.

LazyGenelist.csv contains the following columns.

Module membership information (see the columns MMblue, etc.): for each gene and each automatically detected module, WGCNA outputs a module membership (MM) value. For example, if a gene has an MMblue value close to 1 or -1, the gene is assigned to the blue module.

Module colors are assigned according to module size: turquoise denotes the largest module, blue the next, then brown, green, yellow, etc. The color grey is reserved for non-module genes.

GS.Weighted is the weighted network estimate of the corresponding pre-defined GS measure. GS.Weighted takes account of module membership information and the hub gene significance (HGS) measure of each module.

The file MEResults reports the module eigengenes (columns) across the arrays (rows). The top two rows of the file MEResults report the hub gene significance of each module.

Save image

[Screenshot: the WGCNA main window with the "5. Gene Selection" tab selected.]
7. correlation between log(p(k)) and log(k), i.e. the model fitting index $R^2$ of the linear model that regresses log(p(k)) on log(k). If $R^2$ of the model approaches 1, then there is a straight-line relationship between log(p(k)) and log(k). Many co-expression networks satisfy the scale-free property only approximately.

Most biologists would be very suspicious of a gene co-expression network that does not satisfy scale-free topology at least approximately. Therefore, a soft threshold power $\beta$ (in $a(i,j) = |\mathrm{cor}(x(i), x(j))|^{\beta}$) that gives rise to a network that does not satisfy approximate scale-free topology should not be considered. There is a natural trade-off between maximizing the scale-free topology model fit (scale-free fitting parameter $R^2$) and maintaining a high mean number of connections. High values of $\beta$ often lead to high values of $R^2$, but the higher the power $\beta$, the lower the mean connectivity of the network.

These considerations have motivated us to propose the following scale-free topology criterion for choosing the power $\beta$: only consider those powers that lead to a network satisfying scale-free topology at least approximately, e.g. $R^2 > 0.80$. In addition, we recommend that the user take the following considerations into account when choosing the adjacency function parameter. First, the mean connectivity should be high so that the network contains enough information (e.g. for module detection). Second
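The scale-free fitting index discussed above can be illustrated with a short Python sketch that regresses log p(k) on log k and reports the slope and $R^2$. The binning scheme (ten equal-width bins, base-10 logarithms, empty bins dropped) and the function name `scale_free_fit` are assumptions made for this example; the manual does not specify how WGCNA discretizes the connectivity, so the numbers need not match the program's output.

```python
import math


def scale_free_fit(k, n_bins=10):
    """Return (slope, r_squared) of the regression log10 p(k) ~ log10 k.

    k must contain positive connectivities that are not all equal.
    p(k) is estimated as the fraction of genes in each equal-width bin,
    and k is represented by the mean connectivity within the bin.
    """
    lo, hi = min(k), max(k)
    if hi <= lo:
        raise ValueError("connectivities must not all be equal")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    sums = [0.0] * n_bins
    for value in k:
        b = min(int((value - lo) / width), n_bins - 1)
        counts[b] += 1
        sums[b] += value
    # Keep only non-empty bins, since log10(0) is undefined.
    xs = [math.log10(s / c) for c, s in zip(counts, sums) if c > 0]
    ys = [math.log10(c / len(k)) for c in counts if c > 0]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if syy == 0:
        raise ValueError("p(k) is constant; R^2 is undefined")
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Toy connectivities (made up) in which p(k) is proportional to 1/k.
k = [1.0] * 8 + [2.0] * 4 + [4.0] * 2 + [8.0]
slope, r2 = scale_free_fit(k)
print(round(slope, 3), round(r2, 3))
```

In practice one would compute the connectivities for each candidate power and keep the powers whose fit satisfies the criterion stated above (negative slope and a sufficiently high $R^2$).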
8. gene ontology information to assess their biological plausibility, it is not required. Because the modules may correspond to biological pathways, focusing the analysis on intramodular hub genes (or the module eigengenes) amounts to a biologically motivated data reduction scheme. Because the expression profiles of intramodular hub genes are highly correlated, typically dozens of candidate biomarkers result. Although these candidates are statistically equivalent, they may differ in terms of biological plausibility or clinical utility. Gene ontology information can be useful for further prioritizing intramodular hub genes. Examples of biological studies that show the importance of intramodular hub genes are reported in Horvath et al. 2006, Carlson et al. 2006, Gargalovic et al. 2006, Ghazalpour et al. 2006, and Miller et al. 2008.

Analysis overview

Construct a network. Rationale: make use of interaction patterns between genes.

Identify modules. Rationale: module (pathway) based analysis.

Relate modules to external information. Array information: clinical data, SNPs, proteomics. Gene information: gene ontology, EASE, IPA. Rationale: find biologically interesting modules.

Find the key drivers in interesting modules. Tools: intramodular connectivity (highly related to module membership) and gene significance. Rationale: experimental validation, therapeutics, biomarkers.
9. gene and each module, WGCNA outputs a module membership (MM) value. MM values close to 1 or -1 suggest that the gene is a member of the respective module. By using "I'm feeling lazy", WGCNA will include all the available genes in its analysis, regardless of their variance or connectivity. For the output of the automatic analysis, please refer to the "5. Gene Selection" section for details.

(2) The manual WGCNA analysis allows the user to proceed in a step-wise fashion to define modules and significant genes. In general, we recommend the manual analysis, but it has a limitation: only 3600 genes can be included for the module definition.

3. Network Construction

[Screenshot: the "3. Network Construction" tab, with a "Signed Network" option, a "Power Selection" control (power 6; power < 30) and a "Use Adjacency instead of TOM for Clustering" option.]

Help on Step 3: Network Construction

• Specify whether you want to use a signed or an unsigned gene co-expression similarity measure. A signed network defines modules as clusters of positively correlated genes. An unsigned network defines modules based on the absolute value of the correlation coefficient.
• Specify a power (soft threshold) larger than or equal to 1. Raising the co-expression similarity to this power results in a weighted co-expression network. The default power is 6, but you can also use the scale-free topology criterion to pick a power (Zhang and Horvath 2005).
• To use the scal
10. module membership measure. K measures how centrally located the gene is inside the module.

• To select genes, you can set the threshold parameters and manually output the genes that meet the selection criterion. Or you can choose the automatic gene selection procedure, which selects genes based on their gene significance and their membership in significant modules.

[Screenshot: the "5. Gene Selection" tab. Controls shown: Manual Gene Selection (Choose the Module; Plot Gene Significance vs. Intramodular K; Choose the Genes, with thresholds on GS and K/Kmax; Select Genes; Plot Module Heatmap; Output Gene List) and Automatic Gene Selection (Auto Gene Selection; Help). The log reports "mergeCloseModules: Merging modules whose distance is less than 0.1" for the grey, brown, yellow, turquoise and blue modules.]

Based on the module significance analysis from the previous step, please choose a module of interest first.

[Screenshot: "brown module heatmap and eigengene" plot shown in the "5. Gene Selection" tab.]
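The manual selection rule described above (a threshold on gene significance combined with a threshold on K/Kmax within the chosen module) could be sketched as follows. This is only an illustration under stated assumptions: the record layout, the function name `select_genes`, the use of the absolute GS value, the definition of Kmax as the largest K within the chosen module, and the toy values are all hypothetical and are not taken from the WGCNA program.

```python
def select_genes(genes, module, gs_min=0.2, k_ratio_min=0.5):
    """Return ids of genes in `module` with |GS| > gs_min and K/Kmax > k_ratio_min.

    genes: list of dicts with keys "geneid", "K", "GS" and "module".
    Kmax is taken as the largest K among the genes of the chosen module
    (an assumption made for this sketch).
    """
    in_module = [g for g in genes if g["module"] == module]
    if not in_module:
        return []
    k_max = max(g["K"] for g in in_module)
    return [
        g["geneid"]
        for g in in_module
        if abs(g["GS"]) > gs_min and g["K"] / k_max > k_ratio_min
    ]


# Toy records (made up for illustration).
genes = [
    {"geneid": "A", "K": 10.0, "GS": 0.3, "module": "blue"},
    {"geneid": "B", "K": 4.0, "GS": 0.9, "module": "blue"},    # K/Kmax too low
    {"geneid": "C", "K": 8.0, "GS": 0.1, "module": "blue"},    # GS too low
    {"geneid": "D", "K": 9.0, "GS": -0.4, "module": "blue"},
    {"geneid": "E", "K": 20.0, "GS": 0.9, "module": "brown"},  # other module
]
selected = select_genes(genes, "blue")
print(selected)
```

Raising either threshold narrows the list towards highly significant intramodular hub genes; if no gene passes, the function returns an empty list.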
11. same gene sets as in the expression data.

In order to proceed, the user needs to load a trait file or a gene info file, or both. In the next step, the user can choose a column of the trait file as the sample trait, or a column of the gene info file as the pre-defined gene significance measure.

[Table: Example trait data. The header row holds a name column followed by outcome columns (Outcome...); each subsequent row holds a sample name (Person...) followed by its outcome values. The numeric values are not legible in this copy.]

The "Gene Information Data" file can contain additional gene information, such as gene names, chromosome location, etc., that the user wants to keep in the analysis.

[Table: Example gene information data. Columns: GeneID, GeneName, GeneSignificanceValue, PathwayRelated; one row per gene (G1, G2, ...), with pathway annotations such as "mRNA Processing" and "Immune Response". The remaining values are not legible in this copy.]

Please follow the data format of the sample files. WGCNA can take either a comma- or tab-delimited text file as input.

After loading the expression data and trait data (or gene information data), the user should click the "Next >" button to move to the "Preprocess" step.

2. Data preprocessing and I'm Feeling Lazy analysis

Help on Step 2: Pre-processing

• Specify either a microarray sample trait or a pre-defined gene significance measure so that WGCNA knows which gene significance (GS) measure should be used.
• Next decide on whether you
12. that the default parameters of the automatic (lazy) analysis work quite well in many real applications, there is a danger that a module may be an artefact, e.g. due to an array outlier.

The manual analysis allows the user to interact with the program regarding module detection and gene selection, but it can only deal with relatively few genes (on our laptop computer, fewer than 3600 genes). Since module detection is computationally intensive, the user can filter genes based on the variance (across the microarrays), and the user can select the most connected genes among the most varying genes. Since module genes tend to be highly connected, restricting the analysis to the most connected genes is a reasonable gene filtering criterion (when it comes to module detection). At the end of the analysis, WGCNA outputs module membership measures, which are defined for all genes in the input data (not just the 3600 most connected genes).

[Screenshot: the "2. PreProcess" tab, showing the "Help on Step 2: Pre-processing" text next to the "Sample Trait (Outcome) Selection" and "Gene Significance Selection" controls.]
13. want to proceed with the automatic or the manual WGCNA analysis.

• The automatic analysis ("ImFeelingLazy") uses default parameters for finding modules and significant hub genes. For each gene and each module, WGCNA outputs a module membership (MM) value. MM values close to 1 or -1 suggest that the gene is a member of the respective module.
• The manual WGCNA analysis allows the user to proceed in a step-wise fashion to define modules and significant genes. In general, we recommend the manual analysis, but it has a limitation: only 3600 genes can be included for the module definition. Therefore, we implement two approaches for restricting genes: (i) based on variance across samples; (ii) based on whole-network connectivity.
• Since module genes tend to be highly connected, little is lost by restricting the analysis to highly connected genes.

[Screenshot: the "2. PreProcess" tab. Controls shown: Sample Trait (Outcome) Selection (choose a column as the trait to define the GS measure); Gene Significance Selection (choose a column as the pre-defined GS measure); Automatic WGCNA ("I'm feeling lazy": automatically detects modules and significant hub genes based on all genes); Manual WGCNA with an Expression Data Filter (keep the 8000 most varying genes; of those, keep the 3600 (<3600) most connected genes); Next >. The log reports: "Expression Data Loaded: Found 50 samples and 3000 genes. Trait Data Loaded: Found 1 available traits for analysis. Geneinfo Data Loaded."]

First, the user needs to specif
14. with a gene dissimilarity measure to define a dendrogram (cluster tree) of the network. Once a dendrogram is obtained from a hierarchical clustering method, we choose a height cutoff to arrive at a clustering. Modules correspond to branches of the dendrogram.

WGCNA implements two network dissimilarity measures. The default choice is the topological overlap matrix based dissimilarity measure (Ravasz et al. 2002, Zhang and Horvath 2005, Li and Horvath 2006, Yip and Horvath 2007). The use of topological overlap serves as a filter to exclude spurious or isolated connections during network construction.

The topological overlap dissimilarity is used as input of hierarchical clustering:

$$\mathrm{TOM}_{ij} = \frac{\sum_{u \neq i,j} a_{iu} a_{uj} + a_{ij}}{\min(k_i, k_j) + 1 - a_{ij}}, \qquad \mathrm{DistTOM}_{ij} = 1 - \mathrm{TOM}_{ij}$$

• Generalized in Zhang and Horvath (2005) to the case of weighted networks
• Generalized in Yip and Horvath (2006) to higher order interactions

As an alternative dissimilarity measure, we also define $\mathrm{dissA}(i,j) = 1 - a(i,j)$ (i.e. 1 minus the adjacency matrix). This alternative measure is computationally much faster than the topological overlap measure and often leads to approximately similar modules.

WGCNA defines modules by cutting (pruning) branches off the dendrogram. A common but inflexible method uses a static (constant) height cutoff value; this method exhibits suboptimal performance on complicated dendrograms. Therefore, WGCNA also implements dynamic branch cutting methods for detecting clusters in a dendrogram depending
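As a hedged illustration of the topological overlap dissimilarity described above, the following Python sketch evaluates the standard weighted-network formula from Zhang and Horvath (2005) on a small adjacency matrix. The function name `tom_dissimilarity` and the toy matrix are assumptions for this example, and the sketch is not the code used inside WGCNA.

```python
def tom_dissimilarity(a):
    """Return the matrix DistTOM = 1 - TOM for a symmetric adjacency matrix.

    TOM_ij = (sum_u a_iu * a_uj + a_ij) / (min(k_i, k_j) + 1 - a_ij),
    where u runs over all genes other than i and j, and k_i is the
    connectivity of gene i (sum of its adjacencies to other genes).
    The diagonal of the result is set to 0.
    """
    n = len(a)
    k = [sum(a[i][u] for u in range(n) if u != i) for i in range(n)]
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            shared = sum(a[i][u] * a[u][j]
                         for u in range(n) if u != i and u != j)
            tom = (shared + a[i][j]) / (min(k[i], k[j]) + 1.0 - a[i][j])
            dist[i][j] = 1.0 - tom
    return dist


# Toy weighted network (made up): three genes, all pairwise adjacencies 0.5.
a = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
dist_tom = tom_dissimilarity(a)
print(dist_tom[0][1])
```

For comparison, the faster alternative mentioned above would simply be `1 - a[i][j]` for each pair, which ignores the shared-neighbor term in the numerator.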
15. [Screenshot: the "4. Module Detection" tab with a pairwise plot of module eigengenes in the display panel. Controls shown: List Modules; Static Height Cutoff; Dynamic Height Branch Cutting; Deep Split Level (choose 0, 1, 2 or 3); Dynamic Hybrid Branch Cutting (Max PAM Distance, negative to disable); Module Merge; Min Module Size; Plot Module Significance; Next >. The log reports "mergeCloseModules: Merging modules whose distance is less than 0.1" for the grey, brown, yellow, turquoise and blue modules.]

The "Plot ME Pairs" function plots the relationship between module eigengenes. It is useful for studying module similarities and helps with merging similar modules. Message: the module eigengenes (first PC) of different modules may be highly correlated. WGCNA can be interpreted as a biologically motivated data reduction scheme that allows for dependency between the resulting components. Compare this to principal component analysis, which would impose orthogonality between th
16. [Screenshot: the "5. Gene Selection" tab with the blue module selected and a scatter plot of gene significance versus intramodular K in the display panel. The log shows "No gene meets the criteria", followed by a gene list with the columns geneid, K, GS, truemodule, SignalGeneIndicator, GeneInfo and Cor.Standard; the listed genes (e.g. Gene638, Gene639) belong to the blue module.]

To select genes, the user can set the threshold parameters and manually output the genes that meet the selection criterion. (The user can also save the list to a file.) Alternatively, the user can choose the automatic gene selection procedure, which selects genes based on their gene significance and their membership in significant modules.

In the gene selection step, the user can generate a heat map for each specific module. The sample order is exactly the same as in the expression data.

The "Auto Gene Selection" function allows the user to obtain a gene list of the most significant genes, ranked by putting both gene significance and intr
17. Genetics and Molecular Biology: Vol. 4: No. 1, Article 17.
18. [Screenshot: the "5. Gene Selection" tab with the "Help" menu open, showing the "About WGCNA" entry.]

If the user needs to read the help for the current step, the user can display it by clicking the "Show Help" menu item under "Help".

Network function library update

WGCNA utilizes a set of network functions, which is included in the WGCNA package and called "NetworkFunctions WGCNA library". Since we keep updating the library, please download the latest version from our website. The user can check the versions of both the WGCNA program and its network library using the "About WGCNA" menu item under "Help".

In order to update the network library, the user can simply download the latest version and save it to the "C:\WGCNA" folder.
19. Recovery from unexpected errors

If WGCNA crashes due to any unexpected error, please use "Windows Task Manager" to kill the R server process, whose name is displayed as "STATCO~1.EXE". Otherwise, the user may encounter an error like "Unable to create metafile" when re-running WGCNA.

Information

References

• Albert R, Jeong H, Barabasi AL (2000) Nature 406:378-382.
• Carlson MRJ, Zhang B, Fang Z, Mischel P, Horvath S, Nelson SF (2006) Gene Connectivity, Function, and Sequence Conservation: Predictions from Modular Yeast Co-Expression Networks. BMC Genomics 2006, 7:40 (3).
• Fuller TF, Ghazalpour A, Aten JE, Drake TA, Lusis AJ, Horvath S (2007) Weighted Gene Co-expression Network Analysis Strategies Applied to Mouse Weight. Mamm Genome 18(6):463-472.
• Dong J, Horvath S (2007) Understanding Network Concepts in Modules. BMC Systems Biology 2007, June 1:24.
• Gargalovic PS, Imura M, Zhang B, Gharavi NM, Clark MJ, Pagnon J, Yang W, He A, Truong A, Patel S, Nelson SF, Horvath S, Berliner J, Kirchgessner T, Lusis AJ (2006) Identification of Inflammatory Gene Modules based on Variations of Human Endothelial Cell Responses to Oxidized Lipids. PNAS 22;103(34):12741-6.
• Ghazalpour A, Doss S, Zhang B, Wang S, Plaisier C, Castellanos R, Brozell A, Schadt EE, Drake TA, Lusis AJ, Horvath S (2006) Integrating Genetic and Network Analysis to Characterize Genes Related to Mouse Weight. PLoS Genetics. Volum
20. Background information

WGCNA begins with the understanding that the information captured by microarray experiments is far richer than a list of differentially expressed genes. Rather, microarray data are more completely represented by considering the relationships between measured transcripts, which can be assessed by pairwise correlations between gene expression profiles. In most microarray data analyses, however, these relationships go essentially unexplored. WGCNA starts from the level of thousands of genes, identifies clinically interesting gene modules, and finally uses intramodular connectivity and gene significance (e.g. based on the correlation of a gene expression profile with a sample trait) to identify key genes in the disease pathways for further validation.

WGCNA alleviates the multiple testing problem inherent in microarray data analysis. Instead of relating thousands of genes to a microarray sample trait, it focuses on the relationship between a few (typically fewer than 10) modules and the sample trait. Toward this end, it calculates the eigengene significance (correlation between sample trait and eigengene) and the corresponding p-value for each module. The module definition does not make use of a priori defined gene sets. Instead, modules are constructed from the expression data by using hierarchical clustering. Although it is advisable to relate the resulting modules to
21. WGCNA User Manual (for version 1.0.x)

A systems biologic microarray analysis software for finding important genes and pathways

The WGCNA (weighted gene co-expression network analysis) software implements a systems biologic method for analyzing microarray gene expression data, gene information data, and microarray sample traits (e.g. case-control status or clinical outcomes). WGCNA can be used for constructing a weighted gene co-expression network, for finding co-expression modules, for calculating module membership measures, and for finding highly connected intramodular hub genes. WGCNA facilitates a network-based gene screening method that can be used to identify candidate biomarkers or therapeutic targets. The gene screening method integrates gene significance information (e.g. correlation between gene expression and a clinical outcome) and module membership information to identify biologically and statistically plausible genes. The software has a graphical interface that facilitates straightforward input of microarray and clinical trait data or pre-defined gene information. The software can analyze networks comprising tens of thousands of genes and implements several options for automatic and manual gene selection ("network screening").

To cite the software, please use Zhang and Horvath (2005), Horvath et al. (2006), and Langfelder et al. (2007).

Table of Contents

Background information
22. a modular connectivity into consideration.

Gene selection output files

Both the "I am feeling lazy" and the "auto gene selection" functions generate two files as output. The files have the same format in both cases.

(1) Analysis using trait data

The output consists of two files:
1. "LazyGenelist.csv", which contains results for each gene (rows).
2. "MEResults.csv", which contains results for each array sample.

LazyGenelist.csv contains the following columns.

Module membership information (see the columns MMblue, etc.): for each gene and each automatically detected module, WGCNA outputs a module membership (MM) value. For example, if a gene has an MMblue value close to 1 or -1, the gene is assigned to the blue module.

Module colors are assigned according to module size: turquoise denotes the largest module, blue the next, then brown, green, yellow, etc. The color grey is reserved for non-module genes.

Cor.Weighted denotes the weighted network estimate of the standard correlation cor(x(i), y) between the i-th gene and the outcome y. It makes use of module membership information as well as the module eigengene significance.

Analogously, p.Weighted is a "weighted" version of a p-value, and q.Weighted is a weighted version of a q-value (local false discovery rate; see the qvalue library in R). Z.Weighted is a weighted version of the Fisher Z transform of a correlation coefficient.

In order to learn which module eigengenes affect the calculation of these measures, WGCNA
23. ation test p value for module membership (denoted by PvalueMMblue). The module membership measure can be defined for all input genes (irrespective of their original module membership). It turns out that the module membership measure is highly related to the intramodular connectivity kIN. Highly connected intramodular hub genes tend to have high module membership values to the respective module.

Hub gene: This loosely defined term is used as an abbreviation of "highly connected gene". By definition, genes inside coexpression modules tend to have high connectivity.

Gene significance (GS): To incorporate external information into the co-expression network, we make use of gene significance measures. Abstractly speaking, the higher the absolute value of GS(i), the more biologically significant is the i-th gene. Examples: GS(i) could encode pathway membership (e.g. 1 if the gene is a known apoptosis gene and 0 otherwise), knockout essentiality, or the correlation with an external microarray sample trait. A gene significance measure could also be defined as minus the logarithm of a p-value. The only requirement is that a gene significance of 0 indicates that the gene is not significant with regard to the biological question of interest. The GS can take on either positive or negative values. When the user specifies a microarray sample trait y (e.g. case-control status or a quantitative outcome), WGCNA defines the gene significanc
24. WGCNA implements three different approaches for defining modules based on a hierarchical clustering tree of the genes. All three methods require a minimum module size setting. (1) The static branch cutting method simply defines modules as branches that lie below a fixed height cutoff on the y-axis of the cluster tree. (2) The dynamic height branch cutting method automatically chooses a height cutoff for each branch based on the shape of the branch (Langfelder et al., Bioinformatics 2007). (3) The dynamic hybrid method is a hybrid between the dynamic method and partitioning around medoids (PAM) clustering. The button "Plot Module Significance" plots the average gene significance for each module. The user can go back to step 2 to specify a different gene significance (GS) measure in the gene info file or pick a different clinical trait (column) in the sample trait file. In this way, the user can study module significance based on different significance measures without re-calculating the TOM (the "Re-Calculate Adjacency/TOM" box must be kept unchecked). The module merge functions allow the user to manually or automatically merge closely related modules. The user can merge two close modules manually using the "Merge 2 modules" function. In addition, "Auto Module Merge" offers an automatic module merging process
25. [Screenshot: Gene Selection panel with manual gene selection (module choice, GS and K/Kmax thresholds, "Select Genes", "Output Gene List") and automatic gene selection.] The user can use a heatmap button to plot the standardized expression values of the module genes (rows) across the arrays (columns). The top row shows the heatmap of the brown module genes (rows) across the microarrays (columns). The lower row shows the corresponding module eigengene expression values (y-axis) versus the same microarray samples. Note that the module eigengene takes on low values in arrays where a lot of module genes are under-expressed (green color in the heatmap). The ME takes on high values for arrays where a lot of module genes are over-expressed (red in the heatmap). The ME can be considered the most representative gene expression profile of the module. These plots may allow the user to understand the meaning of the module. High values of the module eigengene suggest that the m
26. e 2, Issue 8, August
• Horvath S, Zhang B, Carlson M, Lu KV, Zhu S, Felciano RM, Laurance MF, Zhao W, Shu Q, Lee Y, Scheck AC, Liau LM, Wu H, Geschwind DH, Febbo PG, Kornblum HI, Cloughesy TF, Nelson SF, Mischel PS (2006) Analysis of Oncogenic Signaling Networks in Glioblastoma Identifies ASPM as a Novel Molecular Target. PNAS, November 14, 2006, vol. 103, no. 46, 17402-17407
• Langfelder P, Zhang B, Horvath S (2007) Defining clusters from a hierarchical cluster tree: the Dynamic Tree Cut library for R. Bioinformatics, November, btm563
• Langfelder P, Horvath S (2007) Eigengene networks for studying the relationships between co-expression modules. BMC Systems Biology. BMC Syst Biol. 2007 Nov 21;1(1):54
• Li A, Horvath S (2006) Network Neighborhood Analysis with the multi-node topological overlap measure. Bioinformatics. doi:10.1093/bioinformatics/btl581
• Miller JA, Oldham MC, Geschwind DH (2008) A Systems Level Analysis of Transcriptional Changes in Alzheimer's Disease and Normal Aging. J. Neurosci. 28, 1410-1420
• Oldham M, Horvath S, Geschwind D (2006) Conservation and Evolution of Gene Co-expression Networks in Human and Chimpanzee Brains. PNAS. 2006 Nov 21;103(47):17973-8
• Yip A, Horvath S (2007) Gene network interconnectedness and the generalized topological overlap measure. BMC Bioinformatics 8:22
• Zhang B, Horvath S (2005) "A General Framework for Weighted Gene Co-Expression Network Analysis", Statistical Applications in
27. [Screenshot: Data preprocessing panel offering the automatic ("I'm feeling lazy") or the manual WGCNA analysis, with expression data filters for the most varying and the most connected genes.] The automatic analysis ("I'm Feeling Lazy") uses default parameters for finding modules and significant hub genes. For each gene and each module, WGCNA outputs a module membership (MM) value. MM values close to 1 or -1 suggest that the gene is a member of the respective module. The manual WGCNA analysis allows the user to proceed in a step-wise fashion to define modules and significant genes. In general, we recommend the manual analysis, but it has a limitation: only 3600 genes can be included for the module definition. Therefore, we implement two approaches for restricting genes: (i) based on variance across samples; (ii) based on whole network connectivity. Since module genes tend to be highly connected, little is lost by restricting the analysis to highly connected genes. Figure: To proceed with the automatic or the manual WGCNA analysis. (1) The automatic analysis ("I'm feeling lazy") uses default parameters for finding modules and significant hub genes. For each
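As an illustration of restriction approach (i), the following minimal sketch keeps the genes with the largest variance across samples. The dict-based data layout and the function names are assumptions made for this example; the GUI's own implementation is not shown in this manual.

```python
def sample_variance(values):
    """Unbiased sample variance of one expression profile."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def keep_most_varying(expr, n_keep):
    """Return the n_keep genes with the highest variance across samples.

    expr: dict gene -> list of expression values (hypothetical layout).
    The result preserves the ranking, most variable gene first.
    """
    ranked = sorted(expr, key=lambda g: sample_variance(expr[g]), reverse=True)
    return {g: expr[g] for g in ranked[:n_keep]}


expr = {"flat": [1, 1, 1, 1], "mild": [1, 2, 1, 2], "wild": [0, 10, 0, 10]}
print(list(keep_most_varying(expr, 2)))
```

The same pattern would apply to approach (ii) by ranking on whole-network connectivity instead of variance.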
28. e components. Since modules may represent biological pathways, there is no biological reason why modules should be orthogonal to each other. With modules detected, the user can click the "Next >" button to move to the gene selection step. 5. Gene Selection. Based on your module significance analysis from the previous step, please choose a module of interest first. You can use a heatmap button to plot the standardized expression values of the module genes (rows) across the arrays (columns). Red means over-expression, green means under-expression. Underneath the heatmap plot, we also plot the expression values (y-axis) of the module eigengene across the samples (x-axis). The samples (columns) of the heatmap plot line up with those of the eigengene plot. These plots may allow you to understand the meaning of the module. High values of the module eigengene suggest that the module is "up" in the corresponding sample. If the correlation structure of the module is due to a single array, the array may be an outlier. Gene selection: in principle, all genes of a significant module are interesting. However, it can be useful to study the relationship between gene significance and intramodular connectivity K using the button "Plot Gene significance versus intramodular connectivity". The scaled version of the intramodular connectivity, K/max(K), turns out to be highly related to the corresponding
29. e could select the 200 genes with the highest absolute module membership values. The selected genes could be used as input for functional enrichment analysis software (EASE, KEGG, Webgestalt, Ingenuity, etc.). For example, we often use the software EASE (David): http://david.abcc.ncifcrf.gov/summary.jsp. Relating modules to each other and to a microarray sample trait: WGCNA also outputs the module eigengenes in a separate file. By correlating the module eigengenes, one can determine how related (co-expressed) the modules are to each other. Module eigengenes form the nodes of an eigengene network (Langfelder and Horvath 2007), which may reveal that modules are organized into meta-modules (clusters of co-expressed modules). The module eigengenes can also be used as covariates of a multivariate regression model that regresses the microarray sample trait y on the eigengenes. Installation requirements: 1. Windows operating system (Win2000, NT, WinXP or Vista) with the .NET Framework installed. The .NET Framework can be freely downloaded from Microsoft Windows Update. 2. All necessary software is listed in the file "WGCNA_Installation_Guide.doc", which is included in the WGCNA package. Please follow the installation steps. 3. There is no hardware requirement for running WGCNA. However, considering the computational cost of network construction, we recommend computers with a CPU frequency higher than 2.0 GHz and memory bigger t
30. e free topology criterion, push the "power selection" button. This will result in a graph of scale-free topology fit (R^2, y-axis) versus different powers (x-axis). Choose the smallest power for which R^2 > 0.8. The button "ClusterSamples" can be used to assess whether arrays are outliers (average linkage hierarchical cluster tree, Euclidean distance). If you find very distinct branches or outlying arrays, the scale-free topology criterion may be meaningless (low R^2 values). In this case, simply choose a power (e.g. 6) or consider removing outlying arrays before proceeding. Once you have chosen a power, the "next" button will automatically create a weighted co-expression network and a corresponding cluster tree, which will be used for module detection. The default network dissimilarity measure is based on the topological overlap matrix (TOM). A less time-consuming alternative is to use the check box "Use Adjacency instead of TOM". First, the user needs to specify whether a signed or an unsigned gene co-expression network should be constructed. A signed network defines modules as clusters of positively correlated genes. An unsigned network defines modules based on the absolute va
31. e measure as follows: GeneSignificance(i) = |cor(x(i), y)|. Module significance: Module significance is determined as the average absolute gene significance measure for all genes in a given module. This measure is highly related to the correlation between the module eigengene and the outcome y. Construction of weighted gene co-expression networks and modules: Genes with expression levels that are highly correlated are biologically interesting, since they imply common regulatory mechanisms or participation in similar biological processes. To construct a network from microarray gene expression data, we begin by calculating the Pearson correlations for all pairs of genes in the network. Because microarray data can be noisy and the number of samples is often small, we weight the Pearson correlations by taking their absolute value and raising them to the power β. This step effectively serves to emphasize strong correlations and punish weak correlations on an exponential scale. These weighted correlations, in turn, represent the connection strengths between genes in the network. By adding up these connection strengths for each gene, we produce a single number (called connectivity, or k) that describes how strongly that gene is connected to all other genes in the network. We use the general framework of weighted gene co-expression network analysis presented in (Zhang and Horvath 2005, Horvath et al 2006). Briefly, the absolute value of the Pear
32. ess time-consuming alternative is to use the check box "Use Adjacency instead of TOM". Because calculation of the TOM or adjacency matrix is very time-consuming, WGCNA allows the user to skip this step by un-checking "Re-Calculate Adjacency/TOM". This is particularly useful when the user switches back to Step 2 to pick another trait or gene significance column for analysis without re-calculating the same gene network. After choosing a power, the user can click the "Next >" button to move to the module detection step. 4. Module Detection. [Screenshot: Module Detection panel showing the gene cluster tree (dynamic tree cutting) colored by modules, with controls for minimum module size, static height, dynamic height and dynamic hybrid branch cutting, module merging, "Plot Module Significance", "Plot Cluster Tree" and "Plot ME Pairs".]
33. han 2 GB. Detailed description of the analysis steps. 1. Load data. [Screenshot: Load Data panel with file fields for Expression Data, Sample Trait Data (optional with Gene Info) and Gene Info Data (optional with Trait).] Help on Step 1 (Load Data): Load a comma- or tab-delimited file where rows are genes (probe sets) and columns correspond to microarray samples. The first column should contain gene identifiers. In order to screen for genes or modules that are biologically significant, you need to define a gene significance measure. Toward this end, WGCNA implements two options. Option 1: based on a correlation with a microarray sample trait T that corresponds to a column in the trait file. Then GeneSignificance(i) = |cor(Gene(i), T)|. The trait could be a binary outcome (case/control status) or a quantitative outcome (body weight). Option 2: based on a pre-defined gene significance measure that corresponds to a column in the gene information file. It must contain the same gene sets as in the expression data. In order to proceed, you need to load a trait file or a gene info file, or both. In the next step, you can choose a column of the trait file as sample information trait or a column of the gene info file as pre-defined gene significance measure
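Option 1 can be illustrated with a short sketch that parses a comma-delimited expression table (genes in rows, samples in columns, gene identifiers in the first column) and computes GeneSignificance(i) = |cor(Gene(i), T)|. The inline data, the trait vector and the function names are made up for this example and are not part of the WGCNA software.

```python
import csv
import io
import math


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is 0).
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def read_expression(text, delimiter=","):
    """Parse a delimited table: header row, then one gene per row."""
    rows = csv.reader(io.StringIO(text), delimiter=delimiter)
    samples = next(rows)[1:]  # first column holds the gene identifiers
    expr = {row[0]: [float(v) for v in row[1:]] for row in rows if row}
    return samples, expr


def gene_significance(expr, trait):
    """GS(i) = |cor(Gene(i), T)| for every gene in expr."""
    return {gene: abs(pearson(profile, trait)) for gene, profile in expr.items()}


# Hypothetical toy data: four samples, binary case/control trait.
EXAMPLE = "GeneName,S1,S2,S3,S4\nG1,1,1,2,2\nG2,3,3,1,1\nG3,1,2,1,2\n"
samples, expr = read_expression(EXAMPLE)
print(gene_significance(expr, trait=[0, 0, 1, 1]))
```

For a tab-delimited file, the same reader can be called with delimiter="\t".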
34. hip. Thus, a systems biologic gene screening method that combines gene significance and a connectivity (module membership) measure amounts to a pathway-based gene screening method. Empirical evidence shows that the resulting systems biologic gene screening methods can lead to important biological insights (Horvath et al 2006, Carlson et al 2006, Gargalovic et al 2006, Ghazalpour et al 2006). Fuzzy module annotation of the genes: Apart from detecting co-expression modules, WGCNA also provides a comprehensive annotation of all genes on the array with regard to module membership. For each gene, the module membership table reports the module membership with regard to the identified modules. Instead of forcing genes into distinct modules, the fuzzy module assignment allows the user to identify genes that may be close to two or more modules. These fuzzy module annotation tables form a resource for biomarker discovery. The annotation tables can also be used to determine how close a given gene of interest is to the identified modules. We report both the module membership measure (the correlation between the gene expression profile and the module eigengene) and the corresponding correlation test p-value. Functional enrichment analysis of module genes: It is natural to use the module membership measure to come up with lists of genes that comprise the module. For example, one could select blue module genes on the basis of MMBlue > 0.6 or MMBlue < -0.6. Alternatively, on
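The threshold rule above (MMBlue > 0.6 or MMBlue < -0.6) can be sketched as a simple filter on a table of module membership values; a top-N variant by absolute module membership is included as well. The dict layout, function names and MM values are invented for illustration.

```python
def select_by_threshold(mm, cutoff=0.6):
    """Genes whose module membership exceeds the cutoff in absolute value."""
    return sorted(g for g, value in mm.items() if abs(value) > cutoff)


def select_top_n(mm, n):
    """The n genes with the highest absolute module membership."""
    return sorted(mm, key=lambda g: abs(mm[g]), reverse=True)[:n]


# Hypothetical MMblue values for four genes.
mm_blue = {"a": 0.9, "b": -0.7, "c": 0.3, "d": 0.6}
print(select_by_threshold(mm_blue))
print(select_top_n(mm_blue, 3))
```

Because the comparison is strict, a gene with a module membership of exactly 0.6 is not selected by the threshold rule in this sketch.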
35. inition of the network adjacency matrix, we make use of the fact that gene expression networks, like virtually all types of biological networks, have been found to exhibit an approximate scale-free topology (Albert et al 2000). To choose a particular power β, we used the scale-free topology criterion described in (Zhang and Horvath 2005). [Plot: log10(p(k)) versus log10(k); panel title: beta = 6, scale-free R^2 = 0.88, slope = -1.61, truncated R^2 = 0.98.] Figure: Assessing the scale-free topology of a weighted gene co-expression network (constructed using β = 6). If the dots form an approximate straight-line relationship, then the network forms a scale-free network. The black curve corresponds to the regression line with model fitting index R^2. The red curve describes a truncated exponential fit (see Zhang and Horvath 05 for more details). Scale-free topology criterion (this technical section may be skipped at first reading): The network exhibits a scale-free topology if the frequency distribution p(k) of the connectivity follows a power law: p(k) ~ k^(-γ). (Incidentally, the power gamma has nothing to do with the soft threshold beta that is used to define the co-expression network.) To visually inspect whether approximate scale-free topology is satisfied, one plots log(p(k)) versus log(k). A straight line is indicative of scale-free topology. To measure how well a network satisfies a scale-free topology, we use the square of the
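A rough sketch of such a fit index follows. It assumes that the index is the squared Pearson correlation (R^2) of a straight-line fit of log10(p(k)) against log10(k), with p(k) estimated from equal-width connectivity bins; the binning scheme, the number of bins and the function name are choices made for this example rather than details confirmed by the text.

```python
import math


def scale_free_fit(k_values, n_bins=10):
    """Return (slope, R^2) of log10 p(k) regressed on log10 k.

    k_values: positive connectivities, one per gene.
    p(k) is estimated as the fraction of genes falling into each of
    n_bins equal-width bins; empty bins are skipped.
    """
    lo, hi = min(k_values), max(k_values)
    if hi == lo:
        raise ValueError("all connectivities are identical")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    sums = [0.0] * n_bins
    for k in k_values:
        b = min(int((k - lo) / width), n_bins - 1)  # clamp the maximum
        counts[b] += 1
        sums[b] += k
    total = len(k_values)
    xs = [math.log10(s / c) for c, s in zip(counts, sums) if c > 0]
    ys = [math.log10(c / total) for c in counts if c > 0]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy ** 2 / (sxx * syy)


# Synthetic connectivities whose frequencies follow p(k) ~ k^(-2).
demo = [k for k in (1, 2, 3, 4, 5, 6, 10) for _ in range(3600 // (k * k))]
slope, r2 = scale_free_fit(demo)
print(round(slope, 3), round(r2, 3))
```

A slope near -γ with R^2 close to 1 indicates an approximately scale-free connectivity distribution.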
36. lue of the correlation coefficient. To construct a weighted network, a power (soft threshold) larger than or equal to 1 should be specified. Raising the co-expression similarity to this power results in a weighted co-expression network. The default power for an unsigned and a signed network is 6 and 12, respectively. To choose a power, WGCNA also implements plots for the scale-free topology criterion (Zhang and Horvath 2005). This criterion is described in a separate section. The "power selection" button results in a graph of scale-free topology fit (R^2, y-axis) versus different powers (x-axis). Choose the smallest power for which R^2 > 0.8 or, if a saturation curve results, choose the power at the kink of the saturation curve. The button "ClusterSamples" can be used to assess whether arrays are outliers (average linkage hierarchical cluster tree, Euclidean distance). If the dendrogram has two or more very distinct branches or outlying arrays, the scale-free topology criterion may be meaningless (low R^2 values). In this case, simply choose a power (e.g. 6) or consider removing outlying arrays before proceeding. Once the user has chosen a power, the "next" button will automatically create a weighted co-expression network and a corresponding cluster tree, which will be used for module detection. The default network dissimilarity measure is based on the topological overlap matrix (TOM). A l
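The rule "choose the smallest power for which R^2 > 0.8" can be written down in a few lines. In this sketch the R^2 values per candidate power are assumed to be already available (for instance, read off the power selection plot); falling back to the default powers 6 and 12 when no candidate passes is an assumption of the example, in the spirit of "simply choose a power (e.g. 6)".

```python
# Default powers stated in the text for unsigned and signed networks.
DEFAULT_POWER = {"unsigned": 6, "signed": 12}


def pick_power(r2_by_power, network_type="unsigned", r2_min=0.8):
    """Smallest power whose scale-free fit R^2 exceeds r2_min.

    r2_by_power: dict power -> R^2 (hypothetical input format).
    Falls back to the default power if no candidate passes.
    """
    passing = [power for power, r2 in r2_by_power.items() if r2 > r2_min]
    if passing:
        return min(passing)
    return DEFAULT_POWER[network_type]


# Made-up fit values for five candidate powers.
fits = {1: 0.20, 2: 0.50, 4: 0.79, 6: 0.85, 8: 0.90}
print(pick_power(fits))
```

The kink-of-the-saturation-curve rule is a visual judgment and is not captured by this sketch.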
37. nectivity measures how connected (or co-expressed) a given gene is with respect to the genes of a particular module. The intramodular connectivity may be interpreted as a measure of module membership. The module eigengene corresponds to the first principal component of a given module. It can be considered the most representative gene expression profile in a module. Example: MEblue (also denoted as PCblue) denotes the module eigengene of the blue module. When a microarray sample trait y is available (e.g. case/control status or body weight), one can correlate the module eigengenes with this outcome. The correlation coefficient is referred to as eigengene significance. The WGCNA software outputs the eigengene significance of each module (eigengene) and the corresponding correlation test p-value. For each gene, we define a "fuzzy" measure of module membership by correlating its gene expression profile with the module eigengene of a given module. For example, MMblue(i) = cor(x(i), MEblue) measures how correlated gene i is to the blue module eigengene. MMblue(i) measures the membership of the i-th gene with respect to the blue module. If MMblue(i) is close to 0, then the i-th gene is not part of the blue module. But if MMblue(i) is close to 1 or -1, it is highly connected to the blue module genes. The sign of module membership encodes whether the gene has a positive or a negative relationship with the blue module eigengene. WGCNA also outputs the corresponding correl
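To make the two definitions concrete, the sketch below computes a module eigengene as the first principal component of the standardized module profiles (via plain power iteration) and then the module membership MM(i) = cor(x(i), ME). The power-iteration approach, the sign convention (aligning the eigengene with the average standardized profile) and all names are assumptions of this example, not a description of the software's internals.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def standardize(profile):
    """Center a profile and scale it to unit sample variance."""
    n = len(profile)
    mean = sum(profile) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in profile) / (n - 1))
    return [(v - mean) / sd for v in profile]


def module_eigengene(profiles, n_iter=100):
    """First principal component (over samples) of the module genes."""
    z = [standardize(p) for p in profiles]
    n_samples = len(z[0])
    v = list(z[0])  # start from the first gene's profile
    for _ in range(n_iter):
        # One power-iteration step: v <- Z^T (Z v), then normalize.
        loadings = [sum(a * b for a, b in zip(row, v)) for row in z]
        v = [sum(l * row[s] for l, row in zip(loadings, z)) for s in range(n_samples)]
        norm = math.sqrt(sum(a * a for a in v))
        v = [a / norm for a in v]
    # Fix the arbitrary sign: point along the average standardized profile.
    mean_profile = [sum(row[s] for row in z) / len(z) for s in range(n_samples)]
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-a for a in v]
    return v


def module_membership(expr, eigengene):
    """MM(i) = cor(x(i), ME) for every gene in expr."""
    return {g: pearson(p, eigengene) for g, p in expr.items()}


# Hypothetical blue module: two co-expressed genes and one anti-correlated gene.
blue = {"g1": [1, 2, 3, 4, 5], "g2": [2, 4, 6, 8, 11], "g3": [5, 4, 3, 2, 0.5]}
me_blue = module_eigengene(list(blue.values()))
print(module_membership(blue, me_blue))
```

In this toy module, g1 and g2 obtain a membership near 1 and g3 a membership near -1, matching the sign interpretation given above.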
38. ng on their shape (Langfelder, Zhang, and Horvath 2007). Compared to the constant height cutoff method, dynamic branch cutting offers the following advantages: (1) it is capable of identifying nested clusters; (2) it is flexible: branch shape parameters can be tuned to suit the application at hand; (3) it is suitable for automation. WGCNA implements two types of dynamic branch cutting methods. The first only considers the shape parameters. The second method is a hybrid method that combines the advantages of hierarchical clustering and partitioning around medoids. Research aims that can be addressed with WGCNA. Identification of co-expression modules with high module significance: Based on the gene significance measure, we define two types of module significance measures. The first type is simply the average gene significance of the module genes. The second type is referred to as eigengene significance, which is only defined for a microarray sample trait y. When a microarray sample trait y is available (e.g. case/control status or body weight), one can correlate the module eigengenes with this outcome. The correlation coefficient is referred to as eigengene significance. WGCNA also outputs the eigengene significance of each module (eigengene) and the corresponding correlation test p-value. Identification of intramodular hub genes: Intramodular connectivity can be interpreted as a fuzzy measure of module members
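Both module significance measures can be sketched briefly. The first function averages the absolute gene significance within each module (using absolute values is an assumption here, following the glossary wording "average absolute gene significance"); the second correlates a module eigengene with the trait. Data layout, names and numbers are invented for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def module_significance(gs, color_of_gene):
    """Average absolute gene significance per module color."""
    totals, counts = {}, {}
    for gene, color in color_of_gene.items():
        totals[color] = totals.get(color, 0.0) + abs(gs[gene])
        counts[color] = counts.get(color, 0) + 1
    return {color: totals[color] / counts[color] for color in totals}


def eigengene_significance(eigengene, trait):
    """Correlation between a module eigengene and the sample trait y."""
    return pearson(eigengene, trait)


# Hypothetical gene significances and module assignments.
gs = {"a": 0.8, "b": -0.6, "c": 0.1, "d": 0.3}
colors = {"a": "blue", "b": "blue", "c": "brown", "d": "brown"}
print(module_significance(gs, colors))
print(eigengene_significance([0.1, 0.2, 0.3, 0.4], [1, 2, 3, 4]))
```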
39. Glossary of network concepts
Construction of weighted gene co-expression networks and modules
Module detection
Research aims that can be addressed with WGCNA
Installation requirements
Detailed description of the analysis steps
  1. Load data
  2. Data preprocessing and I'm Feeling Lazy analysis
  3. Network construction
  4. Module detection
  5. Gene selection
Gene selection output files
Network function library update
Recovery from unexpected errors
40. odule is "up" in the corresponding sample. If the correlation structure of the module is due to a single array, the array may be an outlier. Gene selection: [Screenshot: Gene Selection panel showing a gene significance versus intramodular connectivity plot for the blue module, with manual gene selection (module choice, GS and K/Kmax thresholds, "Select Genes", "Plot Module Heatmap", "Output Gene List") and automatic gene selection.] It can be useful to study the relationship between gene significance and intramodular connectivity K using the button "Plot Gene significance versus intramodular connectivity". The scaled version of the intramodular connectivity, K/max(K), turns out to be highly related to the corresponding module membership measure. K measures how centrally located the gene is inside the module.
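The selection logic suggested by the GS and K/Kmax thresholds can be sketched as follows. Here the intramodular connectivity K of a gene is taken to be the sum of its adjacencies to the other genes of the same module; the threshold defaults, the dict-of-dicts adjacency layout and the function names are assumptions for this example.

```python
def intramodular_connectivity(adj, module_genes):
    """K(i): sum of adjacencies between gene i and the other module genes."""
    members = set(module_genes)
    return {g: sum(adj[g][h] for h in members if h != g) for g in members}


def select_hub_candidates(adj, module_genes, gs, gs_min=0.5, k_min=0.5):
    """Genes with |GS| >= gs_min and scaled connectivity K/max(K) >= k_min."""
    k = intramodular_connectivity(adj, module_genes)
    k_max = max(k.values())
    if k_max == 0:
        return []  # no connections inside the module
    return sorted(
        g for g in k if abs(gs[g]) >= gs_min and k[g] / k_max >= k_min
    )


# Hypothetical symmetric adjacencies; gene "x" lies outside the module.
adj = {
    "a": {"b": 0.8, "c": 0.6, "x": 0.9},
    "b": {"a": 0.8, "c": 0.2, "x": 0.1},
    "c": {"a": 0.6, "b": 0.2, "x": 0.3},
    "x": {"a": 0.9, "b": 0.1, "c": 0.3},
}
gs = {"a": 0.7, "b": -0.1, "c": -0.6, "x": 0.9}
print(select_hub_candidates(adj, ["a", "b", "c"], gs))
```

Adjacencies to genes outside the module (here "x") do not contribute to K, which is what distinguishes intramodular from whole-network connectivity.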
41. son correlation coefficient is calculated for all pairwise comparisons of gene expression values across all microarray samples. The Pearson correlation matrix is then transformed into an adjacency matrix A, i.e., a matrix of connection strengths, by using a power function. Thus, the connection strength (adjacency) a(i,j) between gene expressions x(i) and x(j) is defined as a(i,j) = |cor(x(i), x(j))|^β. Optionally, WGCNA can also be used to construct a signed network, which keeps track of the sign of the correlation coefficient: a(i,j) = |0.5 + 0.5 cor(x(i), x(j))|^β. Because microarray data can be noisy and the number of samples is often small, we weight the Pearson correlations by taking their absolute value and raising them to the power β. The resulting weighted network represents an improvement over unweighted networks based on dichotomizing the correlation matrix, because (i) the continuous nature of the gene co-expression information is preserved and (ii) the results of weighted network analyses are highly robust with respect to the choice of the parameter β, whereas unweighted networks display sensitivity to the choice of the cutoff. The network connectivity k(i) of the i-th gene expression profile x(i) is the sum of the connection strengths with all other genes in the network, i.e., it represents a measure of how correlated the i-th gene is with all the other genes in the network. To determine the power β used in the def
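The adjacency and connectivity definitions above translate almost directly into code. The following minimal sketch computes the unsigned adjacency |cor(x(i), x(j))|^β, the signed variant |0.5 + 0.5 cor(x(i), x(j))|^β, and the connectivity k(i) as the row sum. The dict-based layout, the function names and the toy profiles are assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, beta=6, signed=False):
    """Weighted adjacency a(i,j) for every pair of distinct genes.

    expr: dict gene -> expression profile across samples.
    unsigned: |cor|^beta; signed: |0.5 + 0.5 * cor|^beta.
    """
    adj = {g: {} for g in expr}
    for gi, xi in expr.items():
        for gj, xj in expr.items():
            if gi == gj:
                continue  # self-connections are left out of this sketch
            r = pearson(xi, xj)
            base = 0.5 + 0.5 * r if signed else r
            adj[gi][gj] = abs(base) ** beta
    return adj


def connectivity(adj):
    """k(i): sum of connection strengths with all other genes."""
    return {g: sum(row.values()) for g, row in adj.items()}


# Toy profiles: g2 tracks g1 perfectly, g3 is perfectly anti-correlated.
expr = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
print(connectivity(adjacency(expr, beta=6)))
print(connectivity(adjacency(expr, beta=12, signed=True)))
```

In the unsigned network the anti-correlated gene g3 is as strongly connected as the others, whereas in the signed network its connection strengths drop to zero.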
42. y how the gene significance measure should be defined. For example, a microarray sample trait (e.g. a numeric outcome) can be chosen from the sample trait file. If there is more than one trait in the trait data, the user needs to choose the trait to be used for the gene significance calculation. As an alternative, the user can specify a pre-defined gene significance column in the gene info data. WGCNA automatically detects whether a column is numeric (denoted by "N") or character (denoted by "C"). Please make sure to choose a numeric column as trait or gene significance values. The pre-specified gene significance measure from the gene information file may, for example, indicate pathway membership, knock-out essentiality, or the t-test statistic from a prior study. After specifying the gene significance measure, the user can either choose the manual WGCNA analysis, which allows the user to identify modules and select intramodular hub genes, or choose an automatic WGCNA analysis by pushing the "I'm feeling lazy" button. The automatic analysis will automatically choose modules and rank all genes according to network screening results. The automatic analysis has one major advantage: it can deal with tens of thousands of genes. However, the user does not get to see a cluster tree, module heatmaps, etc. Although we find
    