Decision Trees User Manual

Author: Jiří Vitinger
Library: DecisionTree.dll
Runnable class: DecTreeMain

Contents

1 Introduction to Decision Trees
2 Decision Trees User Interface
  2.1 Main Window
    2.1.1 Menu
    2.1.2 Tree view
  2.2 General Properties dialog
  2.3 CART Properties dialog
  2.4 ID3 Properties dialog
  2.5 Data for build dialog
  2.6 Data for classify dialog
  2.7 Decision tree information dialog
  2.8 Step info dialog
3 Decision Trees Tutorials
  3.1 Building the Decision Tree
  3.2 Choosing between ID3 and CART methods
  3.3 Classification of data
4 Requirements
5 Samples

1 Introduction to Decision Trees

- A decision tree is a graph of decisions (nodes) and their possible consequences (edges).
- Decision trees are constructed in order to help with making decisions.
- Decision making with a decision tree is a common method used in data mining.

[Figure: an example decision tree testing outlook, humidity and windy, classifying the weather into Yes/No.]

An example of a decision tree: according to the weather, we would like to know whether it is a good time to play some game.

A decision tree describes a tree structure in which leaves represent classifications and edges represent conjunctions of features that lead to those classifications. A decision tree can be learned (built) by splitting the source data set (the training set) into subsets based on an attribute value test. This process is repeated on each derived subset in a recursive manner. The recursion is completed when splitting is no longer feasible, or when a single classification can be applied to each element of the derived subset.

Two methods of learning are implemented in this software: CART and ID3. You can build a tree directly, or you can watch each step of the process. You can then browse the tree (zoom, shift, etc.). Finally, you can classify your data using the prepared tree.

Detailed information about decision trees can be found at the following links:

http://en.wikipedia.org/wiki/Decision_tree
http://www.cs.ubc.ca/labs/lci/CIspace/Version4/dTree
http://www.cise.ufl.edu/~ddd/cap6635/Fall-97/Short-papers/2.htm
http://www.doc.ic.ac.uk/~sec/teaching/v231/lecture11.html

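The recursive procedure above is easy to state in code. The following is a minimal illustrative sketch of the idea in Python, not the Knocker implementation; all names are ours, and for brevity it splits on attributes in a fixed order instead of scoring candidate tests:

```python
from collections import Counter

def majority_class(rows, class_col):
    # Fall-back label when a subset cannot be split any further.
    return Counter(r[class_col] for r in rows).most_common(1)[0][0]

def build_tree(rows, attrs, class_col):
    classes = {r[class_col] for r in rows}
    if len(classes) == 1:                  # a single classification applies
        return next(iter(classes))
    if not attrs:                          # splitting no longer feasible
        return majority_class(rows, class_col)
    attr, rest = attrs[0], attrs[1:]       # a real learner picks the best test
    children = {}
    for value in {r[attr] for r in rows}:  # one branch per attribute value
        subset = [r for r in rows if r[attr] == value]
        children[value] = build_tree(subset, rest, class_col)
    return (attr, children)

weather = [
    {"outlook": "Sunny", "windy": "False", "play": "No"},
    {"outlook": "Sunny", "windy": "True", "play": "No"},
    {"outlook": "Overcast", "windy": "False", "play": "Yes"},
    {"outlook": "Rainy", "windy": "False", "play": "Yes"},
    {"outlook": "Rainy", "windy": "True", "play": "No"},
]
print(build_tree(weather, ["outlook", "windy"], "play"))
# e.g. ('outlook', {'Sunny': 'No', 'Overcast': 'Yes', 'Rainy': ('windy', ...)})
```

A real learner scores every candidate splitter and picks the best one; that is exactly what the ID3 and CART properties described below control.
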
2 Decision Trees User Interface

2.1 Main Window

The main window contains a menu, a tool bar and the tree view. All tasks are managed from the menu (details can be found below). The buttons on the tool bar are connected to the most important menu commands; several buttons control the tree view (details can be found below).

[Screenshot: the Decision Trees main window with the menus Main, View, Settings and Command and a tree built on the weather data.]

2.1.1 Menu

2.1.1.1 Main

- Open tree: loads a previously saved tree from an XML file. The XML file has to be in a special format; use only files created by this software. It has an equivalent on the tool bar.
- Save tree: saves the current tree to an XML file in a special format. It has an equivalent on the tool bar.
- Close: closes the Main Window and returns the user to the main application Knocker.

2.1.1.2 View

All settings for the tree view can be found in this section. Each item in this submenu represents an attribute of a tree node (Split condition from parent, Split condition for children, Result class, Probability of class in node, Count of records). You can decide whether each item is displayed directly in the tree view: turn items on or off individually, or switch all of them at once with View all / View nothing.

2.1.1.3 Settings

Each item in this submenu (General properties, CART properties, ID3 properties, Data for build, Data for classify, Tree information) opens a special dialog for specifying some settings. These dialogs are described below in more detail. The item Tree information has an equivalent on the tool bar.

2.1.1.4 Command

- Set active method: choose one of the implemented tree-building methods in this submenu. Currently the CART and ID3 methods are available.
- Build tree: builds a decision tree with respect to the actual settings, using the selected method. This command has an equivalent on the tool bar.
- Step building tree: makes the first step of the tree-building process. You may then explore the details of all the possibilities the algorithm considers, and watch how the best one is chosen. See the section on the Step info dialog for details. It has an equivalent on the tool bar.
- Classify: classifies your data using the current decision tree. The result is stored in a new table.

2.1.2 Tree view

The tree view shows nodes, edges and some important attributes associated with the nodes. Which attributes are visible can be changed in the View submenu.

[Screenshot: the tree view on the zoo data, with nodes such as feathers, milk, fins, backbone, airborne, predator, legs and aquatic.]

A window containing all attributes associated with a node appears after clicking on that node:

[Screenshot: InfoWindow listing Name, Condition from parent, Split, Result class, Probability and Number of records.]

Buttons in the tree view tool bar:

- Zoom in and zoom out.
- Zoom in and zoom out each axis independently.
- Set the exact spacing of the nodes in pixels.
- Fit the tree to the window.
- Fit the tree to the window whenever the window is resized (a toggle button; pushed by default).

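The attributes listed in the info window suggest what each tree node carries. As a rough illustration only, with field names taken from the window above and types and structure being our assumptions rather than Knocker's internals:

```python
from dataclasses import dataclass, field

@dataclass
class TreeNode:
    name: str                   # e.g. the attribute tested in this node
    condition_from_parent: str  # label of the incoming edge, e.g. "milk = 1"
    split: str                  # split condition for the children, if any
    result_class: str           # majority class of the records in this node
    probability: float          # probability of the result class in this node
    number_of_records: int      # how many records reached this node
    children: list["TreeNode"] = field(default_factory=list)
```
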
2.2 General Properties dialog

Parameters that are important for all methods are set in this window.

[Screenshot: General Properties dialog with Maximal depth of the tree and Zero tolerance for real numbers = 0.00001.]

- Maximal depth of the tree: restricts the maximal tree depth, which is useful especially for large trees. The default value is 20 and usually you don't need to change it.
- Zero tolerance for real numbers: this value is used when comparing two real numbers (doubles). If abs(number1 - number2) < Zero, then we say that number1 = number2.

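In code, the zero-tolerance comparison amounts to the following sketch (the function name is ours, not part of the module):

```python
ZERO_TOLERANCE = 0.00001  # the dialog's default value

def doubles_equal(a: float, b: float, zero: float = ZERO_TOLERANCE) -> bool:
    # Two doubles are considered equal when |a - b| < zero.
    return abs(a - b) < zero

print(doubles_equal(0.1 + 0.2, 0.3))  # True, despite floating-point error
print(doubles_equal(1.0, 1.001))      # False at the default tolerance
```
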
2.3 CART Properties dialog

[Screenshot: CART Properties dialog with "Method to compute Diversity index" set to -p1*log(p1) - p2*log(p2).]

A value called the Diversity index is computed for each possible branch in the CART algorithm. You can choose one of three ways to compute it; the value p1 is the probability of the first result class, and so on. The CART method works only with two result classes, and its tree is always a binary tree.

2.4 ID3 Properties dialog

[Screenshot: ID3 Properties dialog with the check boxes "Gain ratio (protection against multivalue attributes)" and "Gain ratio with average gain control".]

In the ID3 algorithm, an Information Gain is computed for each splitter (each possibility of splitting the data in a node). Usually, the more values an attribute has, the higher its Information Gain. The first check box therefore enables a defense against multi-valued attributes (such as a unique id of each record). If the second check box is checked, each chosen splitter must have an Information Gain higher than or equal to the average Information Gain of all splitters in the node. In most cases the best choice is to have both check boxes checked.

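To make the two dialogs concrete, here is a sketch of the quantities involved, using the entropy-style diversity index shown in the CART dialog and the conventional definitions of information gain and gain ratio (gain divided by split information). The function names are ours, and the module's exact formulas may differ in detail:

```python
import math

def diversity_entropy(p1):
    # The entropy-style diversity index from the CART dialog:
    # -p1*log2(p1) - p2*log2(p2), where p2 = 1 - p1 (two result classes).
    return sum(-p * math.log2(p) for p in (p1, 1.0 - p1) if p > 0)

def entropy(counts):
    total = sum(counts)
    return sum(-c / total * math.log2(c / total) for c in counts if c)

def information_gain(parent_counts, children_counts):
    # Parent entropy minus the size-weighted entropy of the sub-nodes.
    total = sum(parent_counts)
    rest = sum(sum(c) / total * entropy(c) for c in children_counts)
    return entropy(parent_counts) - rest

def gain_ratio(parent_counts, children_counts):
    # Normalises the gain by the split information (the entropy of the
    # sub-node sizes), which penalises many-valued attributes.
    split_info = entropy([sum(c) for c in children_counts])
    return information_gain(parent_counts, children_counts) / split_info

# The weather table used in the tutorial below: 9 Yes / 5 No, split by
# outlook into Sunny (2 Yes, 3 No), Overcast (4, 0) and Rainy (3, 2).
print(diversity_entropy(0.5))                              # 1.0 (most diverse)
print(information_gain([9, 5], [[2, 3], [4, 0], [3, 2]]))  # ~0.247
print(gain_ratio([9, 5], [[2, 3], [4, 0], [3, 2]]))        # ~0.156
```

With both check boxes on, a splitter would be ranked by this ratio while splitters whose gain falls below the node's average gain are discarded, as described above.
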
2.5 Data for build dialog

Data should be set up in this dialog before the tree is built.

[Screenshot: Data for build dialog on the Iris data. The columns sepal_lenght (double), sepal_width (double), petal_length (double), petal_width (double) and class (varchar) are selected for building; the four numerical columns are marked as numerical; the class column is class; the working column is treenode.]

- Get data: click this button to choose the data source, called a version, as defined in the main application Knocker.
- Select columns for building the tree: select the columns (attributes) used to build the tree; the algorithm will ignore the others.
- Select numerical attributes (the others are categorical): numerical attributes are handled differently by the ID3 algorithm. The splitter does not split the data into as many branches as the attribute has values, but into two intervals. For numerical attributes it is usually much better to mark them as numerical in this dialog; the resulting decision tree will be simpler and smaller.
- Column with the class information: the name of the column that contains the goal class of each record; it will not be used to split the data.
- Working column (will be created): a column for temporary information. It should have a name different from all other attribute names in the table. It is created automatically.

2.6 Data for classify dialog

Data have to be set up in this dialog before every classification.

[Screenshot: Data for classify dialog on the zoo data. The columns used in the tree (animal_id, hair, feathers, egg, milk, airborne, aquatic, predator, toothed, backbone, ...) are matched against the columns in the table; the label reads "The columns in the tree has to be a subset of the columns in the table. This condition is now OK." The working column is treenode and the result class column is result_class.]

- Get data: click this button to choose the data source (version), as defined in the main application.
- The columns used in the decision tree have to be a subset of the columns in the data source table. The big colored label in the middle of the dialog tells you whether this condition is met (OK) or not.
- Working column (will be created): a column for temporary information. It should have a name different from all other attribute names in the table. It will be created automatically.
- Column for the result class (will be created): the result class will be stored in this column. It should have a name different from all other attribute names in the table. It will be created automatically.

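The subset condition itself is simple. In Python terms, with illustrative values loosely based on the zoo data:

```python
tree_columns = {"hair", "feathers", "egg", "milk", "airborne",
                "aquatic", "predator", "toothed", "backbone"}
# Columns of the data-source table chosen via "Get data" (example values):
table_columns = tree_columns | {"animal_id", "fins", "legs", "type"}

# The condition the big colored label reports:
if tree_columns <= table_columns:
    print("This condition is now OK")
else:
    print("Missing in the table:", sorted(tree_columns - table_columns))
```
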
2.7 Decision tree information dialog

This window shows some properties of the currently displayed decision tree.

[Screenshot: Decision Tree Information dialog listing the used columns (predator, ...) and "Type of a result class: STRING".]

2.8 Step info dialog

This window shows the details of the tree-building process. Everything is read-only.

[Screenshot: Step Info window on the Iris data. Current method: ID3; according to the current settings, the best splitter is the splitter with the biggest Gain Ratio, but the Information Gain cannot be smaller than the Average Gain. Current node: depth 1, no condition from parent node. Best splitter: petal_length < 3. The splitter list offers candidates such as sepal_width < 3.6, petal_length < 3, sepal_width < 3.0, petal_length < 4.5, petal_length < 4.9, petal_length < 5.1. Description of the selected splitter petal_length < 3: Count of subnodes: 2; Type: NUMERICAL; Information: 1.58496250072116; Subnodes Information: 0.666666666666667; Information Gain: 0.918295834054489; Average Gain: 0.358530806964563; Gain Ratio: 1; Split Information: 0.918295834054489. Buttons: Find the best splitter, Next step, Run, Cancel.]

- Current method: describes the current method and its settings.
- Current node: a short description of the current node for which the splitters are defined. This node is colored red in the tree view for better orientation.
- Best splitter: the name of the splitter that was chosen by the current method.
- Splitters: here you can browse all the splitters (all possibilities of splitting the data in the current node). Detailed information about the selected splitter is shown in the right panel.
- Find the best splitter: click this button if you have lost the best splitter in the list and would like to select it to display its details. Clicking on the blue name of the best splitter causes the same action.

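The numbers in the screenshot are mutually consistent and can be reproduced by hand. Assuming the current node holds all 150 iris records (three classes of 50 each) and that petal_length < 3 sends 50 records into one sub-node and the remaining 100 (50 + 50 of two classes) into the other, a quick check:

```python
import math

def entropy(counts):
    total = sum(counts)
    return sum(-c / total * math.log2(c / total) for c in counts if c)

information          = entropy([50, 50, 50])            # 1.58496250072116
subnodes_information = (50 / 150) * entropy([50]) + \
                       (100 / 150) * entropy([50, 50])  # 0.666666666666667
information_gain     = information - subnodes_information
split_information    = entropy([50, 100])               # entropy of sub-node sizes
gain_ratio           = information_gain / split_information

print(information_gain)   # 0.918295834054489
print(split_information)  # 0.918295834054489 (equal, hence the Gain Ratio of 1)
print(gain_ratio)         # 1.0, up to floating-point rounding
```
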
3 Decision Trees Tutorials

3.1 Building the Decision Tree

This tutorial shows you, step by step, how to build your own decision tree from a training set.

1. Prepare your data. You need a table of classified records, such as the weather table below:

   outlook   temperature  humidity  windy  play
   Overcast  Hot          High      False  Yes
   Overcast  Cool         Normal    True   Yes
   Overcast  Mild         High      True   Yes
   Overcast  Hot          Normal    False  Yes
   Rainy     Mild         High      False  Yes
   Rainy     Cool         Normal    False  Yes
   Rainy     Cool         Normal    True   No
   Rainy     Mild         Normal    False  Yes
   Rainy     Mild         High      True   No
   Sunny     Hot          High      False  No
   Sunny     Hot          High      True   No
   Sunny     Mild         High      False  No
   Sunny     Cool         Normal    False  Yes
   Sunny     Mild         Normal    True   Yes

   The table contains several categorical attributes (columns); the last one is the goal class of each record.

2. Add your table as a version into the main application. This process is very simple and is described in another part of the documentation.

3. Run the Decision Trees module.

4. Set the data for building. Click Settings -> Data for build; you will see the dialog described above. For example, set it to the state shown below. Notice that there is no numerical attribute, the last column play is chosen as the goal class column, and the working column is still the default.

   [Screenshot: Data for build dialog with all five columns selected for building, no numerical attributes, class column play and working column treenode.]

5. Click Build tree or Step building tree. The first button runs the whole building process at once; the second one steps through it. Stepping is described in more detail in the section about the Step info dialog above.

When you are building a tree from a big training set (more than 1000 records), be very patient: it may take several minutes. For this reason you can set a smaller tree depth limit in the General properties. The progress bar increases its value only when the algorithm reaches some node and places records there, and the size of the increase depends on the number of placed records. Because of this way of updating, the value may rise very slowly, or stop rising for a while.

3.2 Choosing between ID3 and CART methods

You can select one of the two implemented algorithms before you start the building process. The methods differ as follows:

CART:
- only 2 classes;
- only binary trees are possible: each splitter creates two sub-nodes (attr = val, attr <> val);
- cannot work with numerical attributes; even if you mark attributes as numerical in the Data for build dialog, they will be treated as categorical.

ID3:
- more classes;
- works with both categorical and numerical attributes;
- a categorical splitter selects one categorical attribute and makes one sub-node for each of its present values;
- a numerical splitter makes two sub-nodes (attr < val, attr >= val);
- this is the default method, and it usually gives better results than CART.

[Screenshot: the Command -> Set active method submenu, showing how to select the method for building the tree.]

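A sketch of how the two ID3 splitter kinds generate candidates. This is our code; it uses midpoints between adjacent observed values as one common choice of thresholds, whereas the candidates Knocker actually enumerates (as seen in the Step info dialog) may be picked differently:

```python
def categorical_splitter(rows, attr):
    # ID3's categorical splitter: one sub-node per value present in the node.
    return [f"{attr} = {v}" for v in sorted({r[attr] for r in rows})]

def numerical_splitters(rows, attr):
    # ID3's numerical splitters are binary (attr < t vs. attr >= t); here,
    # one candidate threshold between each pair of adjacent observed values.
    values = sorted({r[attr] for r in rows})
    return [f"{attr} < {(a + b) / 2}" for a, b in zip(values, values[1:])]

rows = [{"petal_length": 1.0}, {"petal_length": 4.0}, {"petal_length": 5.0}]
print(numerical_splitters(rows, "petal_length"))
# ['petal_length < 2.5', 'petal_length < 4.5']
```
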
3.3 Classification of data

Classifying your data takes a few simple steps:

1. Prepare the decision tree, either by building it from a training set (see above) or by loading an existing tree from an XML file.

2. Set the data for classification. Click Settings -> Data for classify; the dialog described above is shown.

3. Click Command -> Classify. You have to choose a name for the new table (new version) which will contain your classified data. Classification starts after the name is confirmed; it is usually a very quick process.

4. Enjoy your classified data. You can see it in the main application under the name you chose; the new table (version) is a descendant of the starting data table.

4 Requirements

Components necessary for the correct running of this module:

- all common components of the main application Knocker
- DecisionTree.dll
- DMTransformStruct.dll
- PtreeViewer.dll
- GuiExt.dll
- Gui.dll

The main runnable class is DecTreeMain in DecisionTree.dll.

5 Samples

Some interesting sample data in CSV (Knocker-friendly) format are included in the distribution:

- iris: 4 numerical attributes, 3 classes, 150 records; suitable for ID3.
- mushroom: 22 categorical attributes, 2 classes, 8124 records; suitable for ID3 and CART (it can take a lot of time).
- weather: 4 categorical attributes, 2 classes, 14 records; suitable for ID3 and CART.
- zoo: 16 categorical attributes, 1 numerical attribute, 7 classes, 101 records; suitable for ID3. The first column is the unique name of the animal, so it should not be used in the building process, but you can try it and you will see the problem.
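
Returning to step 3 of the classification tutorial: conceptually, the Classify command walks every record from the root down the edges whose conditions the record satisfies, and writes the class of the reached leaf into the result column. A sketch (ours, not Knocker's code) using the tree representation from the introduction's example:

```python
def classify(node, record):
    # Descend until a leaf is reached. Leaves are plain class labels here;
    # inner nodes are (attribute, {value: child}) pairs, as in the earlier sketch.
    while not isinstance(node, str):
        attr, children = node
        node = children[record[attr]]
    return node

tree = ("outlook", {"Sunny": "No", "Overcast": "Yes",
                    "Rainy": ("windy", {"False": "Yes", "True": "No"})})
record = {"outlook": "Rainy", "windy": "False"}
print(classify(tree, record))  # Yes; this would go into the result_class column
```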
