WEKA User Manual

1. [Figure 7: attribute evaluator options (screenshot). Evaluator: weka.attributeSelection.CfsSubsetEval; search method: weka.attributeSelection.BestFirst -D 1 -N 5; relation: tblSubwayData2007.]

   [Screenshot: attribute selection output for relation tblSubwayData2007 (266 instances; 10 attributes: AssaultID, IncidentNo, IncidentDate, Duration, LateTrains, TerminalCancel, EnrouteCancel, StationID, TroubleCode, TrainLine). Evaluation mode: evaluate on all training data.]

   Visualization

   The last tab in the window is the Visualize tab. By this point the program has run calculations and comparisons on the data set, and the attributes and methods of manipulation have been chosen. The final piece of the puzzle is looking at the information that has been derived throughout the process. The user can now see the fruit of their efforts in a two-dimensional representation of the information. The first screen the user sees after selecting the Visualize tab is a matrix of plots in which each attribute in the data set is plotted against the other attributes. If necessary, a scroll bar allows all of the plots to be viewed. The user can select a specific plot from the matrix to analyze its contents. The grid layout lets the user arrange the attribute positions to their liking and for better understanding. Once a specific plot
2. from vwSubwayData2007Volume where TroubleCode <> '4011' group by datepart(month, IncidentDate) order by count(IncidentNo) desc
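The monthly count above can be tried without a SQL Server instance. The following is a minimal sketch using Python's built-in sqlite3 module; the table name `incidents` and its six rows are invented for illustration, and SQLite's `strftime('%m', ...)` stands in for T-SQL's `datepart(month, ...)`.

```python
import sqlite3

# Hypothetical miniature of the 2007 incident data; every row is invented.
rows = [
    ("1001", "2007-01-05", "4011"),
    ("1002", "2007-01-17", "0741"),
    ("1003", "2007-02-02", "0741"),
    ("1004", "2007-02-09", "8204"),
    ("1005", "2007-02-20", "0741"),
    ("1006", "2007-03-03", "4011"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (IncidentNo TEXT, IncidentDate TEXT, TroubleCode TEXT)"
)
conn.executemany("INSERT INTO incidents VALUES (?, ?, ?)", rows)

# Count non-assault incidents (trouble code other than 4011) per month.
non_assault_by_month = conn.execute(
    """
    SELECT CAST(strftime('%m', IncidentDate) AS INTEGER) AS month,
           COUNT(IncidentNo) AS Incidents
    FROM incidents
    WHERE TroubleCode <> '4011'
    GROUP BY strftime('%m', IncidentDate)
    ORDER BY Incidents DESC
    """
).fetchall()
print(non_assault_by_month)
```

A month whose only incidents are assaults (March in this sample) does not appear in the result at all, because the WHERE clause removes those rows before grouping.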
3. cast(datepart(month, Date) as varchar(2)) + cast(datepart(day, Date) as varchar(2)) + cast(datepart(year, Date) as varchar(4))

   Queries used to produce results

   Using one of the two software tools mentioned above, TOAD or MS SQL Server Express, a user can execute these queries to produce results from the tables and views created above.

   Trouble Code: lists all trouble codes with occurrence count and average duration of train delays for year 2007, having more than one occurrence.

      Select Year, sd.TroubleCode, TroubleDesc, Count(sd.TroubleCode) as Incidents, Avg(CAST(Duration AS int)) as AvgDuration
      From vwSubwayData sd inner join tblTroubleList tl on sd.TroubleCode = tl.TroubleID
      Where Year = 2007
      Group By Year, sd.TroubleCode, TroubleDesc
      Having Count(sd.TroubleCode) >= 2
      Order By Incidents desc, AvgDuration desc

   Train Line: lists all train lines with occurrence count and average duration of train delays for the trouble code "Person Holding Doors" and year 2007.

      Select TrainLine, Count(TrainLine) as Incidents, Avg(CAST(Duration AS int)) as AvgDuration
      From vwSubwayData
      Where Year = 2007 and TroubleCode = '0741'
      Group by TrainLine
      Order by Incidents desc

   Stations: lists all stations with incident occurrence count for the trouble code "Employee Assaulted By Cust" and year 2007, having more than one occurrence.

      Select Station, Count(sd.StationID) as Incidents From vwSubwayD
4. files into our database tables. Six packages were created, one for each file-to-table transfer.

   [Screenshot: Local Packages list (7 items) in the SubwayIncidents database, including ImportSubwayIncidentsData2006, ImportSubwayIncidentsStationlist, ImportSubwayIncidentsTroubleCodes, SubwayIncidents_PredictTrainLines, SubwayIncidents_PredictLateTrains and SubwayIncidents_PredictStationID.]

   These packages read the input file data and copy the data into the appropriate table columns. Here is a graphical design of one of the DTS packages.

   [Diagram: Create Table task, Connection 1 and Connection 2.]

   This package starts with the Create Table icon, which runs the code shown above to create the table. Connection 1 is the text file with tab-delimited data in it. Connection 2 is the destination table in our relational database. The black line between the two represents the SQL code that copies the data. The data copy is graphically represented here.

   [Screenshot: Transform Data Task Properties dialog (tabs: Source, Destination, Transformations, Lookups, Options), which defines the transformations between the source and destination columns AssaultID, IncidentNo, IncidentDate, Duration, LateTrains, TerminalCancel, ...]

   Table Relationships

   Now we have our six tables that are populated with the data tha
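What a DTS package does for one file (read tab-delimited rows, copy them into matching table columns) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's DTS package: it uses the standard csv and sqlite3 modules instead of SQL Server, and the two station rows are invented.

```python
import csv
import io
import sqlite3

# Invented sample standing in for the tab-delimited station list file.
station_file = io.StringIO("101\tTimes Sq\n102\tGrand Central\n")

conn = sqlite3.connect(":memory:")
# "Create Table" step: same two columns as tblStationList.
conn.execute("CREATE TABLE tblStationList (StationID INTEGER, Station TEXT)")

# "Transform Data" step: map source field 1 -> StationID, field 2 -> Station.
reader = csv.reader(station_file, delimiter="\t")
conn.executemany(
    "INSERT INTO tblStationList (StationID, Station) VALUES (?, ?)",
    ((int(station_id), name) for station_id, name in reader),
)

loaded = conn.execute(
    "SELECT StationID, Station FROM tblStationList ORDER BY StationID"
).fetchall()
print(loaded)
```

Converting `StationID` with `int()` during the copy plays the role of the column transformation: a row with a non-numeric ID fails at load time instead of landing in the table.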
5. the application is to utilize a computer application that can be trained to perform machine learning and derive useful information in the form of trends and patterns. WEKA is an open source application that is freely available under the GNU General Public License. Originally written in C, the WEKA application has been completely rewritten in Java and is compatible with almost every computing platform. It is user friendly, with a graphical interface that allows for quick setup and operation. WEKA operates on the premise that the user data is available as a flat file or relation; this means that each data object is described by a fixed number of attributes that usually are of a specific type, normally alphanumeric or numeric values. The WEKA application gives novice users a tool to identify hidden information in databases and file systems, with simple-to-use options and visual interfaces.

   Installation

   The program information can be found by searching the Web for "WEKA Data Mining" or by going directly to the site at www.cs.waikato.ac.nz/ml/WEKA. The site has a large amount of useful information on the program's benefits and background. New users might benefit from reading the user manual for the program. The main WEKA site has links to this information, as well as past experiments that can help new users identify the potential uses of particular interest to them. When prepared to download the soft
6. WEKA User Manual

   Contents

   WEKA
      Introduction
         • Background information
      Installation
         • Where to get WEKA
         • Downloading information
      Opening the program
         • Chooser menu
      Preprocessing
      Classify
      Cluster
      Associate
      Select Attributes
      Visualization

   Microsoft SQL Database
      Relational Database
      Procedures to access and use Database
      Loading flat files into relational database
      DTS
         • Table Relationships
      Creating Views
      Queries to produce results
         • Temp Table example
         • Volume Data

   Introduction

   WEKA, formally the Waikato Environment for Knowledge Analysis, is a computer program that was developed at the University of Waikato in New Zealand for the purpose of identifying information from raw data gathered from agricultural domains. WEKA supports many different standard data mining tasks, such as data preprocessing, classification, clustering, regression, visualization and feature selection. The basic premise of
7. ab opens a window to select the options for associations within the data set. The user selects one of the choices and presses Start to yield the results. There are few options for this window; they are shown in Figure 6 below.

   [Figure 6: Associate tab (screenshot). Associators listed: Apriori, PredictiveApriori, Tertius. Selected scheme: Apriori -N 10 -T 0 -C 0.9 -D 0.05 -U 1.0 -M 0.1 -S -1.0.]

   Select Attributes

   The next tab is used to select the specific attributes used for the calculation process. By default, all of the available attributes are used in the evaluation of the data set. If the user wants to exclude certain categories of the data, they deselect those choices from the attribute list. This is useful if some of the attributes are of a different form, such as alphanumeric data, that could alter the results. The software searches through the selected attributes to decide which of them best fit the desired calculation. To do this, the user has to select two options: an attribute evaluator and a search method. Once this is done, the program evaluates the data based on the subset of the attributes, then performs the necessary search for commonality within the data. Figure 7 shows the options for attribute evaluation. Figure 8 shows the options for the search method.
8. ata sd inner join tblStationList sl on sd.StationID = sl.StationID
      Where Year = 2007 and TroubleCode = '4011'
      Group by Station
      Having Count(sd.StationID) >= 2
      Order by Incidents desc

   Person Holding Doors incidents that lead to an assault on an employee

   This is a cursor. It queries all assaults and loops over the record set one row at a time to find associated Person Holding Doors incidents. The cursor first selects all assaults (trouble code 4011) for year 2007. It then loops over this record set to find the associated incidents with trouble code 0741 (Persons Hold Doors).

      Drop table TempIncidents
      Create Table TempIncidents (AssaultID Varchar(6), Duration Varchar(3), StationID int, TrainLine Varchar(2))

      DECLARE @assault varchar(6)
      DECLARE Assault_cursor CURSOR FOR
         Select AssaultID From vwSubwayData sd Where sd.Year = 2007 and TroubleCode = '4011'
      OPEN Assault_cursor
      FETCH NEXT FROM Assault_cursor INTO @assault
      WHILE @@FETCH_STATUS = 0
      BEGIN
         Insert Into TempIncidents
            Select AssaultID, Duration, StationID, TrainLine From vwSubwayData sd
            Where TroubleCode = '0741' and AssaultID = @assault
         FETCH NEXT FROM Assault_cursor INTO @assault
      END
      CLOSE Assault_cursor
      DEALLOCATE Assault_cursor

      Select * From TempIncidents

   Queries based on the temporary table TempIncidents

   Station: lists the station and the train line with the number of occurrences and the average duration of the delay during a door holding incident.

      Select Station, Tra
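The cursor logic above (select the assaults, then look up door-holding incidents for each one) can also be written as a single set-based query. The sketch below compares the two approaches on invented rows using Python's sqlite3 module; the table name `incidents` is hypothetical, and the year filter is left out to keep the sample small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (AssaultID TEXT, Duration TEXT, StationID INTEGER,"
    " TrainLine TEXT, TroubleCode TEXT)"
)
conn.executemany(
    "INSERT INTO incidents VALUES (?, ?, ?, ?, ?)",
    [
        ("A1", "0", 101, "A", "4011"),   # assault
        ("A1", "12", 101, "A", "0741"),  # door-holding incident tied to A1
        ("A2", "0", 102, "F", "4011"),   # assault with no door-holding incident
        ("A3", "7", 103, "N", "0741"),   # door-holding incident with no assault
    ],
)

# Row-by-row version, mirroring the cursor loop.
looped = []
assaults = conn.execute(
    "SELECT AssaultID FROM incidents WHERE TroubleCode = '4011'"
).fetchall()
for (assault_id,) in assaults:
    looped += conn.execute(
        "SELECT AssaultID, Duration, StationID, TrainLine FROM incidents"
        " WHERE TroubleCode = '0741' AND AssaultID = ?",
        (assault_id,),
    ).fetchall()

# Set-based version: one statement, no loop.
set_based = conn.execute(
    """
    SELECT d.AssaultID, d.Duration, d.StationID, d.TrainLine
    FROM incidents d
    WHERE d.TroubleCode = '0741'
      AND d.AssaultID IN (SELECT AssaultID FROM incidents WHERE TroubleCode = '4011')
    """
).fetchall()
print(looped, set_based)
```

The two versions agree on this sample. They can differ if the same AssaultID appears on more than one assault row: the loop would then insert the matching door-holding rows once per assault row, while the `IN` form returns each of them once.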
9. entify clusters within the data file.
   4. Associate: used to apply different rules to the data file that identify associations within the data.
   5. Select attributes: used to apply different rules to reveal changes based on the inclusion or exclusion of selected attributes from the experiment.
   6. Visualize: used to see what the various manipulations produced on the data set in a 2D format, as scatter plot and bar graph output.

   Once the initial preprocessing of the data set has been completed, the user can move between the tabs to change the experiment and view the results in real time. Being able to move from one option to the next means that when a condition is exposed, it can be placed in a different environment and its visual representation changed immediately.

   Preprocessing

   To experiment with the application, the data set needs to be presented to WEKA in a format that the program understands. There are rules for the type of data that WEKA will accept. There are three options for loading data into the program:
   • Open File: allows the user to select files residing on the local machine or recorded medium.
   • Open URL: provides a mechanism to locate a file or data source from a different location specified by the user.
   • Open Database: allows the user to retrieve files or data from a database source provided by the user.

   There are restrictions on the type of data that can be accepted into the pro
10. fy specific information. The ability to pick from the available attributes allows users to separate different parts of the data set for clarity in the experimentation. The user can modify the attribute selection and change the relationship among the attributes by deselecting choices from the original data set. There are many different filtering options available within the Preprocess window, and the user can select among them based on need and the type of data present.

   Classify

   The user has the option of applying many different algorithms to the data set that would, in theory, produce a representation of the information that makes observation easier. It is difficult to identify in advance which of the options would provide the best output for the experiment. The best approach is to independently apply a mixture of the available choices and see which yields something close to the desired results. The Classify tab is where the user selects the classifier. Figure 4 shows some of the categories.

   [Figure 4: Classify tab (screenshot). Classifier tree under weka.classifiers: bayes, functions, lazy, meta, misc, trees, rules; entries under rules include ConjunctiveRule, DecisionTable, M5Rules, OneR and PART. Relation: weather, with attributes outlook, temperature, humidity and windy.]
11. gram. Originally the software was designed to import only ARFF files; newer versions allow different file types such as CSV, C4.5 and serialized instance formats. The extensions for these files include .csv, .arff, .names, .bsi and .data. Figure 3 shows an example of selecting the file weather.arff.

   [Figure 3: Open file dialog in the Preprocess tab (screenshot), with weather.arff selected in the data folder. Other files listed: cpu, cpu.with.vendor, iris, labor, segment-challenge, segment-test, soybean, weather.nominal. File types offered: Arff data files, CSV data files, C4.5 names files, serialized instances. Current relation: weather; instances: 14; attributes: 5.]

   Once the initial data has been selected and loaded, the user can select options for refining the experimental data. The options in the Preprocess window include optional filters to apply, and the user can select or remove different attributes of the data set as necessary to identi
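For readers who have not seen an ARFF file, the layout is a header naming the relation and its attributes, followed by an `@data` section with one comma-separated row per instance. The helper below is a minimal sketch, not part of WEKA: `to_arff` is a hypothetical function, the two data rows are illustrative, and values containing spaces or commas would additionally need quoting.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes is a list of (name, type) pairs, where type is either the
    string "numeric" or a list of allowed nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        # Nominal attributes list their allowed values in braces.
        spec = kind if isinstance(kind, str) else "{" + ", ".join(kind) + "}"
        lines.append(f"@attribute {name} {spec}")
    lines += ["", "@data"]
    lines += [",".join(str(value) for value in row) for row in rows]
    return "\n".join(lines) + "\n"


arff_text = to_arff(
    "weather",
    [
        ("outlook", ["sunny", "overcast", "rainy"]),
        ("temperature", "numeric"),
        ("humidity", "numeric"),
        ("windy", ["TRUE", "FALSE"]),
        ("play", ["yes", "no"]),
    ],
    [("sunny", 85, 85, "FALSE", "no"), ("overcast", 83, 86, "FALSE", "yes")],
)
print(arff_text)
```

The five attribute names match the weather relation shown in Figure 3; the last one (`play`) is the attribute a classifier would predict by default.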
12. has been selected, the user can change the attributes from one view to another, providing flexibility. Figure 9 shows the plot matrix view.

   [Figure 9: Visualize tab with the plot matrix (screenshot) for attributes AssaultID, IncidentNo, IncidentDate, Duration, LateTrains, TerminalCancel, EnrouteCancel, StationID, TroubleCode and TrainLine. Controls: PlotSize, PointSize, Jitter, Update, Select Attributes, SubSample; colour: TrainLine (Nom).]

   The scatter plot matrix gives the user a visual representation of the manipulated data sets for selection and analysis. The attributes are listed across the top and again from top to bottom, giving the user easy access to the area of interest. Clicking on a plot brings up a separate window with the selected scatter plot. The user can then look at a visualization of the selected attributes and select areas of the scatter plot, either with a selection window or by clicking on points within the plot, to identify the point's specific infor
13. inLine, Count(sd.StationID) as Incidents, Avg(CAST(Duration AS int)) as AvgDuration
      From TempIncidents sd inner join tblStationList sl on sd.StationID = sl.StationID
      Group by Station, TrainLine
      Having Count(sd.StationID) >= 2
      Order by Incidents desc

   Train Line: lists the train line with the number of occurrences and the average duration of the delay during a door holding incident.

      Select TrainLine, Count(TrainLine) as Incidents, Avg(CAST(Duration AS int)) as AvgDuration
      From TempIncidents
      Group by TrainLine
      Order by Incidents desc

   Queries based on Volume data

   The following queries are based on the volume data, using the volume views created earlier.

   Average number of riders by month

      select datepart(month, Date) as month, avg(Rider)
      from tblVolume2007
      group by datepart(month, Date)
      order by avg(Rider) desc

   Total number of assaults by month

      select datepart(month, IncidentDate) as month, count(IncidentNo) as Assault
      from vwSubwayData2007Volume
      where TroubleCode = '4011'
      group by datepart(month, IncidentDate)
      order by count(IncidentNo) desc

   Total number of DELAYED BY TRACK/WORK GANGS incidents by month

      select datepart(month, IncidentDate) as month, count(IncidentNo) as Incidents
      from vwSubwayData2007Volume
      where TroubleCode = '8204'
      group by datepart(month, IncidentDate)
      order by count(IncidentNo) desc

   Total number of non-assault incidents by month

      select datepart(month, IncidentDate) as month, count(IncidentNo) as Incidents
14. ity as Explorer, with drag and drop functionality. The advantage of this option is that it supports incremental learning from previous results. While all of the available options can be useful for different applications, the rest of this user guide focuses on the Explorer option. After the Explorer option is selected, the program starts and provides the user with a separate graphical interface.

   [Figure 2: Weka Explorer opening screen (screenshot). Preprocess tab with Open file, Open URL, Open DB and Filter controls; Current relation and Selected attribute panels all showing None; status: Welcome to the Weka Explorer.]

   Figure 2 shows the opening screen with the available options. At first, only the Preprocess tab in the top left corner can be selected, because the data set has to be presented to the application before it can be manipulated. After the data has been preprocessed, the other tabs become active. There are six tabs:
   1. Preprocess: used to choose the data file to be used by the application.
   2. Classify: used to test and train different learning schemes on the preprocessed data file under experimentation.
   3. Cluster: used to apply different tools that id
15. lties or clusters of occurrences within the data set and produce information for the user to analyze. There are a few options within the Cluster window that are similar to those described for the Classify tab: use training set, supplied test set and percentage split. The fourth option is classes to clusters evaluation, which compares how well the clusters match a pre-assigned class within the data. While in cluster mode, users have the option of ignoring some of the attributes in the data set. This can be useful if specific attributes cause the results to be out of range, or for large data sets. Figure 5 shows the Cluster window and some of its options.

   [Figure 5: Cluster tab (screenshot). Clusterers listed: EM, FarthestFirst, MakeDensityBasedClusterer, SimpleKMeans, Cobweb. Run information: scheme weka.clusterers.Cobweb -A 1.0 -C 0.0028209479177387815; relation tblSubwayData2007; 10 attributes, with TrainLine ignored; test mode: classes to clusters evaluation on training data.]

   Associate

   The associate t
16. mation. Figure 10 shows the scatter plot for two attributes and the points derived from the data set. There are a few options for viewing the plot that could be helpful to the user. It is formatted like an X-Y graph, yet it can show any of the attributes that appear on the main scatter plot matrix. This is handy when the scale of an attribute cannot be ascertained on one axis but can on the other. Within the plot, the points can be adjusted with a feature called jitter. This option moves the individual points slightly so that, where data points lie close together, users can reveal multiple occurrences hidden in the initial plot. Figure 11 shows an example of point selection and the results the user sees.

   [Figure 10: scatter plot of two attributes (screenshot), with Select Instance, Clear and Save controls.]

   [Figure 11: point selection in the scatter plot (screenshot); Y axis: Duration (Num); the instance details window lists IncidentNo, IncidentDate, LateTrains and Duration for the selected point.]

   There are a few options to manipulate the view for the identification of subsets or to separate the data points on the plot:
   • Polyline: can be used to segment different values for additional visual clarity on the plot. This is useful when there are many data points on the graph.
   • Rectangle: this tool is helpful to select instances within the graph fo
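The idea behind jitter can be shown with a short sketch: add a small random offset to each coordinate so that instances with identical values no longer sit exactly on top of one another. This illustrates the general technique only, under the assumption of a uniform offset; it is not WEKA's implementation, and the function name and sample points are invented.

```python
import random


def jitter(points, amount, seed=0):
    """Return a copy of points with each coordinate nudged by up to +/- amount."""
    rng = random.Random(seed)  # fixed seed so the display is repeatable
    return [
        (x + rng.uniform(-amount, amount), y + rng.uniform(-amount, amount))
        for x, y in points
    ]


# Three invented instances that share the same coordinates and overlap in the plot.
points = [(3.0, 180.0), (3.0, 180.0), (3.0, 180.0)]
jittered = jitter(points, amount=0.5)
print(jittered)
```

Jitter changes only what is drawn; the underlying data values stay the same, so it should be kept small relative to the axis scale.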
17. r copying or clarification.
   • Polygon: users can connect points to segregate information and isolate points for reference.

   This user guide is meant to help users become familiar with some of the features within the Explorer portion of the WEKA data mining software application, and is for informational purposes only. It is a summary of the user information found on the program's main web site. For a more comprehensive and in-depth version, users can visit the main site, http://www.cs.waikato.ac.nz/ml/WEKA, for examples and FAQs about the program.

   Microsoft SQL Server User Manual

   Relational Database

   Our team decided to load the provided data into a Microsoft SQL Server database. Mr. Washington gave us five text files, four of which are tab-delimited files of New York Transit Authority data.

   List of Files
   • Data field descriptions: a list of 11 column headers for the 2006 and 2007 data
   • 2006 incident data: 11-column list of incident data for year 2006
   • 2007 incident data: 11-column list of incident data for year 2007
   • Station list: two-column list of station ID and station descriptions
   • Trouble list: two-column list of trouble ID and trouble descriptions

   Later in the project, two more files were provided. These files had rider volume by day for years 2006 and 2007.

   List of Files
   • Volume2006.xls
   • Volume2007.xls

   Procedures to access and use Database

   The database server is an internal server, which mean
18. [Figure 4, continued (screenshot): Classify tab with Ridor among the listed rules classifiers; test option: 10-fold cross-validation; status: Problem evaluating classifier.]

   Again, there are several options to be selected inside the Classify tab. The Test options box gives the user the choice of four different test modes for the data set:
   1. Use training set
   2. Supplied test set
   3. Cross-validation
   4. Percentage split

   Any or all of the modes can be applied to produce results that the user can compare. Additionally, the Test options box has a dropdown menu of further items that, depending on the choice, provide output options such as saving the results to a file or specifying the random seed value to be used for the classification. By default, the classifiers in WEKA are trained to predict the last attribute in the data set, which is treated as the class. For a different attribute to be used, the user must select it in the options menu before testing is performed. Finally, once the results have been calculated, they are shown in the text box on the lower right. They can be saved to a file and retrieved later for comparison, or viewed within the window after changes have been made and different results derived.

   Cluster

   The Cluster tab opens the process that is used to identify commona
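The difference between the percentage split and cross-validation test modes is easier to see on instance indices. The sketch below is a simplified illustration with hypothetical helper names: it does no shuffling or stratification (real tools typically randomize first, which is what a random seed option controls), and the 14 instances and 66 percent figure are example values only.

```python
def percentage_split(n_instances, train_percent):
    """Split instance indices into (train, test) by a percentage, in order."""
    cut = round(n_instances * train_percent / 100)
    indices = list(range(n_instances))
    return indices[:cut], indices[cut:]


def k_folds(n_instances, k):
    """Yield (train, test) index lists; every instance is tested exactly once."""
    indices = list(range(n_instances))
    for fold in range(k):
        test = indices[fold::k]  # every k-th instance, starting at this fold
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


# One train/test pair for a percentage split.
train, test = percentage_split(14, 66)

# Ten train/test pairs for 10-fold cross-validation.
folds = list(k_folds(14, 10))
print(len(train), len(test), [len(t) for _, t in folds])
```

With a percentage split the model is built once and tested on the held-out remainder. With cross-validation it is built k times, and each instance is used for testing in exactly one fold, which gives a steadier estimate on small data sets at the cost of more computation.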
19. rom databases that are far too large to be analysed by hand. WEKA's users are ML researchers and industrial scientists, but it is also widely used for teaching. Our objectives are to:
   • make ML techniques generally available;
   • apply them to practical problems that matter to New Zealand industry;
   • develop new machine learning algorithms and give them to the world;
   • contribute to a theoretical framework for the field.
   [End of the text from the screenshot of the Weka Machine Learning Project home page.]

   Opening the program

   Once the program has been installed on the user's machine, it is opened from the program's start option, which depends on the user's operating system. Figure 1 is an example of the initial opening screen on a computer with Windows XP.

   [Figure 1: Chooser screen (screenshot). Weka GUI Chooser, Waikato Environment for Knowledge Analysis, Version 3.4.12, (c) 1999-2007 University of Waikato, New Zealand.]

   There are four options available on this initial screen:
   • Simple CLI: gives users without a graphical interface the ability to execute commands from a terminal window.
   • Explorer: the graphical interface used to conduct experimentation on raw data.
   • Experimenter: allows users to conduct different experimental variations on data sets and perform statistical manipulation.
   • Knowledge Flow: basically the same functional
20. s 2006 and 2007 contained information about stations and trouble codes from the station list and trouble code list, we wanted to create relationships between those tables. Here are the table structures.

   Incident table

      Column Name      Data Type      Length   Allow Nulls
      AssaultID        varchar        6        yes
      IncidentNo       varchar        6        yes
      IncidentDate     datetime       8        yes
      Duration         varchar        3        yes
      LateTrains       varchar        3        yes
      TerminalCancel   varchar        2        yes
      EnrouteCancel    varchar        2        yes
      StationID        int            4        yes
      TroubleCode      varchar        4        yes
      TrainLine        varchar        2        yes

   Train Station Table: tblStationList

      Column Name      Data Type
      StationID (key)  int
      Station          varchar

   Trouble List Table: tblTroubleList

      Column Name      Data Type
      TroubleID (key)  varchar
      TroubleDesc      varchar

   Volume Table: tblVolume2006 and tblVolume2007

      Column Name      Data Type      Length   Allow Nulls
      Date             smalldatetime  4        yes
      Rider            int            4        yes

   These six tables were created by executing SQL code. Here is an example of how the Incident table was created:

      CREATE TABLE SubwayIncidents.pcronin.tblSubwayData2006 (
         AssaultID varchar(6) NULL,
         IncidentNo varchar(6) NULL,
         IncidentDate datetime NULL,
         Duration varchar(3) NULL,
         LateTrains varchar(3) NULL,
         TerminalCancel varchar(2) NULL,
         EnrouteCancel varchar(2) NULL,
         StationID int NULL,
         TroubleCode varchar(4) NULL,
         TrainLine varchar(2) NULL
      )

   DTS

   After the tables were created with the desired data types and structure, we had to import the data from the text files into the relational tables inside SQL Server. We used DTS (Data Transformation Services) in Microsoft SQL Server to move the data from the text
21. s it can only be directly accessed within Pace University grounds. For access off the Pace campuses, the team would need to use the university's VPN dialer, which would give us network access as if we were inside the university. Once the dialer is downloaded and installed, we can connect using our Pace University-issued user accounts (email address). Now that we are virtually connected to the Pace University network, we can access the database server.

   There are a number of tools that can be used with Microsoft SQL Server. We narrowed our preferences down to two: Quest Toad for SQL Server (http://www.toadsoft.com/toadsqlserver/toad_sqlserver.htm) or SQL Server 2005 Express Edition (http://msdn2.microsoft.com/en-us/express/bb410792.aspx). Both software applications are free to use. Once the software is downloaded and installed, we can connect to the database server and to our specific database. Here is the connection information:

      Server Name: csis rose (172.20.138.37)
      Authentication: SQL Server Authentication
      Database Name: SubwayIncidents
      User Name: SubwayIncidents
      PW: (ask for password)

   Loading flat files into relational database

   Before loading data into our relational database system, we first had to consider some preliminary questions: table structure, data types and relational integrity. We decided to take the six data files and import them into their own tables. Deciding on data types was more complex. Because the two incident data file
22. t Mr. Washington provided us. The last step is to create relationships between them to protect data integrity. The main incident data are contained in two tables, tblSubwayData2007 and tblSubwayData2006; tblStationList and tblTroubleList are lookup (validation) tables. These tables provide full text descriptions for the coded data in the incident tables. Here are the graphical relationships.

   [Diagram: relationships between tblStationList (StationID int 4, Station varchar 256), tblTroubleList (TroubleID varchar 4, TroubleDesc varchar 256) and tblSubwayData2006 (AssaultID, IncidentNo, IncidentDate, Duration, LateTrains, TerminalCancel, EnrouteCancel, StationID, TroubleCode, TrainLine), all owned by pcronin.]

   The relationships are bound by the tables' Primary Key (PK) and Foreign Key (FK) fields. The relationships:
   1. tblStationList.StationID -> tblSubwayData2006.StationID
   2. tblTroubleList.TroubleID -> tblSubwayData2006.TroubleCode

   Creating Views

   To simplify comparing data between the two years, we thought it would be a good idea to create a view that combines the data. The code below shows how we union the data from the two tables:

      CREATE view vwSubwayData as
      Select 2006 as Year, AssaultID, IncidentNo, IncidentDa
23. te, Duration, LateTrains, TerminalCancel, EnrouteCancel, StationID, TroubleCode, TrainLine
      From tblSubwayData2006
      Union all
      Select 2007 as Year, AssaultID, IncidentNo, IncidentDate, Duration, LateTrains, TerminalCancel, EnrouteCancel, StationID, TroubleCode, TrainLine
      From tblSubwayData2007

   Most of the queries that we constructed below are based on this SQL Server view.

   Instead of creating relationships for the volume data, we created views that join the two volume data tables with the two incident tables by year. The volume data contains a date field that represents the day of the year, with the total ridership volume for that day. The views join the volume data with the incident data by the date.

   Here is the view creation for the 2006 data:

      Create view vwSubwayData2006Volume AS
      select s.*, v.Rider
      from tblSubwayData2006 s Inner Join tblVolume2006 v
      ON cast(datepart(month, IncidentDate) as varchar(2)) + cast(datepart(day, IncidentDate) as varchar(2)) + cast(datepart(year, IncidentDate) as varchar(4))
         = cast(datepart(month, Date) as varchar(2)) + cast(datepart(day, Date) as varchar(2)) + cast(datepart(year, Date) as varchar(4))

   Here is the view creation for the 2007 data:

      Create view vwSubwayData2007Volume AS
      select s.*, v.Rider
      from tblSubwayData2007 s Inner Join tblVolume2007 v
      ON cast(datepart(month, IncidentDate) as varchar(2)) + cast(datepart(day, IncidentDate) as varchar(2)) + cast(datepart(year, IncidentDate) as varchar(4))
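The punctuation of the join condition did not survive in this excerpt, so it is not clear whether the original concatenation put a separator between the month, day and year parts. The sketch below, with a hypothetical helper `concat_key`, shows why that detail matters: without a separator, two different dates can produce the same key and the join would match the wrong day's ridership.

```python
from datetime import date


def concat_key(d, sep=""):
    """Mimic month + day + year string concatenation, optionally with a separator."""
    return sep.join([str(d.month), str(d.day), str(d.year)])


jan_11 = date(2007, 1, 11)
nov_1 = date(2007, 11, 1)

# No separator: both dates collapse to the same string.
ambiguous = concat_key(jan_11) == concat_key(nov_1)

# With a separator the keys stay distinct.
with_separator = concat_key(jan_11, "/") == concat_key(nov_1, "/")

print(concat_key(jan_11), concat_key(nov_1), ambiguous, with_separator)
```

If the join is built this way, adding a separator, or comparing the month, day and year parts individually, avoids the collision.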
24. ware, it is best to select the latest version of the application offered on the site. The application is offered as a self-installing package; installation is a simple procedure that places the complete program on the end user's machine, ready to use once extracted.

   [Screenshot: the Weka Machine Learning Project home page of The University of Waikato, at http://www.cs.waikato.ac.nz/ml/, in Internet Explorer. Site menu: project, software, book, publications, people, related. Page text follows.]

   Weka Machine Learning Project

   An exciting and potentially far-reaching development in computer science is the invention and application of methods of machine learning. These enable a computer program to automatically analyse a large body of data and decide what information is most relevant. This crystallised information can then be used to automatically make predictions or to help people make decisions faster and more accurately. The overall goal of our project is to build a state-of-the-art facility for developing machine learning (ML) techniques and to apply them to real-world data mining problems. Our team has incorporated several standard ML techniques into a software "workbench" called WEKA, for Waikato Environment for Knowledge Analysis. With it, a specialist in a particular field is able to use ML to derive useful knowledge f
