        User Guide - SAP Service Marketplace
         Contents
1. [Screenshot: Events Data Source screen, with file_view.csv selected as the events file]

4. Click the Next button.

Describing Events Data

The screen Events Data Description lets you describe your Transaction data, offering you the same options as the screen Data Description.

For Sequence Coding to function properly, the Transaction data set must contain a variable that matches the primary key declared for the Reference data set. This variable is referred to as the "Join Column". Its name can be different, but its storage and values must be the same. The values of this variable need not be unique, since each Reference key can have 0, 1, or several associated transactions.

In addition to a suitable join column, the Transaction data set must have at least one datetime variable. Sequence Coding uses the datetime variable to order the transactions.

One of the datetime variables must be ordered and declared as such by setting the Order column to 1 for this variable in the description file.

When the data source is a database, InfiniteInsight retrieves the data with a query that uses an "order by" on the variable set as Order. When the data source is a file (.txt, .csv, ...), InfiniteInsight verifies that the variable set as Order is actually ordered in the file; if it is not, an error message is displayed.

For detailed procedures on how to set parameters on this screen, see Describing the Data.
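Before loading a file-based Transaction data set, it can be useful to check the two requirements above yourself. The following Python sketch is not part of InfiniteInsight. It assumes that "ordered" means non-decreasing timestamps, and the field names (SessionID, Session, Time), the timestamp format, and the sample rows are hypothetical.

```python
from datetime import datetime

def is_ordered(rows, time_field, fmt="%Y-%m-%d %H:%M:%S"):
    """Return True if the rows are in non-decreasing order of time_field."""
    previous = None
    for row in rows:
        current = datetime.strptime(row[time_field], fmt)
        if previous is not None and current < previous:
            return False
        previous = current
    return True

def unmatched_join_values(reference_rows, transaction_rows, ref_key, join_column):
    """Return the join values found in the transactions but not in the reference keys."""
    reference_keys = {row[ref_key] for row in reference_rows}
    return {row[join_column] for row in transaction_rows} - reference_keys

# Hypothetical miniature data sets: s3 has no transaction, which is allowed.
reference = [{"SessionID": "s1"}, {"SessionID": "s2"}, {"SessionID": "s3"}]
transactions = [
    {"Session": "s1", "Time": "1999-12-01 10:00:00"},
    {"Session": "s2", "Time": "1999-12-01 10:00:05"},
    {"Session": "s1", "Time": "1999-12-01 10:01:00"},
]

ordered = is_ordered(transactions, "Time")
orphans = unmatched_join_values(reference, transactions, "SessionID", "Session")
print("ordered:", ordered, "unmatched join values:", orphans)
```

Join values may repeat in the transactions, so the check only looks for values that have no Reference key at all.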
2. In the advanced parameters, keep 75% of the hits.

Note: To learn how to set the parameters, see the section To Set the Parameters in Scenario 1.

Selecting Sequence Coding Statistics

For this Scenario

In this scenario, you decide to calculate for each session which pages have been visited on the Web site and which page led the visitor to another. Adding the page transaction counts to the model will reveal more information about visitor behavior.

You decide to calculate for each session which pages were visited first and last on the Web site and which pages were visited in between. That way, you should be able to determine when a visitor is about to leave your Web site, and decide on which pages to make a $5 reduction offer to keep the visitor and encourage a purchase.

You must use the following settings:

- For the variable Page, select the function FirstLast, which creates two state columns for each session: one containing the first page visited, the other the last page visited.

Note: To learn more about Sequence Coding Statistics, see the section Selecting Sequence Coding Statistics in Scenario 1.

Checking the Transactions

For this Scenario

After the transactions are checked, Sequence Coding should ha
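To make the FirstLast function more concrete, here is a minimal Python sketch of the idea: for each session, keep the first and the last page visited. It is an illustration only, not the product's implementation. The field names, the sample rows, and the use of ISO-formatted timestamps (which sort correctly as strings) are assumptions.

```python
def first_last(transactions, join_column, time_field, page_field):
    """For each session, return the first and the last page visited."""
    # ISO-formatted timestamps sort chronologically as plain strings.
    ordered_rows = sorted(transactions, key=lambda row: row[time_field])
    result = {}
    for row in ordered_rows:
        entry = result.setdefault(row[join_column], {"first": row[page_field]})
        entry["last"] = row[page_field]
    return result

# Hypothetical transactions for two sessions
page_views = [
    {"Session": "s1", "Time": "1999-12-01 10:00:00", "Page": "/welcome.html"},
    {"Session": "s2", "Time": "1999-12-01 10:00:05", "Page": "/specials.html"},
    {"Session": "s1", "Time": "1999-12-01 10:01:00", "Page": "/shop/shipChart.html"},
]

states = first_last(page_views, "Session", "Time", "Page")
print(states)
```

A session with a single page view gets the same page as both its first and its last state.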
3. CUSTOMER

End User Documentation

Document Version: 1.0 - 2014-11

Table of Contents

Introduction to Application Scenarios
  Scenario 1
  Scenario 2
Introduction to Sample Files
Scenario 1: Segment Visitors to Understand Purchase Behavior Using File Counts
  Step 1 - Selecting the Data
    Selecting a Data Source
    Describing the Data
    Selecting Events Data
    Describing Events Data
  Step 2 - Defining the Modeling Parameters
    Setting Sequence Coding Parameters
    Selecting Sequence Coding Statistics
    Checking the Transactions
    Selecting Variables
4. Each variable is described by the fields detailed in the following table:

The Field    Gives information on
Name         the variable name, which cannot be modified
Storage      the type of values stored in this variable:
             - Number: the variable contains only "computable" numbers (be careful: a telephone number or an account number should not be considered a number)
             - String: the variable contains character strings
             - Datetime: the variable contains date and time stamps
             - Date: the variable contains dates
Value        the value type of the variable:
             - Continuous: a numeric variable from which mean, variance, etc. can be computed
             - Nominal: a categorical variable; this is the only possible value type for a string
             - Ordinal: a discrete numeric variable where the relative order is important
             - Textual: a textual variable containing phrases, sentences or complete texts
             Warning: When creating a text coding model, if there is not at least one textual variable, you will not be able to go to the next panel.
Key          whether this variable is the key variable or identifier for the record:
             - 0: the variable is not an identifier
             - 1: primary identifier
             - 2: secondary identifier
Order        whether this variable represents a natural order (0: the variable does not represent a n
5. checkout process. The presence of order5.tmpl indicates that a purchase has occurred. Since the goal of the analysis is to gain new insights into what behaviors lead to a purchase, these order pages and other similar information must be excluded from the analysis.

To Select a Target Variable
On the screen Selecting Variables, in the section Explanatory Variables Selected (left-hand side), select the variables you want to use as target variables.

[Screenshot: Selecting Variables screen]

To Exclude Explanatory Variables
1. On the screen Selecting Variables, click the button Open a Saved List located under the section Excluded Variables.

[Screenshot: Selecting Variables screen listing page variables such as Member/creditinfo.html, Member/signIn.html and contact/contactFaq.html]
6.     Setting the Number of Clusters
  Step 3 - Generating and Validating the Model
    Generating the Model
    Validating the Model
  Step 4 - Analyzing and Understanding the Model
    Segment Descriptions
Scenario 2: Predict End of Session Using Intermediate Sequences
  Step 1 - Selecting the Data
  Step 2 - Defining the Modeling Parameters
    Setting Sequence Coding Parameters
    Selecting Sequence Coding Statistics
    Checking the Transactions
    [illegible entry]
  Step 3 - Generating and Validating the Model
    Generating the Model
    Validating the Model
  Step 4 - Analyzing and Understanding the Model
    Contributions by Variables
    Significance O
7. For this Scenario

Use the description file file_view_desc.csv.

To Describe Events Data
1. On the screen Event Data Description, click the button Open Description. The following window opens:

[Screenshot: Load a Description for file_view.csv window]

2. In the window Load a Description, select the file file_view_desc.csv.
3. Click the OK button. The window Load a Description closes and the description is displayed on the screen Event Data Description.

[Screenshot: Event Data Description screen showing the description file_view_desc.csv]

Note that the Order column is set to 1 for the Time variable, indicating that this variable is used as a natural order.

4. Click the Next button.

Step 2 - Defining the Modeling Parameters

Setting Sequence Coding Parameters

The screen Sequence Analysis Parameters Settings enables you to set some Sequence Coding parameters by performing the following tasks:
8. Note: The folder selected by default is the same as the one you selected on the screen Data to be Modeled.

4. In the Description field, select the file containing the data set description with the Browse button.
5. Click the OK button. The window Load a Description closes and the description is displayed on the screen Data Description.

[Screenshot: Data Description screen showing the description session_purchase_desc.csv]

6. Click the Next button.

Selecting Events Data

The screen Events Data lets you specify the data source to be used as the Transaction data set.

For this Scenario

- The Folder field should already be filled in with the name of the data source that you specified on the Data to be Modeled screen.
- Select the file file_view.csv.

To Select Events Data
1. Select the format of your data source (Text Files, ODBC, ...).
2. In the Folder field, specify the folder where your data source is stored.
3. In the Events field, specify the name of your data source.

[Screenshot: Events Data Source screen]
9. data set.

For this Scenario

The model generated possesses:

- A quality indicator KI equal to 0.98
- A robustness indicator KR equal to 0.99

[Screenshot: Training the Model, Overview. Model: Purchase session purchase. Initial Number of Variables: 2. Number of Selected Variables: 110. Number of Records: 50581. Building Date: 2012-05-15 14:57:30. Learning Time: 1mn 44s. Engine Name: Kxen.SmartSegmenter. Minimum and Maximum Requested Number of Clusters: 10. Suspicious Variables Detected: Yes. Nominal target frequencies: 0 = 95.95%, 1 = 4.05%. Initial and Final Number of Clusters: 10.]

This means that Clustering found a reliable grouping (KR is greater than 0.90) that does a reasonable job of partitioning the purchasing visitors and the non-purchasing visitors (KI of 0.98). It is safe to look at the descriptive results of the segmentation to gain insight.
10. list of session IDs and whether each session has led to a purchase or not. This will be referred to as the Reference data set for Sequence Coding. A Sequence Coding Reference data set must have a single-variable unique primary key. If the primary key is non-unique or spread out over several variables, Sequence Coding will not function properly.

To Select a Data Source
1. On the screen Data to be Modeled, select the data source format to be used (Text files, ODBC, ...).

[Screenshot: Data to be Modeled screen]

2. Use the Browse button to the right of the Folder field to select the folder where you have saved the sample files.
3. Click the Browse button next to the Estimation field and select the file session_purchase.csv. The name of the file will appear in the Estimation field.
4. Click the Next button.

Describing the Data

Why Describe the Data Selected?

In order for the InfiniteInsight features to interpret and analyze your data, the data must be described. To put it another way, the description file must specify the nature of each variable, determining their:

- Storage f
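The requirement of a single-variable unique primary key on the Reference data set can be checked before modeling. The Python sketch below is a hypothetical pre-check, not an InfiniteInsight feature; the field name SessionID and the sample rows are assumptions.

```python
from collections import Counter

def duplicate_keys(rows, key_field):
    """Return the key values that appear more than once in the Reference data set."""
    counts = Counter(row[key_field] for row in rows)
    return sorted(key for key, count in counts.items() if count > 1)

# Hypothetical Reference rows: one session ID is repeated by mistake.
reference_rows = [
    {"SessionID": "s1", "Purchase": 0},
    {"SessionID": "s2", "Purchase": 1},
    {"SessionID": "s1", "Purchase": 0},
]

duplicates = duplicate_keys(reference_rows, "SessionID")
print("duplicate keys:", duplicates)
```

An empty list means the key is unique and the data set can be used as a Reference data set in this respect.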
11. model has been generated, you must verify its validity by examining the performance indicators:

- The quality indicator KI allows you to evaluate the explanatory power of the model, that is, its capacity to explain the target variable when applied to the training data set. A perfect model would possess a KI equal to 1 and a completely random model would possess a KI equal to 0.
- The robustness indicator KR defines the degree of robustness of the model, that is, its capacity to achieve the same explanatory power when applied to a new data set. In other words, the degree of robustness corresponds to the predictive power of the model applied to an application data set.

For this Scenario

The model generated possesses:

- A quality indicator KI equal to 0.70
- A robustness indicator KR equal to 0.98

[Screenshot: Model Overview. Model: ksc_Session_continue_session_purchase. Data Set: session_purchase.csv. Initial Number of Variables: 2. Building Date: 2012-07-09 14:41:35. Learning Time: 4mn 10s. Engine Name: Kxen.RobustRegression. Monotonic Variables Detected: Yes. Data Aggregation: Kxen.SequenceCoder. Events Data: file_view.csv.]
12. value is between the two selected dates will be used.

Between two date columns
    Only the events for which the Log Date Column value is between the values of the two selected date columns will be used.
    For example, you can select the date columns corresponding to the beginning and the end of a trial period, dates that can be different for each customer.

Relative to a date column
    Only the events for which the Log Date Column value fits in the range defined with respect to the selected date column will be used.
    For example, you can use the purchase date of a credit card as the reference and select all events that occurred in the three months leading to this date.

WARNING: Be careful when choosing a period. The selected period must contain events existing in the data set, or else you will obtain aberrant results for your model (negative KI, KR equal to 1, ...).

To Use All the Events
Keep the Infinite option.

[Screenshot: Time Window section with the Infinite option selected]

To Use Only the Events Occurring in a Fixed Time Window
1. Check the Fixed option.

[Screenshot: Time Window section with the Fixed option selected and the From and To fields filled in]

2. In the From field, select the date before which no events should be used.
3. In th
13. Scenario 1

You will start by using Sequence Coding to create counts of each Web page viewed by each visitor, followed by a targeted segmentation with "purchase" as the target. This will give you a simple description of the different groups browsing your Web site, and the different conversion rates for each group.

Scenario 2

In this scenario, you want to predict when a visitor is going to leave your Web site. Your idea is to offer a $5 coupon to visitors who are likely to leave, in the hope of increasing the site stickiness. To achieve that, you will create a Sequence Coding model using intermediate sequences with the FirstLast option for the pages viewed. The intermediate sequence option will automatically create an appropriate target variable for determining which behaviors indicate the end of a session.

Introduction to Sample Files

This data set contains a single day of Web traffic from an e-commerce site in December 1999. The site content was served by a Broadvision server, but no "cookies" or login was required, making the sessions effectively anonymous.

File
session_purchase.csv
session_purchase_desc.csv
file_view.csv
file_view_d
14. f Categories

Introduction to Application Scenarios

In these scenarios, you are the Marketing Director of an e-commerce company and you want to increase the profitability of your Web site. You have the budget to launch a major marketing initiative, but you're not sure what kind of campaign would be the most effective. Due to market pressures, you only have the time and money to test a few campaigns before launching a major initiative. The two key metrics used to measure the performance of the Web site are the "conversion rate" and "stickiness". The conversion rate of a site is the percentage of visits that result in a purchase. At this time, your Web site has a conversion rate of 4%, meaning that 4 out of every 100 visitors purchase at least one item. The stickiness of a Web site is a measure of the number of pages viewed by each visitor. The more pages a visitor views, the more likely they are to purchase something. Your Web site is averaging about 10 pages per visit.

In order to achieve rapid insight into the different groups of visitors to your Web site, you have decided to use InfiniteInsight Modeler - Segmentation/Clustering to group the population with respect to their buying behavior and site abandonment. The goal of the analysis is to get descriptions of the groups of visitors who tend to purch
15. [Screenshot, continued: Total Number of References: 50561. Number of Matching References: 50537. Number of Processed Transactions: 532498. Total Number of Transactions: 532498. Nominal target frequencies: 0 = 9.58%, 1 = 90.42%. Performance Indicators for the target ksc Session continue: KI, KR.]

This means that Classification/Regression found a robust model (KR is greater than 0.90) that does a reasonable job of predicting the end of a session (KI of 0.70). It is safe to look at the variable contributions to gain insight.

Step 4 - Analyzing and Understanding the Model

Contributions by Variables

The following graph presents the variable contributions:

[Figure: Contributions by Variables chart (chart type: Maximum Smart Variable Contributions), with the variables on the horizontal axis]

The pages that have the most impact (positive or negative) on the buying act are listed in the following table:

Page viewed           This variable indicates
KSC Page LastState    the last page the visitor has viewed befo
16. ant to calculate on transaction or event data.

For this Scenario

You decide to calculate for each session which pages have been visited on the Web site. That way, you should be able to determine and understand which pages led the visitors to make a purchase.

You must use the following settings:

- For the variable Page, select the function Count, which creates a state column for each page visited.

To Select Sequence Coding Statistics
1. The Sequence Analysis Variables Selection for Functions screen lists all the variables for which statistics can be calculated. For each variable listed, select the functions to use. You can choose among the three functions Count, CountTransition and FirstLast.

[Screenshot: Sequence Analysis Variables Selection for Functions screen]

2. Click the Next button.

Operations Definition

Several standard Sequence Coding columns are created for each reference ID. For reference IDs that have no transactions associated with them, the standard Sequence Coding columns will have null values.

KSC Start Date: The timestamp of the first transaction in the log for each reference ID
KSC End Date: The timestamp of the last transaction in the log for each reference ID
KSC TotalTime: The seconds between the KSC Start Date and KSC End Date
KSC Number Events: The number of transactions in t
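The Count function and the standard columns described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions, not the product's algorithm: the field names, the timestamp format, the sample rows, and the dictionary keys standing in for the generated columns are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

def aggregate(transactions, join_column, time_field, page_field,
              fmt="%Y-%m-%d %H:%M:%S"):
    """Aggregate the transactions into one summary per reference ID."""
    by_reference = defaultdict(list)
    for row in transactions:
        by_reference[row[join_column]].append(row)

    summary = {}
    for reference_id, rows in by_reference.items():
        times = [datetime.strptime(row[time_field], fmt) for row in rows]
        start, end = min(times), max(times)
        summary[reference_id] = {
            "start_date": start,                          # first transaction
            "end_date": end,                              # last transaction
            "total_time": (end - start).total_seconds(),  # seconds in between
            "number_events": len(rows),                   # transaction count
            "page_counts": Counter(row[page_field] for row in rows),
        }
    return summary

# Hypothetical log for two sessions
log = [
    {"Session": "s1", "Time": "1999-12-01 10:00:00", "Page": "/welcome.html"},
    {"Session": "s1", "Time": "1999-12-01 10:01:00", "Page": "/specials.html"},
    {"Session": "s2", "Time": "1999-12-01 10:00:30", "Page": "/welcome.html"},
    {"Session": "s1", "Time": "1999-12-01 10:01:30", "Page": "/welcome.html"},
]

sessions = aggregate(log, "Session", "Time", "Page")
print(sessions["s1"]["number_events"], sessions["s1"]["total_time"])
```

In this sketch, a reference ID without any transaction is simply absent from the result, whereas the product fills the standard columns with null values.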
17. ase items frequently, and the indicators that a session is about to end. You already know the following basic facts about your Web site:

- An average of 50,000 visitors come to the Web site each day.
- For the 2,000 sessions that result in a purchase each day, the average amount spent is $181.
- The average profit margin for the Web site is 5%, so each purchase results in an average profit of $9.05, resulting in $18,100 of profit per day.
- There are four main entry points for the site: the home page, the members home page, the sweepstakes page, and the specials page.
- The checkout process has five steps, all with the word "order" in the file name.
- Your site does not use "cookies" or require a login for your members, so each session is effectively anonymous unless a purchase is made.

The information that is available for analysis consists of the Web logs. Your DBA has pulled out a list of the sessions from a single day of traffic, along with a flag indicating whether the session resulted in a purchase (the existence of "order5.tmpl" in a session indicates a purchase). Along with the list of sessions, the parsed log from the day is also available. Since the information from the Web log is not aggregated for analysis, you will need to use InfiniteInsight Explorer - Sequence Coding before running InfiniteInsight Modeler - Segmentation/Clustering or InfiniteInsight Modeler - Regression/Classification.

IN THIS CHAPTER
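The figures quoted above are consistent with each other, as the following short Python calculation shows. The variable names are illustrative; the input values are the ones given in the list of basic facts.

```python
# Inputs taken from the basic facts about the Web site
visitors_per_day = 50_000
purchases_per_day = 2_000
average_amount_spent = 181.0   # dollars per purchase
profit_margin = 0.05           # 5%

# Derived figures
conversion_rate = purchases_per_day / visitors_per_day
profit_per_purchase = average_amount_spent * profit_margin
daily_profit = purchases_per_day * profit_per_purchase

print(f"conversion rate: {conversion_rate:.0%}")
print(f"profit per purchase: ${profit_per_purchase:.2f}")
print(f"profit per day: ${daily_profit:,.0f}")
```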
18. at will indicate the end of the time window.
In the last drop-down list, select the unit to be used to define the time window.

For example, if you have set the parameters Date: CardPurchaseDate, Between -3 and 0 Month, only events occurring in the three months leading to the date of purchase will be kept for each customer.

Understanding Advanced Parameters

The advanced parameters allow you to configure the following elements:

- the prefix to be added to the variables generated by Sequence Coding,
- the location where the temporary files generated by the modeling are stored,
- the amount of information that will be kept for the modeling.

Sequence Coding Generated Variable Prefix

You can define a specific prefix that will be used to identify the variables created by InfiniteInsight Explorer. By default, this prefix is set to ksc.

[Screenshot: Advanced Sequence Analysis Parameters screen, with the Sequence Analysis Generated Variable Prefix set to ksc]

Storage Type

When creating a model, Sequence Coding generates large quantities of temporary columns. You can select whether the generated data will be stored in memory or on disk.

The option In memory is selected by default.
19. atural order; 1: the variable represents a natural order). If the value is set to 1, the variable is used in SQL expressions in an "order by" condition.
             There must be at least one variable set as Order in the Event data source.
             Warning: If the data source is a file and the variable stated as a natural order is not actually ordered, an error message will be displayed before model checking or model generation.
Missing      the string used in the data description file to represent missing values (e.g. "999" or "Empty", without the quotes)
Group        the name of the group to which the variable belongs. Variables of the same group convey the same information and thus are not crossed when the model has an order of complexity over 1. This parameter will be usable in a future version.
Description  an additional description label for the variable
Structure    this option allows you to define your own variable structure, that is, to define the grouping of the variable categories

Viewing the Data

To help you validate the description when using the Analyze option, you can display the first hundred lines of your data set.

To View the Data
1. Click the button View Data. A new window opens displaying the first lines of the data set.

[Screenshot: Data Set window, Data tab]

2 u
20. [Screenshot, continued: Selecting Variables screen with the Open a Saved List button]

2. The window Load Excluded Variables List opens. In the Variables field, select the file containing the variables to skip.

[Screenshot: Load Variables List window, with the file session_purchase_skip.csv selected]

3. Click the OK button. The window closes and the list of excluded variables is populated.

Setting the Number of Clusters

Before generating the model, you need to set the number of clusters you want to create.

For this Scenario

- Set the number of clusters to 10, which is the default number.

To Set the Number of Clusters
In the panel Summary of Modelling Parameters, type the number of clusters you want to generate in the field Find the best number of clusters in this range.

[Screenshot: Summary of Modelling Parameters panel]

Step 3 - Generating and Validating the Model

Generating the Model

Once the modeling parameters are defined, you can generate the model. Then you must validate its performance using the quality indicator KI and the robustness indicator KR:

- If the model is sufficiently powerful, you can analyze the res
21. Step 4 - Analyzing and Understanding the Model

Segment Descriptions

On the screen Cross Statistics, you can look at the logical definition and/or the cross statistics of each variable to understand what kind of visitors belong to each cluster. Three clusters are particularly informative for your business problem, which is to determine which kind of population you should try to attract to increase your profit:

- the two clusters that have the highest conversion rates,
- the cluster that has the lowest conversion rate.

The chart below summarizes these clusters and gives each of them a label based on the cluster definition. The column headers and the third label are not legible in the source; the first three columns appear to be the cluster frequency, the conversion rate, and the page and range bounds that define the cluster.

1.9%     31.4%    /shop/shipChart.html  0 5           Shippers
3.5%     25.9%    /welcome.html  1 20                 Members
11.8%    0.1%     /holiday/holidaySweeps.tmpl  1

The cluster Shippers is defined by sessions in which the shipping chart (/shop/shipChart.html) has been seen between 1 and 5 times. Actually, this cluster does not give you much information. It just tells you that visitors who go to the shipping chart will probably make a purchase, which is rather logical: if you don't intend to buy, why would you look at the shipping information?

The cluster Members is more informative. It shows that people visiting the member home page (/welcome.html) are more likely to buy. This is an interesting piece of information. It means that members are m
22. e To field, select the date after which no events should be used.

To Use Only the Events Occurring Between Two Date Columns
1. Check the option Between two date columns.

[Screenshot: Time Window section with the option Between two date columns selected]

2. In the From field, select the date column containing the date before which no events should be used.
3. In the To field, select the date column containing the date after which no events should be used.

To Use Only the Events Occurring in a Range Relative to a Date Column
1. Check the option Relative to a date column.

[Screenshot: Time Window section with the option Relative to a date column selected]

2. In the Date list, select the column that contains the date to use as a reference for the time window.
3. In the Between field, enter the number of units that will indicate the start of the time window. The following table sums up the values you can use to define the beginning of the time window:

Value               Significance
negative integer    the time window begins before the reference date
0                   the time window begins at the reference date
positive integer    the time window begins after the reference date

4. In the and field, enter the number of units th
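The option Relative to a date column can be pictured as a filter on each event. The Python sketch below illustrates the idea for a window expressed in months. It is not the product's implementation: the helper names are hypothetical, and treating both ends of the window as inclusive is an assumption.

```python
import calendar
from datetime import datetime

def add_months(moment, months):
    """Shift a datetime by a number of months, clamping the day to the month length."""
    index = moment.year * 12 + (moment.month - 1) + months
    year, month_index = divmod(index, 12)
    month = month_index + 1
    day = min(moment.day, calendar.monthrange(year, month)[1])
    return moment.replace(year=year, month=month, day=day)

def in_relative_window(event_time, reference_date, start, end):
    """True if event_time lies between reference_date + start and + end months."""
    window_start = add_months(reference_date, start)
    window_end = add_months(reference_date, end)
    return window_start <= event_time <= window_end

# Hypothetical reference date, with a window of the three months leading to it
purchase_date = datetime(2012, 7, 9)
kept = in_relative_window(datetime(2012, 5, 1), purchase_date, -3, 0)
too_early = in_relative_window(datetime(2012, 3, 1), purchase_date, -3, 0)
too_late = in_relative_window(datetime(2012, 8, 1), purchase_date, -3, 0)
print(kept, too_early, too_late)
```

A negative start value places the beginning of the window before the reference date, as described in the table above.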
23. (Screenshot: advanced parameters panel, with Storage Used for Holding Counting Information and Filtering: Percentage of Hits Kept set to 75%.)

Understanding InfiniteInsight Explorer - Sequence Coding Parameters

Joining Your Data
To aggregate the reference data with the events data, you have to join both tables and indicate which column of each table corresponds to the reference ID.

In the fields Columns for Join, select the variables corresponding to the customer ID in both data sets. The information contained in both selected variables must be the same.

In the field Log Date Column, select the variable corresponding to the date and/or time of the log data.

Calculating the Intermediate Sequences
The mode Intermediate Sequences provides you with additional information about the transitions and sequences existing in your data sets:
- order of the steps
- details of the steps
- continuity of the session for each step

Filtering the Events by Time Window
The section Time Window allows you to filter the events on which the model will be built by setting a period defined either by fixed dates or by values existing in the data set. The following options are available to filter the events data set:

Option     Description
Infinite   No time window is defined; all the events will be used.
Fixed      Only the events for which the Log Date Column
24. esc csv
    session purchase skip csv
    session continue skip csv

Description
    list of sessions and binary purchase target (50581 rows)
    description for session purchase csv
    log of files requested from Broadvision server (532860 rows)
    description for file view csv
    variable skip list for Scenario 1 (these are the variables where the value would not be known until the session had ended)
    variable skip list for Scenario 2

These sample files can be downloaded from the InfiniteInsight Sample Files Download Center: http://www.kxen.com/sample_data

Scenario 1: Segment Visitors to Understand Purchase Behavior Using File Counts

1. In the InfiniteInsight main menu, select the option Perform a Sequence Analysis in the Explorer section.
   (Screenshot: KXEN InfiniteInsight main menu, version 6.1.0, showing the Explorer and Modeler sections.)
2. The screen Add a Modeling Feature is displayed.
   (Screenshot: Add a Modeling Feature screen.)
25. he log associated with each reference ID.

In addition to the standard InfiniteInsight Explorer - Sequence Coding columns, three types of operations are available:
- Count
- Count the transitions
- First and last

Count
When you select the Count option, Sequence Coding creates a new column for each value of the inserted variables.

Count encodes the sequences using one column per valid category in the specified nominal column. Each valid category is referred to as a "state". Categories that are seen only once for the transactions associated with the reference ID present in the Estimation data set are discarded.

CountTransition
When you select the CountTransition option, Sequence Coding creates a new column for each transition of categories in the selected data set.

CountTransition encodes the sequences using one column per valid pairwise category transition in the specified column. Each valid category transition is referred to as a "state transition". State transitions that are seen only once for the transactions associated with the reference ID present in the Estimation data set are discarded. A separate KxOther column will be created for rare transitions, using the threshold set by the Filter slider bar, in the same way a KxOther column is created for the counts.

FirstLast
The FirstLas
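The Count and CountTransition encodings described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Sequence Coding implementation: the log rows, the `encode_sessions` helper and the output layout are hypothetical, and the rules that discard states seen only once or group rare ones under KxOther are deliberately left out.

```python
from collections import Counter, defaultdict

# Hypothetical transaction log: (session id, timestamp, page), in any order.
log = [
    ("s1", 1, "home"), ("s1", 2, "cart"), ("s1", 3, "home"), ("s1", 4, "cart"),
    ("s2", 1, "home"), ("s2", 2, "help"),
]

def encode_sessions(rows):
    """Count states and state transitions for each session."""
    # Order the events of each session by timestamp before encoding.
    sessions = defaultdict(list)
    for session_id, _timestamp, page in sorted(rows, key=lambda r: (r[0], r[1])):
        sessions[session_id].append(page)

    encoded = {}
    for session_id, pages in sessions.items():
        encoded[session_id] = {
            # One counter entry per state (page).
            "count": Counter(pages),
            # One counter entry per pair of consecutive states.
            "count_transition": Counter(zip(pages, pages[1:])),
        }
    return encoded

encoded = encode_sessions(log)
```

Each key of `count` would correspond to one state column, and each key of `count_transition` to one state transition column, for the session in question.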
26. ly analyze the sequences or add extra transformations such as a Classification/Regression (InfiniteInsight Modeler - Regression/Classification) or a Clustering/Segmentation (InfiniteInsight Modeler - Segmentation/Clustering).

IN THIS CHAPTER
Step 1 - Selecting the Data
Step 2 - Defining the Modeling Parameters
Step 3 - Generating and Validating the Model
Step 4 - Analyzing and Understanding the Model

Step 1 - Selecting the Data

To know how to select and describe the data, go to sections Selecting the Data (on page 8) and Describing the Data (on page 9) in Scenario 1.

For this Scenario
- Select the Random cutting strategy.
- Use the file session purchase csv as the reference file and use the file session purchase desc csv as its description file.
- Select the file file view csv and use the description file file view desc csv.

Step 2 - Defining the Modeling Parameters

Setting Sequence Coding Parameters

For this Scenario
- Select the SessionID column as the join column for both the log and reference data sets.
- Select Time as the Log Date Column.
- Check the option Intermediate Sequences
27. m  contactsap

© 2014 SAP SE or an SAP affiliate company. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or for any purpose without the express permission of SAP SE or an SAP affiliate company. The information contained herein may be changed without prior notice.

Some software products marketed by SAP SE and its distributors contain proprietary software components of other software vendors. National product specifications may vary.

These materials are provided by SAP SE or an SAP affiliate company for informational purposes only, without representation or warranty of any kind, and SAP or its affiliated companies shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP or SAP affiliate company products and services are those that are set forth in the express warranty statements accompanying such products and services, if any. Nothing herein should be construed as constituting an additional warranty.

SAP and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP SE (or an SAP affiliate company) in Germany and other countries. All other product and service names mentioned are the trademarks of their respective companies. Please see htt W Sap C X additional trademark information and notices.
28. (Screenshot: Add a Modeling Feature screen, with the options Add a Classification / Regression and Standalone Data Transformation, and the buttons Cancel and Previous.)

3. Click on the option Add a Clustering.

Note: When building a model, you can either simply analyze the sequences or add extra transformations such as a Classification/Regression (InfiniteInsight Modeler - Regression/Classification) or a Clustering/Segmentation (InfiniteInsight Modeler - Segmentation/Clustering).

IN THIS CHAPTER
Step 1 - Selecting the Data
Step 2 - Defining the Modeling Parameters
Step 3 - Generating and Validating the Model
Step 4 - Analyzing and Understanding the Model

Step 1 - Selecting the Data

IN THIS CHAPTER
Selecting a Data Source
Describing the Data
FECE eN Dc RE Omer
Describing Events Data

Selecting a Data Source

For this Scenario
The file session purchase csv contains a
29. nt Visitors to Understand Purchase Behavior Using File Counts, Step 2 - Defining the Modeling Parameters

(Screenshot: KXEN Modelling Assistant, New Model with Sequence Analysis, showing the Sequence Analysis Generated Variable Prefix field and the option Storage Used for Holding Counting Information: In Memory or On Disk.)

Filtering the Events
The Filtering option allows you to group rare categories into a single category labeled KxOther. It is very common for transaction logs to have many infrequently occurring categories that by themselves will not make reliable predictors. A predictive benefit can often be achieved by combining these rare categories into a single group. The Filtering slider allows you to select the categories to keep as separate columns, based on a percentage of the overall transaction log. The categories corresponding to the remaining percentage of transactions are grouped in the KxOther column, which is automatically generated by InfiniteInsight Explorer - Sequence Coding.

(Screenshot: Advanced Sequence Analysis Parameters panel, showing the Generated Variable Prefix, the storage option (In Memory or On Disk), the Filtering slider from 0 to 100 with Percentage of Hits Kept at 90%, and the option Filter Transitions greater than.)

For example, if you se
30. ore likely to make a purchase than other visitors. So increasing the number of members should increase your profit.

The cluster Sweepstakers gives you information on a previous attempt at increasing the number of purchases through a sweepstake. You can see that only 0.1% of the people visiting the sweepstake page actually make a purchase. You can infer from this that your previous campaign had the opposite effect to the one expected.

Scenario 2: Predict End of Session Using Intermediate Sequences

1. In the InfiniteInsight main menu, select the option Perform a Sequence Analysis in the Explorer section.
   (Screenshot: KXEN InfiniteInsight main menu, version 6.1.0, showing the Explorer, Modeler and Toolkit sections.)
2. The screen Add a Modeling Feature is displayed.
   (Screenshot: Add a Modeling Feature screen.)
3. Click on the option Add a Classification / Regression.

Note: When building a model you can either simp
31. ormat: number (number), character string (string), date and time (datetime) or date (date).

Notes
- When a variable is declared as date or datetime, the KXEN Date Coder feature (KDC) automatically extracts date information from this variable, such as the day of the month, the year, the quarter and so on. Additional variables containing this information are created during the model generation and are used as input variables for the model.
- KDC is disabled for Time Series.

Type: continuous, nominal, ordinal or textual.

For more information about data description, see Types of Variables and Storage Formats in the Introductory Guide to InfiniteInsight.

How to Describe Selected Variables

To describe your data, you can:
- Either use an existing description file, that is, one taken from your information system or saved from a previous use of InfiniteInsight features.
- Or create a description file using the Analyze option available in InfiniteInsight. In this case, it is important that you validate the description file obtained. You can save this file for later re-use. If you name the description file KxDoc_<SourceFileName>, it will be automatically loaded when you click the Analyze button.

Important: The description file obtained using the Analyze option results from the analysis of the first 100 lines of the initial data file. To avoid bias, we encourage you to shuffle your data set before performing this analysis.
32. ponses that it provides in relation to your business issue.
- Otherwise, you can modify the modeling parameters in such a way that they are better suited to your data set and your business issue, and then generate new, more powerful models.

To Generate the Model
On the screen Summary of Modelling Parameters, click the Generate button.
The screen Training the Model appears and the model is generated. A progress bar allows you to follow the process.

(Screenshot: KXEN Modelling Assistant, Training the Model screen, with the messages "Beginning of learning for Default" and "Building transition vectors" and the button Stop Current Task.)

Validating the Model

Once the model has been generated, you must verify its validity by examining the performance indicators:
- The quality indicator KI allows you to evaluate the explanatory power of the model, that is, its capacity to explain the target variable when applied to the training data set. A perfect model would possess a KI equal to 1 and a completely random model would possess a KI equal to 0.
- The robustness indicator KR defines the degree of robustness of the model, that is, its capacity to achieve the same explanatory power when applied to a new data set. In other words, the degree of robustness corresponds to the predictive power of the model applied to an application
33. re ending his session.

KSC Last duration               duration of the session from the first page viewed to the previous state
KSC LastStepNumber              the number of pages the visitor has viewed before ending his session
Count holidaySweepsEntry html   the number of times the page holidaySweepsEntry html (access to holiday promotions) has been viewed

The impact of each page on the purchase is detailed in section Significance of Categories.

Significance of Categories

KSC Page LastState

This is by far the strongest predictor. This is similar to a low order Hidden Markov Model, where the current state is used to predict the next one.

Last duration and LastStepNumber

The length of the session and the number of pages viewed are also important. If the visitor has viewed only one page, he has not yet entered the site and may end his session because the site does not seem of interest to him. If he has viewed more than 12 pages, he has probably found what he was looking for and will end his session. If he has seen between 2 and 11 pages, he is probably shopping and thus should continue his session.

Count holidaySweepsEntry html

If the page has been viewed, it is a good indicator that the session will continue. Since this page is the entry point of a holiday promotion, the visitor will at least go to the promotion page.

WWW Sap co
34. (Screenshot: New Model with Sequence Analysis, Model Checking window, with the message "Checking transactions file" and the button Stop Current Task.)

2. When the process is over, click the button Show Detailed Log. The number of columns created by Sequence Coding is indicated.

   (Screenshot of the detailed log: "KSC kept 98 state columns for variable Page", "Date Coder module inserted in the processing chain", "Checking is Finished".)

3. Click the Next button.

Selecting Variables

Once the reference data set, the events data set and their descriptions have been entered, you must select different variables:
- one or more Target Variables,
- possibly a Weight Variable,
- and the Explanatory Variables.

For this Scenario
- Keep Purchase as the target.
- Use the session continue skip csv file to select the variables to exclude. This list of variables includes the information that is not known about a session until a purchase has occurred or is very likely to occur. For this Web site, the checkout process included five order pages. The presence of any of the five order pages in the log indicates that they have already started the
35. t option creates two columns: the categories of the selected variable from the first and last transactions in the log for each reference ID, called FirstState and LastState respectively. The FirstState and LastState columns are created automatically when either the Count or CountTransition options are selected.

Checking the Transactions

At this stage, InfiniteInsight analyzes the data sets and creates a number of new variables, or columns. Depending on which operations you chose during the previous step, Sequence Coding creates:
- four standard columns: ksc Start Date, ksc End Date, ksc TotalTime, and ksc Number Events,
- one column for each state, if you have selected Count,
- one column for each transition, if you have selected CountTransitions,
- two columns, FirstState and FinalState, if you have selected Count, CountTransitions, or FirstLast,
- six columns, LastStepNumber, Last date time, Last duration, Session Continue, LastState, and NextState, if you have selected Intermediate Sequences.

For this Scenario
After the transactions are checked, Sequence Coding should have kept 99 state columns for the Page variable, plus the four standard columns and the FirstState and LastState columns.

To Check the Transactions
1. During the model checking, a progress bar is displayed.
   (Screenshot of the progress window.)
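To make the standard columns and the FirstLast columns more concrete, here is a minimal Python sketch that derives them for the events of a single reference ID. The dictionary keys only mirror the column names listed above; the `summarize_session` helper, the sample events and the choice of seconds as the unit for the total time are assumptions for illustration, not the product's actual output format.

```python
from datetime import datetime

def summarize_session(events):
    """Summarize a list of (timestamp, page) events for one reference ID."""
    # Order the events by timestamp, as Sequence Coding orders transactions.
    ordered = sorted(events, key=lambda e: e[0])
    start, end = ordered[0][0], ordered[-1][0]
    return {
        "start_date": start,
        "end_date": end,
        # Unit is an assumption: elapsed seconds between first and last event.
        "total_time_seconds": (end - start).total_seconds(),
        "number_events": len(ordered),
        "first_state": ordered[0][1],
        "last_state": ordered[-1][1],
    }

# Hypothetical session, deliberately out of order.
session = [
    (datetime(2012, 5, 15, 12, 0, 0), "home"),
    (datetime(2012, 5, 15, 12, 5, 0), "cart"),
    (datetime(2012, 5, 15, 12, 2, 0), "search"),
]

summary = summarize_session(session)
```

Sorting first matters: the first and last states are defined by event time, not by the order of rows in the log.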
36. t the Filtering slider at 90%, it means that the total number of transactions, when adding all the categories assigned to separate columns, must not exceed 90% of the total number of transactions. The categories that make up the remaining 10% of the transactions will be grouped under KxOther.

You can also define a threshold so that transitions for which the duration between the two events is higher than the threshold are ignored in the transition count.

To Set a Threshold
1. Check the box Filter Transitions greater than.
   (Screenshot: Modelling Assistant, New Model with Sequence Analysis, advanced parameters panel.)
2. In the number field, enter the number of units defining the threshold.
3. In the drop-down list, select the unit to be used to define the threshold.

For this Scenario
For the sample data, each row of the transaction log represents an HTML file requested by the visitor's browser. There are 10184 different files that are requested during the day. However, by positioning the Filtering slider at 75%, only 99 files are retained for separate count columns, and the rows with the remaining 10085 files are grouped into the KxOther count. This means that the 99 most common files make up 75% of the log and the remaining 10085 files make up only 25% of the log.

Selecting Sequence Coding Statistics

The screen Sequence Analysis Variables Selection for Functions lets you specify the type of Statistics you w
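The effect of the Filtering slider can be approximated with the following Python sketch. It assumes a simple greedy rule (keep the most frequent categories while their cumulative share stays within the chosen percentage); the actual selection rule used by Sequence Coding is not documented here, and the helper names `split_by_coverage` and `regroup` are hypothetical.

```python
from collections import Counter

def split_by_coverage(categories, keep_fraction):
    """Return the most frequent categories whose cumulative count does not
    exceed keep_fraction of all transactions (assumed greedy rule)."""
    counts = Counter(categories)
    total = sum(counts.values())
    kept, covered = [], 0
    for category, n in counts.most_common():
        if covered + n > keep_fraction * total:
            break  # adding this category would exceed the allowed share
        kept.append(category)
        covered += n
    return kept

def regroup(categories, kept, other_label="KxOther"):
    """Replace every category that was not kept with the grouping label."""
    kept_set = set(kept)
    return [c if c in kept_set else other_label for c in categories]

# Hypothetical log of 10 page hits: a x5, b x3, c x1, d x1.
pages = ["a"] * 5 + ["b"] * 3 + ["c"] + ["d"]

kept = split_by_coverage(pages, 0.8)   # a and b cover 8 of the 10 hits
regrouped = regroup(pages, kept)       # c and d are grouped as KxOther
```

In the scenario above, the same idea applied at 75% keeps 99 files as separate columns and sends the remaining 10085 files to KxOther.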
37. to Understand Purchase Behavior Using File Counts, Step 2 - Defining the Modeling Parameters

- Join your reference data with your transaction data
- Calculate the intermediate sequences
- Filter your events by period

For this Scenario
- Select the SessionID column as the join column for both the log and reference data sets.
- Select Time as the Log Date Column.
- In the advanced parameters, keep 75% of the hits.
- Select Infinite as the Time Window.

To Set the Parameters
1. On the screen Sequence Analysis Parameters Settings, select the join column for both the log and reference data sets.
2. Select the Log Date Column.
   (Screenshot: Sequence Analysis Parameters Settings screen, with Columns for Join set to SessionID for both the Events Data Set and the Reference Data Source, Events Date Column set to Time, the Intermediate Sequences option, and the Time Window options Infinite, Fixed, Between two date columns and Relative to a date column.)
3. Click the Advanced button to set the advanced parameters.
4. In the Advanced panel, slide the filter to 75%.
   (Screenshot: Advanced Sequence Analysis Parameters panel.)
38. (Screenshot: data preview; content not recoverable.)

2. In the field First Row Index, enter the number of the first row you want to display.
3. In the field Last Row Index, enter the number of the last row you want to display.
4. Click the Refresh button to see the selected rows.

A Comment about Database Keys

For Sequence Coding to be able to join the Reference and Transaction data sets, the Reference data set to be analyzed must contain a single variable that serves as a unique key variable.

To Specify that a Variable is a Key
1. In the Key column, click the box corresponding to the row of the key variable.
2. Type in the value "1" to define this as a key variable.
   (Screenshot: Data Description screen with the Add Filter in Data Set option.)

For this Scenario
Use the file session purchase desc csv as the description file.

To Describe the Data
1. On the screen Data Description, click the button Open Description. The following window opens.
   (Screenshot: KXEN Modelling Assistant, New Model with Sequence Analysis, with the buttons Analyze, Open Description and Save Description.)
2. In the window Load a Description, select the type of your description file.
3. In the Folder field, select the folder where the description file is located with the Browse button
39. ve kept 98 state columns for the Page variable.

Selecting Variables

For this Scenario
- Use the session continue skip csv file to select the variables to exclude.
- Use KSC Session continue as the target and remove Purchase from the targets.

Note: To know how to select variables, go to section Selecting Variables (see "For this Scenario" on page 26) in Scenario 1.

Step 3 - Generating and Validating the Model

Generating the Model

Once the modeling parameters are defined, you can generate the model. Then you must validate its performance using the quality indicator KI and the robustness indicator KR.
- If the model is sufficiently powerful, you can analyze the responses that it provides in relation to your business issue.
- Otherwise, you can modify the modeling parameters in such a way that they are better suited to your data set and your business issue, and then generate new, more powerful models.

To Generate the Model
On the screen Summary of Modelling Parameters, click the Generate button.
The screen Training the Model appears and the model is generated. A progress bar allows you to follow the process.

(Screenshot: KXEN Modelling Assistant, Training the Model screen, with the messages "Beginning of learning for Default" and "Building transition vectors" and the button Stop Current Task.)

Validating the Model

Once the
    