Home
Performance Analysis of Business Processes through Process Mining
Contents
1. Strict equivalent Show diagram Fitter options Zoom 86 00 Figure G 8 Screenshot Sequence diagram 5 User selects an equivalence type At Pattern type choose between flexible equivalent and strict equivalent In this example we will use flexible equivalent Press the Show diagrams button and select the Pattern diagram tab at the top of the screen 6 Plug in shows the pattern diagram and the frequency of each pattern The first pattern is the most frequent one The window will now look similar to the one depicted in figure G 9 In the centre of the window the pattern diagram is shown The top pattern is the most frequent one In the example in figure the top pattern has a frequency of 153 i e there exist 153 sequences in the sequence diagram that belong to this pattern G 2 3 Find longest pattern The first five steps of this use case are equal to the first five steps of the previous use case At that point the window will look similar to the one depicted in figure 6 Plug in shows the pattern diagram and a list of the statistical values of the throughput time of each pattern On the right side of the window KPIs are displayed including the statistical values of the throughput time The pattern with the highest average throughput time is the longest pattern 84 Appendix G Test cases 6 Analysis Performance Seque
2. Tena ar Thegin ar 4 3 2 Deriving Pattern Diagrams To diminish the load of information that the user has to search through pattern diagrams should be available to the user In pattern diagrams sequences that follow similar behavior are represented by one pattern To be able to group sequences in this way however it first has to be clear how sequences can be distinguished from other sequences Comparing two sequences starts with checking whether the number of blocks in the two sequences is equal and if the number of arrows in the sequences is equal as well When this is the case it should be possible to compare each block in the one sequence with a block in the other sequence and an arrow in the one sequence with an arrow in the other sequence That is why the blocks and the arrows in these sequences should be sorted For each sequence so blocks should be sorted first alphabetically on the name of the data element instance that it is part of If there is more than one block active in the same data element instance the blocks should further be sorted ascendingly on their begin timestamp Since there are no two blocks active in the same data element instance at the same time in the same sequence the blocks do not have to be sorted any further By sorting each block b B gets a unique index When comparing two sequences each block in the one sequence should be compared against the block in the other sequence that has the same inde
3. we have tried to answer these questions The first research question was defined as Which process mining functionality is available in commercially developed process monitoring tools and which parts of this functionality would be enhancements to the ProM framework In order to answer this research question tools were investigated that are among the market leaders in the field of process performance analysis Aris Process Performance Manager Business Objects HP Business Process Cockpit Hyperion System 9 and ILOG JViews An overview of the most important functionality present in these tools was presented in section 3 2 of this thesis Next to the investigation of these commercial tools our research aimed at answering the second research question which was defined as Since we have a process model at our disposal within the ProM framework which additional process mining functionality would qualify for implementation into the ProM framework Although this question left room for investigation of functionality in other process mining areas our main focus here was also on the analysis of the performance of business processes The reason for this was that the need for performance analysis functionality in the ProM framework at that moment was particularly high During research we detected that the second question partly overlaps with the first since some commercial process monitoring tools such as ILOG JViews do make use of process mod
4. MMmersnsncs OSESWM393042 Process instances contained in 7ESWM392 7ESWM392 BESWM392 O8ESWWM393043 Sequences with throughput time 0 0 seconds OS5308ESWM393049 i BESWM393 1 1 2ESWM394 2ESWM394 05308ESWM393052 O T Patterns with mean throughput time above v lo 0 19 process instances selected 05312ESWM394702 530 530 262 530 906 530 46 05308ESWM393049 OPattens0 to 2 530 052 531 58 531 763 531 702 531 Figure H 9 Screenshot of filter options screen e All Process instances Selects all the process instances The following option is available after the Show Diagram button has been clicked at least once e Sequences with a certain throughput time Selects all process instances that have a throughput time above below the filled in bound value The following options are available after the pattern diagram has been drawn at least once e All Patterns Selects all process instances that match one of the patterns from the pattern diagram e Interval of patterns Selects all process instances that match one of the patterns in the interval e Patterns with a certain mean throughput time Selects all process instances that match one of the patterns of which the mean throughput time lies above below the filled in bound value After you have selected the process instances that you wish to use you can click the Use selected instances button after which the sequence diagram and
5. 1 00 00 Calculate number of needed goods James Production 1 2006 1 00 00 Request purchase of goods James Production 1 1 2006 1 25 00 Order goods Fred Purchase 1 2006 1 25 00 Deliver goods Fred Purchase 1 1 2006 2 20 00 Create product James Production 1 1 2006 2 20 00 Finish product James Production 1 2006 3 05 00 Sell product Nick Sales 1 1 2006 3 05 00 File order Nick Sales 1 2006 4 03 00 Case 1 TaskID Originator Department Timestamp Order goods David Purchase 1 1 2006 3 00 00 Deliver goods David Purchase 1 1 2006 3 51 00 Create product John Production 1 1 2006 3 51 00 Finish product John Production 1 1 2006 4 57 00 Use product David Purchase 1 1 2006 4 57 00 Request product sell David Purchase 1 1 2006 5 30 00 Sell product Nick Sales 1 1 2006 5 30 00 File order Nick Sales 1 1 2006 6 22 00 Case 2 TaskID Originator Department Timestamp Receive Order Jim Sales 1 2006 4 17 00 Send Production Request Jim Sales 1 1 2006 5 01 00 Calculate number of needed goods James Production 1 2006 5 01 00 Request purchase of goods James Production 1 2006 5 44 00 Order goods Fred Purchase 1 1 2006 5 44 00 Deliver goods Fred Purchase 1 2006 6 51 00 Create product John Production 1 1 2006 6 51 00 Finish product John Production 1 2006 7 42 00 Sell product Jim Sales 1 1 2006 7 42 00 File Order Jim Sales 1 2006 8 50 00 Table 3 1 Example event log however since it starts with the purchase d
6. A log in which events are registered is accessable through the ProM framework Main Path 1 User starts the plug in 2 Plug in provides a list of the available data elements that appear on the audit trail level in the log and requests the user to choose one 3 User selects a data element 4 Plug in asks which equivalence type to use flexible equivalence or strict equivalence 5 User selects an equivalence type 6 Plug in shows the pattern diagram and the frequency of each pattern The first pattern is the most frequent one Postcondition Plug in shows the pattern diagram and the frequency of each pattern The pattern at the top of the diagram is the most frequent one When there are a number say x most frequent patterns then of course the first x patterns in the diagram are the most frequent ones E 2 3 Find longest pattern Brief Description The longest pattern is the pattern with the highest average throughput time Preconditions A log in which events are registered is accessable through the ProM framework Main Path 1 User starts the plug in 2 Plug in provides a list of the available data elements that appear on the audit trail level in the log and requests the user to choose one 3 User selects a data element 4 Plug in asks which equivalence type to use flexible equivalence or strict equivalence 5 User selects an equivalence type 6 Plug in shows the pattern diagram and a list of the statistical
7. At that point the advanced settings are shown as displayed in figure G 2 3 User sets non conformance option Select the non conformance option for Time between two transitions In this test case we will set the option to All measurements taken The calculations of this time will then 79 Appendix G Test cases be based on the time between the two transitions of all cases fitting or not during replay of which the transitions both fire at least once Press the Apply changes button to proceed When the plug in returns to the general options window similar to figure G 1h press the Start analysis button 4 Plug in provides a way to select two transitions A window similar to the one displayed in figure G 3 appears Transitions can be selected in the Petri net or in the selectionboxes on the lower right side of the window 5 User selects the transitions s he is interested in Select the two transitions within the Petri net or in the selectionboxes In our example we have chosen transition Az complete and transiton Tireopel start 6 Plug in provides a way to select cases On the left side of the window the identifier names of the cases can be de selected in a table 7 User selects a case Select the name of the case in which you are interested in our example case 1001 and press the Update button below the table 8 The plug in shows the time between the two selected transitions for the selected case The time
8. Number fitting 336 cases Arrival rate v f 4922 44896 cases per day wA i Knead2 a ___Throughput time days complete complete w 10 2013 0 0 0 50625 Spit 10 12439 complete Kneadt fast 25 00 0 04818 complete Slow 25 00 0 36758 normal 50 0 10 19472 Ratin_cleaninge o px b gt i noe 000 for 0200 103 ooo og W050 N06 o7 og ee Hog T W100 SSS mae CE mao g Change Export Percentages Time Metrics 4 I gt Selected zi Figure H 3 Screenshot of the Performance Analysis with Petri net plug in a visualization of the used Petri net is displayed at the center of the screen Within the shown Petri net at XOR splits i e places with more than one outgoing arc the probability that a token is consumed by the transition to which the arc leads is shown At the XOR split in the example above you can see that the probability that a token is consumed by transition Raising Cleaning3 Start is 0 51 and the probability that a token is consumed by Raising Cleaning2 Start is 0 49 Furthermore places in the Petri net are colored according to their average waiting time This should make it easier for you to spot the location of possible bottlenecks within your process Below the Petri net a legend is shown in which you can see which color belongs to which waiting time level In the example the purple colored places are said to have a high average waiting time
9. Waiting time Synchronization time Waiting time the statistical values of the time that passes from the full en abling of a transition until its firing i e time that tokens spend in the place waiting for a transition to which the place is an input place to fire and consume them Synchronization time the statistical values of the time that passes from the partial enabling of a transition i e at least one input place marked until full enabling i e all input places are marked Time that tokens spend in a place waiting for the transition to which this place is an input place to be fully enabled Probability at each outgoing arc of XOR splits The probability that tokens that are in the XOR split i e place with multiple outgoing arcs are consumed by the transition that is connected to the place by the arc 87 Appendix H User manuals Two transition metrics For each two visible transitions in the Petri net the following metrics are available e Frequency the number of process instances in which both transitions fire at least once e Time in between the statistical values of the absolute time between the first fir ing of the one transition and the first firing of the other transition during replay of cases Activity metrics Often transitions are part of an activity i e a task For instance an activity clean can be represented by the transitions clean schedule clean start and clean
10. is then visible again For this example we will use the standard general settings and auto bottleneck settings Press the Start analysis button to continue 4 Plug in provides a way to select activities When the analysis i e the log replay method has completed a window similar to the one depicted in figure will appear An activity can be selected here by selecting one of the transitions that is related to an event class that refers to the activity A transition can be selected by selecting its name in the upper selectionbox on the lower right side of the window Another way to select a transition is by clicking on it in the Petri net visualization in the center of the window 5 User selects the activity s he is interested in Select one of the transitions that is related to the activity In our example we select 78 Appendix G Test cases the Az complete transition Plug in displays the activity related KPIs of the selected activity Below the Petri net the activity related KPIs of the selected activity will be displayed In the example in figure the KPIs related to activity Ax are displayed In this example bound values are given for the statistical values of the sojourn and the wait ing time of this activity bound values are colored red The statistical values of the execution time of activity Ax are exact values alysis Performance Analysis with Petri net Log Traces I gt Process informati
11. 6 00 7 00 8 00 Figure 3 7 Sequence diagram of communication between departments Sales Production Purchase Department Department Department Pattern 0 2X Pattern 1 1X Figure 3 8 Sequence diagram of communication patterns between departments 20 Chapter 8 Process Performance Analysis 3 4 2 Process model related functionality When we have a process model at our disposal there is some additional functionality that can be useful next to the functionality already discussed One of these additions involves XOR splits A XOR split in a process model is a decision point At such a location it is decided if a certain task if going to be performed on a case or another task but never both Often choice at these XOR splits is not done randomly but is dependent on certain case properties Suppose for instance that the cases in the business process under investigation are in fact orders A case property can then be the size of the order the client who made the order or the employee that approved the order At a certain XOR split choice can be based on such case properties It is for instance possible that orders bigger than a thousand pieces follow a different routing within a process than orders smaller than a thousand pieces For those event logs in which such case properties are registered it may be possible to determine the case properties that are responsible for choice at each XOR split When the case properties on w
12. In a Process Instance a number of events is described in Audit TrailEntry elements An AuditTrailEntry consists of a WorkflowModelElement an activity or task ID an EventType a optional Timestamp and an optional Originator a person performing the event Furthermore additional information about the event may be stored in Data elements There can exist various possible EventTypes in an event log In figure b the different event types that can occur for an activity are shown as well as the possible orderings in which they can appear for one activity An activity starts by being scheduled or by being skipped automatically autoskip When an activity has been scheduled the control over that activity is put into the information system which can then assign the activity to someone or to a group of persons It is possible to reassign an assigned activity which can be done by the information system or by a user A user can then start working on an activity that was assigned to him or can decide to withdraw the activity or skip it manualskip This can be done before the activity was assigned even A person that is working on an activity can suspend and resume it several times but in the end he either has to complete or abort ate_abort it An activity can get aborted pi_abort during its entire life cycle Some information systems do not keep track of the type of event that occurred though In such cases each event is said to be of type
13. Perform surgery Luigi 07 08 2006 09 15 13 4 Make X ray Kevin 07 08 2006 09 17 39 5 Examine blood James 07 08 2006 09 17 52 3 Examine blood James 07 08 2006 09 21 24 4 Nurse patient Kate 07 08 2006 09 25 07 2 Nurse patient Kate 07 08 2006 09 27 21 5 Make X ray Kevin 07 08 2006 09 28 42 1 Nurse patient Kate 07 08 2006 09 28 49 5 Nurse patient Kate 07 08 2006 09 30 01 3 Nurse patient Kate 07 08 2006 09 32 44 Table 2 1 Example of an event log Chapter 2 Process Mining Examine Patient Nurse Patient Figure 2 1 The derived process model in Petri net notation Deriving process models from event logs is not the only possible form of process mining There is much more information related to processes that can be obtained from event logs Other possible forms of process mining include conformance testing described in I7 social network analysis described in 22 and many more Conformance testing is a method which checks whether and to what extent activities actually execute as defined in a process model Social network analysis focuses on relationships between performers of activities in processes At 13 more information about the various forms of process mining can be found 2 2 Requirements To be able to perform process mining on event logs some requirements need to be put on the event logs that serve as input The following requirements are mandatory and are assumed to be met
14. a process model from an event log by using a mining technique the a algorithm as presented in The process model that results from this mining effort is a workflow net A workflow net is a Petri net which has exactly one start place and one end place where every other place or transition in the net is on a directed path from the start place to the end place 20 As noted in section in event logs different types of events can occur In workflow nets resulting from mining such event logs tasks are often not represented by one transition but by multiple transitions with places in between One transition then represents one event of the activity e g a start or a complete event In the example in figure for instance task D has been split up in transitions D Schedule and D Complete with a place in between these transitions EMiT uses the mined workflow net as an intuitive interface through which the values of KPIs can be accessed but also as a means to derive the values of these KPIs EMiT derives these values by simulating the cases that exist in the event log within the workflow net During simulation of a case the case is represented by tokens in the Petri net Each event that refers to the case is then considered in chronological order When an event that refers to the case is considered the transition to which the event corresponds is fired As argued in 23 firing of transitions in mined workflow nets is atomic because each event has a
15. and Implementation Furthermore we discovered that the log replay method in some exceptional situations may cause a transition to fail when it should not Take for instance the Petri net in figure 5 4 in which trace abc is replayed Using the current log replay method first transition a fires and then b Transition c is not enabled after this and cannot be enabled by firing a sequence of invisible transitions so c fails execution However when after firing transition a the invisible transition t is fired before transition b is then after firing b c is enabled and can fire normally Note that this a very specific situation which does not appear very often in practice Still research should be done to find a solution for this invisible t Figure 5 4 Exception 5 3 Performance Sequence Diagram Analysis The purpose of the Performance Sequence Diagram analysis plug in is to provide the user with a means to assess the performance of business processes The main focus here is to assist in finding the causes of certain e g extreme or common behavior This is done by means of diagrams in which the hand over of work between data element instances is displayed A design of this plug in was made based on the requirements that were established in section Again design started with creating use cases described in appendix E 2 and prototypes described in appendix F 2 for the plug in In figure 5 5 the class diagram that res
16. and may thus be bottlenecks Left of the Petri net visualization a table is displayed that contains the identifier names of all process instances in the input log You can select or deselect these identifier names of process instances Initially all process instances are selected If you press the Update button after selecting process instances the displayed performance information will be based on the selected instances only If you press the Invert Selection button all the process instances that are currently not selected get selected and all process instances that are currently selected get deselected On the right hand side of the window the values process related metrics are displayed Note that not only the average throughput time is displayed but also the minimum the maximum 94 Appendix H User manuals the standard deviation and the average time of the fastest process instances initially 25 of the total number of selected process instances which have the lowest throughput time the average time of the slowest process instances also 25 initially and the average time of all other process instances obviously 50 initially You can adjust the percentages by pressing the Change Percentages button below the throughput time table and filling in the new percentages in the dialogs that will appear Next to this you also have the possibility to export the throughput times of all selected process instance
17. are said to be invisible Such transitions can for instance be used for routing purposes Events in class e in the log may correspond to firing any transition t in the net satisfying e t R When the required input is present log replay can start The log replay method simulates cases within the Petri net We refer to the set of all cases in the log as C Replaying a case c E C starts with setting the Petri net to its initial marking Mo After this the events that refer to case c are replayed in the Petri net in chronological order When an event of class e is replayed and e is related to at least one transition in the net then one of the related transitions in t T e t R should fire When a transition fires a token is removed from each of its input places and a token is added to each of its output places To determine which transition in t T e t R should fire we need to establish which transitions in this set can fire A transition t can only fire when it is enabled i e when there is a token in each of its input places However the Petri net may contain enabled invisible transitions which can fire in such a sequence that t becomes enabled when it was not before See for instance the Petri net in figure 4 1 where the current marking is M p3 p4 Suppose that 24 Chapter 4 Requirements the cl Figure 4 1 log replay example ass of the event that is currently replayed is only related to t
18. are used in calculation This is the standard setting for process metrics Place metrics can be based on e All measurements taken all measurements taken in the process instances that fit in the Petri net as well as the ones that do not are used when calculation the values of place metrics e All measurements taken in each trace before the trace fails all measurements which are taken when replaying process instances that fit in the Petri net are used as well as 91 Appendix H User manuals the measurements taken in the traces that do not completely fit in the Petri net but which are taken before the moment at which an event occurs in the log that does not fit in the Petri net i e when the transition corresponding to the event is not enabled One can expect these measurements taken before a failure occurs to not be influenced by the non conformance of the trace in which they were taken That is why this is the standard setting e All measurements taken in each trace where no related transition has failed execution In case a related transition a transition of which this place is an input or output place is of the place fails in a trace all measurements which are taken in this trace are not used in calculations e All measurements taken in fitting traces only those process instances that fit are used in calculation Time in between metrics can be based on e All measurements taken all measurements taken in the
19. at the top of figure 3 9 Dotted charts like this one can be used for purpose of analysis From the top chart of figure 3 9 one can for instance derive that task Buy always is completed before task Sell It becomes more interesting when you compare a dotted chart with other dotted charts that are derived from the same event log and with the same event type but with another data element selected See for instance the other dotted charts in figure From these charts one can derive that Mike always performs task Sell and that Jack and Mike both can handle task Buy but that Jack seems to be faster in completing it Furthermore one can derive that Jack and Mike both work at the Production department Dotted charts can help in finding the causes of certain performance problems e g all cases where Mike performs task Buy results in a high average completion time of that task Task Time e Buy Sell Originator TMe p Jack Mike Production Figure 3 9 examples of dotted charts 22 Chapter 4 Requirements 4 1 Introduction Although the second research question as defined in section 1 3 leaves room for functionality that supports other forms of process mining the main focus of this graduation project has been on functionality that supports Process Performance Analysis PPA As seen in the previous chapter there is plenty of PPA functionality available that could be great additions to the ProM framework If all this functionality w
20. bo gt Tena b1 gt Tena sim bo sor gt Teng sim b1 sor Flexible equivalence instead adds the requirement that for each block b in sequence So that begins after block bo ends its peer in sg sim b1 8g should begin after or at the same time the peer of bg sim bo So ends Or formally c Vbo b1 Bo Thegin b1 gt Tena bo gt Trbegin sim b1 So Tena sim bo Sq Equivalent arrows should exist This again is dependent on the type of equivalence NS iii used At least the source block of each arrow in the one sequence must be equal to i e have the same index as the source block of the arrow with the same index in the other sequence and the same for the destination block The following function evaluates to true if this requirement is met a ar ARs VarcARs bsre ar sim Osre sim ar 85 So A baest ar sim baest sim ar so So This is the only requirement placed on arrows for flexible equivalence Strict equivalence also requires arrows to start and end in exactly the same order The following function evaluates to true if this is the case b Varo ar ARso Toegin aro gt Toegin ar1 Thegin sim aro So Thegin sim ari So A Tenalaro gt Tenalar gt Tenalsim aro so gt Tenalsim ar so In the example in figure 4 3 the requirements i iia iic and iiia are met but require ments iib and iiib are not The sequences are thus fle
21. comma separated text file Arrival rate of cases The arrival rate of cases is equal to the number of cases that arrive to the process per time unit Given replay output L and case c the arrival time T c of case c is MIN wer A case x c time x The average arrival rate of cases to the process then is c MAXcec T c MINeec T c Number of cases The number of cases in the log should also be available to the user This number is equal to FEC Number of fitting cases To give the user an idea about the conformance between the Petri net and the event log the number of cases that fit in the Petri net should be given as well The fitting cases are those cases in which no transition fails execution during log replay The number of fitting cases can be defined as cec Apex case x cA failed x true For more detailed information about the conformance between the used Petri net and event log the Conformance Checker plug in of the framework should be used 4 2 3 Place related KPIs Firing of transitions during log replay is atomic since events have a timestamp and not a duration attached to them Therefore we can say that the tokens that represent a case during log replay spend time in places In this section the KPIs that are related to places will be discussed Since the timestamp at which transitions fire is known also the creation timestamp of each token that is added to an output place of the tr
22. complete In such case activity metrics can be derived These are e Arrival rate rate at which work items arrive to the activity e Activity time metrics Waiting time the statistical values of the time between the moment at which the activity is scheduled and the moment at which execution of the activity is started Statistical values of the time between schedule and start events of the activity Execution time the statistical values of the time in which the activity is actually executed Statistical values of the time between start and complete events of the activity without the time the activity was suspended in between the start and complete Sojourn time the statistical values of the time between the moment at which the activity is scheduled and the moment at which it finishes execution Statistical values of the time between schedule and complete events of the activity The exact values of activity time metrics waiting execution and sojourn time can only be calculated when in the used log and Petri net there are schedule start and complete events present for all activities However when you are using a log and a Petri net which contain less information upper bounds of the values of activity time metrics are calculated and displayed instead This can be done because the process model that we have at our disposal defines how activities are interrelated For instance suppose the log and the Petri net only
23. possible URCAR 16 The user can view the sojourn time of each activity that 1 appears in the log If no exact values are recorded then bound values will be displayed instead if possible URCAR 17 The user can view for each time related KPIs e g throughput 1 time of the process waiting time of a place the average minimum maximum and standard deviation as well as the average of the xr of measurements that are counted as fast i e have a low value the average of the y of measurements that are counted as being slow i e have a high value and the average of the 100 x y of measurements that are counted as being normal URCAR 18 The user can adjust the percentage of measurements that are 2 counted as fast and the percentage of measurements that are counted as slow for each time related KPI URCAR 19 The user can change the classifications which define at which 1 average waiting time a place is considered to be a bottleneck URCAR 20 The user can view the routing probabilities per branch at each 1 XOR split in the Petri net URCAR 21 The user can select process instances 1 URCAR 22 The user can view the values of KPIs that are based on only 1 the selected process instances URCAR 23 The user can export the values of time related KPIs to comma 3 delimited text files Performance Sequence diagram functionality F Number Requirement Priority URCAR 24 The user can view the communication behavior between 1 instances
24. this option Read the help file For more detail C Use time saving option Apply changes Figure G 2 Screenshot Advanced settings window Here you can specify how the plug in should deal with measurements that are taken in those cases that do not fit in the Petri net i e where at least one transition fails during log replay Instead of one general option the plug in has a different option for each type of KPI where the types of KPIs are process related place related activity related and time between two transitions Since bottlenecks are places we focus on the settings for place related KPIs For this test case we select the following option All measurements taken in each trace before the trace fails 76 Appendix G Test cases The idea behind this option is that when a case does not fit in the Petri net all measurements that are taken during replay of the case before a deviation i e a failing transition occurs are not influenced by this deviation Those measurements can then be used normally in calculations Take for instance the average waiting time of a place This time is calculated based on the waiting times of the tokens that visit the place during log replay When during replay of a case a token is added to and removed from the place before a transition fails execution then the waiting time of that token can be included in the calculation of the average waiting time If a to
25. two sequences When using the strict equivalent type the sequences in the example above are considered to be different because in the left sequence block b0 starts and ends before block c0 and in the right sequence block c0 starts and ends before block 60 When using the flexible equivalent type though they are considered to be equal because although blocks b0 and c0 do not start in the same order the flexible equivalent requirement is met For both sequences we have the following After the end of block a0 blocks 60 c0 d0 and al start After the end of block 60 blocks d0 and al start After the end of block c0 blocks d0 and al start After the end of block d0 block al starts No block starts after block al ends When you have selected the settings that you wish to use you can click on the Show Diagram button The sequence diagram will then be displayed See for instance the example below in figure H 7p When you move your mouse cursor over a sequence information about the sequence will be displayed in the tooltip In the example the mouse is above two overlapping sequences Information about both these sequences is displayed This information encompasses 99 Appendix H User manuals E Analysis Performance Sequency Diagram Analysis Full diagram Pattern diagram AGO2 Voorbereiden AGO8 GBA afnemer AG14 Controle Thu Now 03 12 52 40 CET 2005 E N Process instance 05307ESWM392262 Thr
26. values should be calculated for the waiting time of a based on the waiting times of all registered executions of a Next to these values the number of waiting time measurements for activity a should also be available to the user Execution time of activity a The time between start and complete events of activity a without the time that a is suspended in between i e without the time between the suspend and resume events of a that occurred in between For each activity a the execution time of all registered executions of a should be calculated This is done as follows Each element x L where act x a and type x start is compared with the nearest following element x L for which act x a and type x complete When in list L in between x and g no element x L is registered for which holds act x a and type x start then the execution time with suspended time of the execution is defined to be time x time x The total suspended time in between x and 2 Tsusp x x needs to be deducted from this All elements y L located in between x and 2 in the list for which act y a and type y suspend holds should be compared against the nearest following y L registered before x in L for which act y a and type y resume holds When there is no y L located between y and y for which act y a and type y suspend holds then Tsysp z x should be
27. we have the log as displayed in table In cases 2 3 4 and 5 of this log the tasks examine patient examine blood make X ray and nurse patient are executed In case 1 the tasks examine patient perform surgery and nurse patient are executed Clearly all cases in the log begin with task eramine patient and end with task nurse patient These two tasks are thus the start and the finish of the process In between these two tasks the task perform surgery can be executed or the tasks make X ray and examine blood can be executed in arbitrary order Assuming that the log consists of completed cases only and that the cases are representative and a sufficiently large subset of behaviors is observed the process model as shown in figure 2 1 can be derived from this log Since the tasks examine blood and make X ray can occur in any order they are modeled as parallel tasks The shown process model is depicted in Petri net notation We will not explain the syntax of Petri nets here see 2 12 16 for a thorough description CaseID TaskID Originator Date 4 Examine patient Paul 07 08 2006 09 03 44 2 Examine patient Steve 07 08 2006 09 05 12 3 Examine patient Paul 07 08 2006 09 06 03 4 Examine blood Nick 07 08 2006 09 06 45 1 Examine patient Luigi 07 08 2006 9 07 33 5 Examine patient Steve 07 08 2006 09 08 12 3 Make X ray Kevin 07 08 2006 09 09 59 2 Make X ray Kevin 07 08 2006 09 13 14 2 Examine blood Nick 07 08 2006 09 15 11 1
28. Petri net Figure 5 2 Class diagram Results of performance log replay Based on the design the plug in was implemented and added to the ProM framework after thorough testing In figure 5 3 the main graphical user interface of the plug in is shown In the example shown here a place has been selected and the KPIs related to this place are depicted below the Petri net More details on how to use the plug in can be found in the user manual as presented in appendix H 1 During the tests the use cases that were defined in E 1 were performed with the plug in A description of these test cases can be found in appendix G 1 The test cases were performed with various logs and Petri nets as input In general the plug in performed well during these tests and results where obtained within reasonable time However in some situations problems arose In the cases where the input Petri net was large and contained many invisible transitions the log replay method took much time Research learned that the building of coverability graphs during log replay was the main cause of this Coverability graphs are built during log replay to obtain the shortest sequence of invisible transitions at every time the transition to fire is not immediately enabled The building of a coverability graph takes much time in those cases where many markings can be reached by firing invisible transitions This typically appears in large Petri nets that contain many invisible tran
29. R 2 The user can select a place in the Petri net 1 URCAR 3 The user can select up to two transitions in the Petri net 1 URCAR 4 The user can view the throughput time of the process 1 URCAR 5 The user can view the arrival rate of the process 1 URCAR 6 The user can view the total number of cases in the event log 1 URCAR 7 The user can view the waiting time of each place in the Petri 1 net URCAR 8 The user can view the synchronization time of each place in 1 the Petri net URCAR 9 The user can view the sojourn time of each place in the Petri 1 net URCAR 10 The user can view the total number of measurements taken for 1 each place URCAR 11 The user can view the arrival rate of each place 1 URCAR 12 The user can see in the Petri net which places are considered 1 to be bottlenecks 1This functionality is described in more detail in section 4 2 60 Appendix D User Requirements Number Requirement Priority URCAR 13 The user can view the the time it takes cases to get from one 1 selected transition to another URCAR 14 The user can view the waiting time of each activity that 1 appears in the log If no exact values are recorded then bound values will be displayed instead if possible URCAR 15 The user can view the execution time of each activity that 1 appears in the log If no exact values are recorded then bound values will be displayed instead if
30. TECHNISCHE UNIVERSITEIT EINDHOVEN Department of Mathematics and Computer Science Performance Analysis of Business Processes through Process Mining P T G Hornix Supervisors Dr A J M M Weijters TM IS Prof Dr P M E De Bra W amp I DH Dr M Voorhoeve W amp I AIS Eindhoven January 2007 Abstract Process Mining or workflow mining is an important research area of the section Information Systems IS TM of the department of Technology Management at Eindhoven University of Technology The starting point of process mining is registered information about tasks that are performed in a process handling of an insurance claim treatments of a patient in a hospital etc This registered information is typically called an event log The goal of process mining is deducing knowledge e g a process model average throughput time from such event logs The knowledge obtained this way can increase understanding of the processes within an organization and can help with redesigning them if needed Within the section IS TM a process mining framework has been developed called the ProM framework So far emphasis within this framework has been on the discovery and validation of process models workflows Herein the ProM framework is practically unique The analysis functionality that is currently present in the framework mainly consists of qualitative analysis Main concern of qualitative analysis is the logical correctness of proce
31. age waiting time measured against dimensions originator and task Process monitoring tools often start by displaying average values of KPIs e g average process cycle time average number of cases in the process but they also allow the user to view the statistical distribution of these KPIs An example of a statistical distribution is shown in figure 3 3 Using such statistical distributions the user can easily see how many and in some tools also which cases result in an extreme value for the KPI O5 640 11 15 16 20 21 25 26 30 3 35 36 40 41 45 46 50 Waiting time category Figure 3 3 Statistical distribution of the waiting time In Aris PPM the user has the possibility to select a group from such a statistical distribution After selection of a group the routings that the selected cases i e cases that belong to the selected group follow within the process definition of the business process can be displayed 12 Chapter 8 Process Performance Analysis Furthermore the user can compare these routings with the routings that other cases follow This can be useful for detecting whether odd behavior has occurred during processing of the selected cases e g a task is performed a number of times during processing of the selected cases Let us consider an example that illustrates how the described functionality can assist the user in finding the causes of extreme values for a KPI Suppose the user wants more details about the hig
32. ailable e Time in arrow ar average time between the begin and end of arrow ar Tend ar E Theginlar 40 Chapter 5 Design and Implementation 5 1 Introduction In this chapter the plug ins that have been implemented during this graduation project and their design will be discussed This graduation project has resulted in the implementation of two separate analysis plug ins which have been added to the ProM framework 1 Performance Analysis with Petri net described in section 5 2 2 Performance Sequence Diagram Analysis described in section 5 3 5 2 Performance Analysis with Petri net The purpose of the Performance Analysis with Petri net plug in is to provide the user with a means to assess the performance of business processes Main focus here is to provide the user with the values of key performance indicators in an intuitive manner A design of the plug in was made based on the requirements that were established in section Design of the plug in started with setting up use cases described in appendix and creating prototypes described in appendix F 1 After this UML class diagrams were designed that describe the structure and interfaces of the plug in In class diagrams the classes with their most important methods and the relationships between these classes are depicted Figures and depict the class diagrams of the Performance analysis with Petri net plug in The classes located in the lower part of the diagram
33. ansition is known For each token that is consumed by the firing transition the consumption timestamp is also known However when an input place of a firing transition contains multiple tokens then we cannot tell which token is removed exactly A pseudo random function with equal weight is used to select a token from each input place Note that this may affect the KPIs defined below Sojourn time of place p The sojourn time of a token in place p is equal to the total time a token spends in p i e the time between creation and consumption of the token Statistical values should be calculated for the sojourn time of place p based on the sojourn times of all tokens that are added to place p during log replay This calculation can be done in a similar manner as for the throughput 27 Chapter 4 Requirements time of the process Note that all statistical values besides the average sojourn time of p may be influenced by the random choice when p contains multiple tokens at firing of a transition This also holds for the synchronization and waiting time of p in which the sojourn time can be divided Synchronization time of place p The synchronization time of a token is the time between the moment at which the token is created and the moment at which the transition that consumes the token is fully enabled ie there is a token in each input place Statistical values should be calculated for the synchronization time of place p based on the s
34. as UI easily relate this model to the process and the placing of performance information herein will thus make sense to the user Another advantage is that bottlenecks can be detected at a glance Next to this ProM its greatest strength is deriving process models from event logs Making use of the process model as UI thus fits within the concept of ProM Petri nets are chosen because all other process models can be converted to Petri nets in ProM Time in This functionality increases the flexibility of the performance between two analysis because it can be used for the user to derive the values transitions of additional time related performance indicators 57 Appendix C Motivations Functionality Motivation Graphs and Dimensions Statistical distributions of KPIs Routings Views Displaying case properties responsible for choice at XOR splits Reporting and exporting functionality Graphs and tables increase insight into the performance of processes and the causes of extreme values in KPIs This functionality can become very complex however and would probably require a huge effort Other functionality that can assist in cause detection will be implemented instead Increases insight into how many and which process instances cause extreme behavior We will only state the average minimum maximum and standard deviation of time related KPIs and the averages of the fastest slowest and normal
35. as the same index as b Then di is of course equal to digim p s Each block b Bp does not have a begin or end timestamp associated with it but an average begin and end time The average begin time is the average time between the begin timestamp of a sequence and the begin timestamp of the block sim b s in the sequence For each b Bp we define Thegin b AVGses Thegin sim b s MI Nees Thegin b Tena b AVGses Tena sim d s MINoes Thegin b AR C Bp x Bp a set of arrows that connect the blocks and represent the transfer of work AR contains the same arrows as each of the sequences in Sp Each arrow ar AR has an arrow sim ar s in each sequence s Sp that has the same index as ar and the same source block sim bsrc ar s bsre sim ar s and destination block sim bgest ar s bdest sim ar s Just like with blocks the average time from the begin timestamp of each sequence is calculated for the begin and end of each arrow ar ARp Thegin ar AVGscs Thegin sim ar s MI Npep Thegin Tenalar AV Gses Tena sim ar s MI Noes Thegin The patterns obtained this way should be placed in a pattern diagram In such a diagram each pattern that occurs in the input log is displayed separately In the diagram the patterns are sorted descendingly based on the frequency The frequency of a pattern p is the number of sequences that it contains s Definit
36. be estimated by the plug in and standard coloring will be used Each level will then contain approximately one third of the places 4 Advanced Settings From the options screen you have the possibility to access the advanced settings by pressing the corresponding button In the window as depicted in figure that pops up you can specify which conformance settings you wish to use and if you wish to use the time saving option here Dealing with non conformance As mentioned when calculating the values of the KPIs all process instances in the input log are replayed in the input Petri net When doing this it can happen that a process instance does not fit in the Petri net i e it can happen that when a process instance is replayed an event occurs in the log at a moment at which the corresponding transition in the Petri net is not yet enabled Since process instances are replayed in the Petri net using the non blocking log replay method in such case the transition is artificially enabled by creating the missing 90 Appendix H User manuals E Advanced Performance Analysis options a Advanced Settings Values of various performance metrics can be influenced by non conformance These are the settings that are currently in use Process metrics are based on Place metrics are based on All measurements taken in each trace before the trace fails ly Times in between two transition are based on All measureme
37. between the two selected transitions for the selected case will be shown below the Petri net see figure G 5 if such a time is recorded for the selected case G 1 4 Export throughput times The first two steps of this use case are again the same as the first two steps of the previous two use cases At that point the advanced settings are shown as displayed in figure G 2 3 User sets non conformance option Since this use case is about the throughput time a process related KPI set the non conformance option for process metrics to the required value In our example we will use All measurements taken The throughput times of all cases fitting or not will then be used for calculation of the statistical values of the throughput time of the process Press the Apply changes button to proceed When the plug in returns to the general options window figure G 1h press the Start analysis button 80 Appendix G Test cases Log Traces Tireopel Process information My gt Total number selected 1 cases Number fitting 1 cases Arrival rate 0 0 cases per second ghp avg 8760 0 min 8760 0 max 8760 0 stdev 0 0 fast 25 0 0 slow 2 0 0 norma 8760 0 E Analysis Perfo hoo o o o hoo ooo 1001 hoos hoos hoos hoos 1010 1012 1013 4 Update Invert Selection 4 rs ey Zoom 130 1 Performance information of the selected transitions Waiting time Sel
38. by all event logs that serve as input to process mining 24 i Each event refers to an activity or task i e a well defined step in the process for instance contact customer ii Each event refers to a case i e a process instance e g an insurance claim iii Events are totally ordered Any information system that uses a transactional system will offer this information in some form However recorded events often also contain additional data This can for instance be the resource that handled the activity or the timestamp at which the event occurred When these data are used more information can be distilled from event logs than merely a process model For instance in order to determine the organizational model or to analyze the social network of a process the following requirement has to be met as well iv Each event refers to an originator a person which executes the task It will be clear that to be able to deduce basic performance indicators e g arrival rate of cases the mean waiting time of an activity the mean throughput time of a case from Chapter 2 Process Mining event logs a time dimension must be present Therefore to be able to derive performance information from event logs the following requirement will have to be met as well v Each event refers to a timestamp i e the time at which it was recorded The example log in table 2 1 meets all the requirements as described above 2 3 Th
39. cases where the user can determine which percentage of cases is fast slow or normal If a statistical distribution is required then the user can make use of exporting to comma separated lists These lists can be used by other tools for instance Statgraphics to derive the statistical distribution The functionality that will be implemented is therefore considered to be sufficient Probabilities will be shown at XOR splits within the Petri net so the routings that are followed by cases can easily be determined When there are many KPIs to monitor then the use of views can be very useful but in our situation it is not worth the burden This can be very useful for detection of the root causes of problems but it requires a large effort and goes beyond performance analysis It may be added to the ProM framework in the future though PDF and HTML reports can be very useful but this functionality is not essential and costs much time to implement Exporting performance information to comma separated lists is deemed to be more useful and will be implemented It can be used to export performance information which can then be used in other tools e g to get a statistical distribution 58 Appendix C Motivations Functionality Motivation Sequence and Pattern Diagrams Dotted charts Sequence and pattern diagrams provide insight in how work is normally transferred within business processes but also about w
40. ce related KPIs and the time between two transitions can be viewed in a similar manner as the values of activity related KPIs At step 4 of the main path the plug in provides a way to select places and transitions When the user selects a place the values of the place related KPIs of the selected place are displayed When the user selects two transitions the time between these two transitions is displayed E 1 3 View time between two transitions for one case only Brief Description Instead of the statistical values for the time between two transitions measured over all re played cases in the log also the time between two transitions for one case only can be viewed Preconditions A Petri net a log and a relation between event classes in the log and transitions in the Petri net can be accessed through the ProM framework Main Path 1 User starts the plug in 2 Plug in requests the user to set the non conformance option that defines how the plug in should deal with failing transitions 3 User sets the non conformance option 65 Appendix E Use cases Plug in provides a way to select two transitions User selects the transitions s he is interested in Plug in provides a way to select cases Sl E ON User selects a case 8 The plug in shows the time between the two selected transitions for the selected case Postcondition The time between the selected transitions for the selected case is displayed provided
41. ch such process instance describes one transaction flow To be able to display process instances in sequence diagrams however we also need classification roles or components Good options for components can be tasks how is work transferred between tasks originators how do originators communicate i e how is work transferred between originators or departments how is work transferred between departments The typical place where such information is stored in logs is within audit trail entries In an audit trail entry information about one event is stored e g who performed it which task is involved The communication that should be mined is the transfer of work between instances of data elements For instance when the data element i e the component is Originator then instances thereof can be Jack or Paul or another person These persons are then the components that communicate Let us look at an example Suppose we have a log as in table When we examine the communication between departments for these traces it may be possible to derive a sequence diagram as the one displayed in figure The sequences that visualize case 0 and case 2 bare strong resemblance The sequence that visualizes case 1 is different from the beginning 18 Chapter 8 Process Performance Analysis Case 0 TaskID Originator Department Timestamp Receive order Nick Sales 1 2006 0 15 00 Send production request Nick Sales 1 1 2006
42. contain start and complete events transitions In such case execution times can be calculated in a regular fashion Since we do not know when the activity was scheduled we can at best calculate an upper bound for the waiting time and the sojourn time of the activity This can be done by pretending that a schedule did take place at the moment that the last activity that is a direct precessor of this activity was completed The current activity may actually 88 Appendix H User manuals have been scheduled at every moment after this moment and before the start event The precise moment is unknown It is not always possible to calculate upper bounds for all activity metrics however When for instance in a log only complete events exist only an upper bound for the sojourn time which then naturally is also an upper bound for the waiting time and execution time can be calculated On the other hand if only start events occur it is possible to calculate upper bounds for all activity metrics but in this case the upper bounds of execution time and sojourn time of each activity do overlap with the upper bounds of waiting time and sojourn time of its direct successor s To show the difference between normal i e exact values and bound values for activity metrics bound values are displayed in red 3 Getting Started This plug in requires a Petri net and an associated log as input Associating a log to a Petri net can be done in the follow
43. cribed in section 3 4 1 should be implemented in a separate plug in This type of functionality appears to be very promising since communica tion patterns can give the user a clear view of what is normal and what is exceptional behavior within a business process It can also assist in detecting the causes of certain behavior The plug in requires an event log in standard MXML format as input To be able to define re quirements on this plug in we need to understand how sequence and pattern diagrams can be derived from an event log 4 3 1 Deriving Sequence Diagrams In a sequence diagram the process instances that occur in a log can be displayed as transactions i e sequences which represent the transfer of work between data element instances over time Such sequences can of course only be derived after a data element has been selected The user should have the possibility to select this data element out of a list that contains all data elements that appear at the audit trail entry level in the log Examples of data elements are task originator department etc To be able to determine when during an execution of a process i e when in a process instance there is a transfer of work from one instance of a data element to another the relations between the events in that process instance are needed Such relations can be obtained in a similar manner as is done by another plug in of the ProM framework the multi phase miner This plug in estab
44. d to the wanted value In our example we will set this value to 300 seconds All places with an average waiting time equal to or below this value will then be given the color of the low level after log replay At Medium level settings set the upper bound to the wanted value In our example we will use 1200 seconds All places with an average waiting time equal to or below 1200 seconds and above 300 seconds will then be given the color of the medium level All places with an average waiting time above 1200 seconds will be given the color of the high level For this test case we will keep the standard colors these can be changed by pressing the Change button behind the color that you wish to change Note that instead of the Use manual settings option also the Use auto settings option could have been chosen In that case the plug in would set the bound values for the waiting time levels in such a manner that after log replay approximately one third of the places would belong to each of the waiting time levels In the windows of figure also general options can be set The following general options are available e Time measured in the time unit in which the values of time related KPIs are 75 Appendix G Test cases given The standard value of this option is seconds other possible values are milliseconds minutes hours days weeks months 80 days and years 365 days e Decimal number the number of decimal places that th
45. data that is known about the event can be stored e g the department at which it was handled To bridge the gap between the log formats of the various information systems and the standard MXML log format another tool the ProM import framework 5 was developed This tool makes it possible to convert the logs of the various types of information systems to the standard MXML log format Chapter 3 Process Performance Analysis 3 1 Introduction Process Performance Analysis PPA is closely related to disciplines as Performance Measure ment Business Performance Management Process Monitoring and Business Activity Moni toring BAM PPA supports organizations in measuring and analyzing the performance of their business processes Although the term performance is commonly used in literature few people agree on what it actually means Definitions of the term performance have appeared e g in 8 4 11 We adhere to the definition from 3 namely the degree in which an organization carries its objectives into effect The performance of an organization can be measured through quantitative measurable in dicators These are commonly known as Performance Indicators PIs There exist many different performance indicators which can be time related e g throughput time of a process service time of an activity cost related e g process costs material costs or quality related e g visiting frequencies error rates For every
46. detecting the causes of possible deviations in the values of the monitored KPIs After discussion with other researchers at the BPM section the functionality that was found most suitable for achieving this was selected The selected functionality and the requirements placed thereon are discussed in chapter 4 of this document Eventually our research has led to the implementation of two plug ins which have been added to the ProM framework and can be used for analyzing the performance of business processes The Performance Analysis with Petri net plug in is a plug in which derives the values of certain KPIs from event logs by using a process model a Petri net This process model is also used as an intuitive interface in which problem areas can easily be spotted In section 5 2 of this document the design and implementation of the plug in was discussed The Performance Sequence Diagram Analysis plug in assists in Process Performance Analysis by visualizing the transfer of work between the components that collaborate in a business process This can be particularly useful for detecting the causes of certain rare or unwanted behavior In section 5 3 of this document the design and implementation of the plug in was discussed The overall conclusion that we can draw is that this project has resulted in the extension of the ProM framework with plug ins that enable the analysis of the performance of business processes Therefore we can estab
47. different kind of organization different per formance indicators are of importance A restaurant that has Michelin stars for instance will most likely find the quality of the food it serves of greater importance than the costs A fast food restaurant on the other hand will focus more on the costs and less on the quality of the food The indicators that best represent the mission and strategy of an organization are called its Key Performance Indicators KPIs They reflect the critical success factors of the organization and help the organization to define and measure progress towards its organizational goals I5 The well known Balanced Scorecard BSC as described in IO is often used to translate the mission and strategy of an organization into a comprehensive set of KPIs The BSC results in a set of planned values for the KPIs of the organization The actual values of these KPIs can then be monitored and analyzed to evaluate the performance status of the organization The BSC divides KPIs over four perspectives Financial Learning and Growth Customer and Internal Business Processes PPA focuses on the Business Process Chapter 8 Process Performance Analysis perspective of the organization KPIs based on this perspective help to determine how well business processes are contributing to achieving the goals of the organization Process monitoring tools are tools that support PPA by monitoring the actual values of the KPIs of business
48. e the user with a means to obtain the values of a selected group of key performance indicators KPIs out of an event log The values of the selected KPIs are obtained by means of the log replay method When defining the selected KPIs we first need to consider this method 4 2 1 Log replay The input of the log replay method is a Petri net model and an event log in MXML format In addition a relation R must be given between the event classes that are registered in the log and the transitions in the Petri net An event class encloses all events that refer to the same WorkflowModelElement i e task activity and the same Event type e g start complete The relation R is needed to be able to replay the cases that are registered in the log in the Petri net The ProM framework provides two ways to establish such a relation e The Petri net is mined out of the event log by a mining plug in of the framework When this is the case a relation between the mined transitions and the event classes in the log is automatically established by the mining algorithm e The Petri net is loaded into the framework in combination with the event log When this is the case the user has to establish which transitions in the Petri net should be related to which event classes in the log An event class can be related to zero or more transitions Likewise a transition can be related to zero or more event classes Transitions that are related to no event classes
49. e ProM framework The ProM framework is a framework that was developed to support the various forms of process mining It has been designed to facilitate the implementation of new algorithms and techniques in the field of process mining 26 New functionality can be added to the framework by means of plug ins A plug in is a piece of software or functionality that can be added or removed without affecting the total framework functionality 14 Plug ins can be seen as black boxes with well defined interfaces In figure 2 2 as presented in 17 a graphical interpretation of the ProM framework is shown At start up all plug ins are loaded into the framework after which they are available to the ProM Framework E Plugin Loader ee eee Plugin Interface Plugin Activator Provided Objects GUI Figure 2 2 The derived process model in Petri net notation Chapter 2 Process Mining user through a graphical user interface The ProM framework provides high level access to so called Provided Objects e g log files or graph structures like Petri nets These provided objects can serve as input and or output to the plug ins The ProM framework distinguishes five different groups of process mining plug ins 26 Mining plug ins which implement algorithms that mine models from event logs e g the a algorithm that distils a process model in Petri net notation from event logs Analysis p
50. e between transition u and t T c u t is equal to IMIN wer A case x c trans x w time x MIN wer A case x c trans x t time ax Based on the time between u and t for all cases in which both transitions fire at least once 28 Chapter 4 Requirements statistical values for the time between the transitions should be calculated Furthermore the number of cases in which both transitions u and t fire at least once during log replay should be available to the user Note that for each case at most one time between two transitions is calculated 4 2 5 Activity related KPIs In event logs an activity is often represented by multiple events with different event types Because of this various activity related KPIs can be discovered Waiting time of activity a The time between schedule and start events of activity a For each activity a the waiting time of all registered executions of a should be calculated This is done as follows Each element x L where act x a and type x schedule is compared with the nearest following element x L for which act x a and type x start When in list L in between x and wz no element x L is registered for which act 2 a and type x schedule holds then the waiting time of this execution is defined to be time x time z Note that we assume here that an activity cannot be executed in parallel with the same activity Statistical
51. e values of KPIs can have at most The standard value of this option is 2 In this test case we will use the standard general settings 4 Plug in requests the user to set the non conformance option that defines how the plug in should deal with failing transitions The plug in uses standard non conformance options but these can be viewed and changed through the window that appears after pressing the Advanced Settings but ton 5 User sets non conformance option Press the Advanced Settings button The window as depicted in figure G 2 will appear 4 Advanced Performance Analysis options Advanced Settings Values of various performance metrics can be influenced by non conformance These are the settings that are currently in use Process metrics are based on All measure aken in fitting Place metrics are based on All measurements taken in each trace before the trace fails Times in between two transition are based on All measurements taken in each trace before the trace fails z Activity metrics based on All measurements taken in each trace before the trace fails x Saving time Log replay can be a very time consuming method especially when the used Petri net contains invisible transitions Use the time saving option below if it costs too much Note however that some performance metrics e g place metrics are calculated based on the structure of the used Petri net and might be influenced by
52. e waiting time Preconditions A Petri net a log and a relation between event classes in the log and transitions in the Petri net can be accessed through the ProM framework Main Path 1 User starts the plug in 2 Plug in requests the user to set bounds and colors for the low medium and high waiting time level of places 3 User sets bounds and colors for low medium and high waiting time level and in this manner defines when a place is considered to be a bottleneck i e has a high waiting time level 4 Plug in requests the user to set the non conformance option that defines how the plug in should deal with failing transitions see also section 4 2 6 5 User sets the non conformance option 6 Plug in provides a visualization of the input Petri net but enhanced in such a way that places are colored according to their average waiting time Postcondition A Petri net visualization is available in which places are colored according to their waiting time level Places with a high waiting time level are considered to be bottlenecks Alternative Paths e At step 3 instead of setting bounds values and colors the user can let the plug in choose the bound values and colors Approximately one third of the places is contained in each waiting time level when the plug in uses these standard settings This does not affect the remainder of the path e After step 6 the user has the possibility to adjust the bounds and color
53. ected Frequency 1 cases 2 Tima in between seconds E High Transition Ax complete z avg 6540 0 E Medium ee min 6540 0 E Low E max 6540 0 Transition Tireopel start bd ett mo fast 25 00 0 0 Figure G 5 Screenshot Two transitions selected in main window Plug in shows process related KPIs including the statistical values for the throughput time On the right side of the window that appears similar to figure G 3 the process related KPIs are depicted User requests to export the throughput times Press the Export time metrics button that is located below the table in which the statistical values of the throughput time of the process are shown Plug in asks for a filename The window as depicted in figure appears The user provides filename At File Name fill in file or another name of your liking In this window below the filename also the file extension txt or csv can be set In this example we keep the standard setting for the file extension csv Press the Export button Plug in creates a comma separated file which contains the throughput times of all cases in the log 81 Appendix G Test cases i 6 6 6 l wl fra les s o Look In My Data Sources MA es 8j c cca File Name Files of Type csv x m Export Cancel Figure G 6 Screenshot Export window The plug in creates a comma separated
54. ed KPIs and bound values for activity related KPIs are affected by this A possible solution may be to perform log replay in heuristic nets instead of in Petri nets Heuristic nets are process models as well but they do not contain places nor invisible transitions or tasks Because the concept of places does not exist the place related KPIs will be lost when using heuristic nets and bottleneck coloring cannot be done then On the plus side the bound values of activity related KPIs can be calculated more accurately when using heuristic nets and results will likely be found faster Further research will be needed to see if using heuristic nets is feasible and useful Another issue that requires attention is the consumption of a token from each input place at firing of transitions during log replay When an input place contains multiple tokens the log replay method selects a token from this place pseudo randomly In real life situations selection often is not done randomly at all The First In First Out FIFO principle is for instance frequently used instead Certain statistical values of KPIs can be affected by the used selection method e g the minimum maximum and standard deviation value of the sojourn time of a place It may be worthwhile to extend the log replay method with support for other selection methods such as FIFO The user can then be given the choice on which principle to use for each place 44 Chapter 5 Design
55. els In these tools a user defined process model is used as an interface through which performance information can be accessed An advantage of the ProM framework is that next to user defined process models also process models that are mined from event logs can be obtained and used for this purpose Discussion with other researchers and investigation of the academic tool EMiT led us to believe that process models next to being used as a user interface can also be used as a means to derive certain performance information Functionality that supports process performance analysis 48 Chapter 6 Evaluation and that makes use of a process model was presented in sections 3 2 2 and of this thesis Next to answering the research questions our research aimed at finding more functionality that could assist in analyzing the performance of business processes Research of related tools IBM Web Services Navigator and discussion with other researchers at the BPM section led to new ideas for functionality in this area The sequence and pattern diagrams as presented in section 3 4 1 and the dotted charts as presented in section 3 4 3 were for instance found this way The main conclusion that we drew from our research was that tools that support process performance analysis and thus also our plug ins should mainly serve the following purposes e Monitor the performance status of processes by keeping track of the values of KPIs e Assist in
56. ent e Participants in the business process Each of these function roles has its own requirements for the provision and preparation of information Senior Management for instance is more interested in global information such as the evolution of the performance of one of the business processes Participants on the other hand need information to be highly detailed i e participants need information concerning the specific task s in which they are involved Some monitoring tools such as ILOG JViews support this by providing different views for different users 3 2 4 Reports Commercial monitoring tools often offer extensive reporting functionality With this func tionality the user can easily create reports that contain relevant performance information and distribute these reports to others Some tools can even generate reports automatically by creating them at predefined moments e g on a daily or weekly basis Possible output formats for process monitoring reports include e Internet compatible pages in HTML 14 Chapter 8 Process Performance Analysis e Platform independent PDF documents e XML files which allow for further processing e Comma delimited text files which can be used by other programs e g MS Excel StatGraphics 3 3 Academic Process Monitoring Tool There also exist academic tools that support PPA EMiT developed at Eindhoven University of Technology is such a tool EMiT is a tool which can derive
57. epartment being active It is thus clear that there are two different patterns in this diagram These patterns are displayed separately in the diagram in figure 3 8 along with their corresponding frequency i e the number of sequences that follow the pattern The communication patterns between data elements can be useful for the analysis of processes With these patterns common behavior can easily be distinguished from rare behavior If also additional performance information is included such as the average throughput time then wanted behavior patterns with a low average throughput time can be distinguished from unwanted behavior patterns with a high average throughput time as well The discovered patterns can also assist in finding the causes of certain behavior e g a large period in which a specific data element instance is active Measures may then be taken to prevent unwanted behavior from occurring in the future and to stimulate wanted behavior For example suppose there is a pattern with a high average throughput time in which work is transferred from originator John to Ben If there is also a pattern where instead work is transferred from John to Mike and which results in a low average throughput time then it may be worthwhile to request John to transfer work to Mike rather than to Ben if possible 19 Chapter 8 Process Performance Analysis 0 00 1 00 Time 2 00 3 00 4 00 case 2 case 1 l
58. epresent the transfer of work between data element instances over time Each sequence in a sequence diagram is displayed in a different color and sequences can overlap i e a data element instance can be active in more than one sequence at the same time A sequence diagram has two dimensions I the set of all data element instances in S i Is So E S The data element instances that occur in this set are displayed as lifelines along the horizontal axis T the time that passes from the beginning of the first sequence in S until the end of the last sequence in S The time dimension is displayed along the vertical axis The time axis begins at MIN eB Asc S Thegin b and ends at MAX eB Asces Tena b It should be possible for the user to view a sequence diagram that contains all sequences that can be derived from the input event log Furthermore for each sequence so S the following information should be accessible e Throughput time of s Time between the beginning and end of the sequence so This can be defined as T 80 MAXpep Tena b MINoes Tbegin b 35 Chapter 4 Requirements For each block b B also the following should be available e Time in block b Time between begin and end of block b this can be defined as Tend Thegin b For each arrow ar AR also the following should be available e Time in arrow ar Time between begin and end of arrow ar This can be defined as
59. er of work is displayed in a sequence diagram in which you can easily see which data elements cooperate What is more this plug in takes the sequences that visualize the transfer of work and compares them against each other to see if they follow the same pattern The found patterns are then displayed in a separate diagram the pattern diagram Here the patterns are displayed sorted based on the frequency of sequences that follow the pattern the pattern with the highest frequency if displayed at the top Furthermore in formation such as the mean throughput time of the patterns is available as well here Using this information you can easily determine which patterns appear often common behavior 97 Appendix H User manuals which appear rarely rare maybe even unwanted behavior and which patterns result in a high average throughput time and thus may indicate unwanted behavior With help of the patterns you may be able to understand the causes of this unwanted behavior much time spend at a certain data element instance perhaps This can help you with preventing the behavior from appearing in the future How to use For this plug in only an event log is required as input When you start the plug in first log relations will be derived from this log When this is done you will see a screen as the one depicted below figure H 5 E Analysis Performance Sequency Diagram Analysis Full diagram Pattern diagram Flex
60. es t are fired first After the transitions in the sequence have fired t is enabled and is fired At each transition firing that takes place during log replay the following information about the firing is stored trans Identifier of the transition that fired case time Identifier of the case that was being replayed when the transition fired The timestamp of the event that was being replayed when the transition fired 25 Chapter 4 Requirements act Identifier of the activity to which the event that is related to and replayed by the firing transition refers type The event type of the event that is related to and replayed by the firing transition failed Boolean set to true if trans failed execution for this firing Note that when an invisible transition fires no value is stored for act or type The time that is stored when an invisible transition fires is the same as the time stored for the transition that is enabled by the sequence of invisible transition s We thus assume that invisible transitions are not eager to fire i e they fire at the latest possible moment The log replay method has as output a list L that contains the information of each transition firing that took place during log replay The list L can be used for the calculation of the values of the selected KPIs 4 2 2 Process related KPIs In this section the KPIs that are related to the performance of the entire process wil
61. estamp to which ea refers until the timestamp to which ey refers e and e refer to different data element instances If this is the case we know that there is an arrow i e a transfer of work from the data element instance to which e refers to the data element instance to which ey refers This arrow starts at the timestamp to which ea refers and ends at the timestamp to which ey refers The comparison of all events of a process instance against all their closest causal predecessors results in a collection of arrows and periods of activity These arrows and blocks i e periods together form a sequence It is possible that the periods in which a certain data element instance is active in a sequence overlap In a department for instance more than one activity for the same case can be executed at the same time When overlap occurs the blocks that overlap should be combined because a data element instance is active or not and cannot be active a number of times Now we can define Definition Sequence A sequence S represents the transfer of work between the set of data element instances J that occur in process instance g over time The sequence consists of a set of blocks Bs and a set of arrows ARs where Bs a set of blocks or periods in which the data element instances in Is are active Each block b Bs has a begin timestamp Thegin b an end timestamp Tenalb and a data element instance di b associated with i
62. esult in extreme situations e g instances with extremely high throughput times What is more it can assist you in finding the causes of such behavior In the logs of today s information systems a lot of information is recorded This plug in will focus on information that is stored on the audit trail entry level of the logs In an audit trail entry information about one event is recorded Common information stored at the audit trail entry level includes e TaskID Name of the task involved in the event e Event type Type of the event e g schedule start complete e Timestamp Timestamp at which the event took place e Originator Name of the originator involved in the event Though one could also think of other information that may be stored here such as e Department Name of the department at which the event was handled e Part Name of the part of the case that is currently being handled For instance suppose the process under consideration is the production line of a car A case is then the production of one car Parts currently handled may be doors wheels windows etc e Country Name of the country in which the event took place This plug in allows you to focus on a certain data element such as taskID originator de partment and see how transfer of work between instances of the selected data element for instance in case of originator this can be Nick or John or another person takes place for each case The transf
63. ets ICATPN 2005 G Ciardo and P Darondeau LNCS 3536 2005 W M P van der Aalst and M Song Mining social networks Uncovering interaction pat terns in business processes International Conference on Business Process Management BPM 2004 2004 W M P van der Aalst and B F van Dongen Discovering workflow performance models from timed logs Lecture Notes in Computer Science 2002 W M P van der Aalst B F van Dongen J Herbst L Maruster G Schimm and A J M M Weijters Workflow mining A survey of issues and approaches Data and Knowledge Engineering 2003 W M P van der Aalst A J M M Weijters and L Maruster Workflow mining Which processes can be rediscovered BETA Working Paper Series 2002 5l Bibliography 26 B F van Dongen A K A de Medeiros H M W Verbeek A J M M Weijters and W M P van der Aalst The ProM framework A new era in process mining tool support 6th International Conference on Applications and Theory of Petri Nets G Ciardo and P Darondeau LNCS 3536 444 454 2005 27 B F van Dongen and W M P van der Aalst Multi phase process mining Building instance graphs In Atzeni P Chu W Lu H Zhou S Ling T eds Conceptual Modeling ER 2004 LNCS 8288 pages 362 376 2004 28 B F van Dongen and W M P van der Aalst Multi phase process mining Aggregating instance graphs into epcs and petri nets 2nd International Workshop on Applications of Petri Nets to Coordination Wo
64. f course also for their support beyond this project Peter Hornix Stramproy January 2007 Contents 1 Research Assignment 1 1 Assignment Background 0 00002 e ee 1 2 Problem Area 1 3 Assignment definition scs kn dons ooh ee OEE oe ee Bk ee a eg 1 4 Outline 2 Process Mining 2 1 Introduction 2 2 Requirements 2 3 The ProM framework 2 e 3 Process Performance Analysis 3 1 Introduction 3 2 Commercial Process Monitoring Tools 2004 3 2 1 Graphs and tables 3 4 gcs p ecetes OE oe ee Be ee a A ee 3 2 2 Process model 3 2 3 Views 3 2 4 Reports 3 3 Academic Process Monitoring Tool 2 22004 3 4 Other Process Performance Functionality o a a 3 4 1 Sequence diagrams 00 2 ee ee ee 3 4 2 Process model related functionality 2 3 4 3 Dotted charts 4 Requirements 4 1 _Introduction 4 2 Performance analysis with process model ooa a a 4 2 1 Log replay 4 2 2 Process related KPIs 0 0 0 0 0000 a 4 2 3 Place related KPIs 0000000000000 eee 4 2 4 Time between two transitions 0 0 000000 2 eee 4 2 5 Activity related KPIs 0 200202020004 4 2 6 Failing transitions 0 0 a 4 2 7 _User interface ii aAow Ww OF Fe 10 10 13 13 14 15 17 17 21 21 23 23 24 24 26 27 28 29 30 31 Co
65. file with name file which contains the through put times of all cases in the log Comma separated text files can be loaded into other programs such as Microsoft Excel or Statgraphics and can be used for further analysis G 2 Performance Sequence Diagram Analysis All use cases of the Performance Sequence Diagram Analysis plug in have the same precondi tion A log can be accessed through the ProM framework We assume the precondition is met at the start of each test case In the remainder of this section we will address the use cases of appendix E 2 one by one to verify whether they can be performed with the Performance Sequence Diagram Analysis plug in G 2 1 View throughput time sequence 1 User starts the plug in Start the plug in by first selecting Analysis from the menu bar of the ProM framework and then selecting the Performance Sequence Diagram Analysis item from the menu that appears 2 Plug in provides a list of the available data elements that appear on the audit trail level in the log and requests the user to choose one At start up of the plug in the window as depicted in figure G 7 is displayed On the left side of this window options can be set 3 User selects a data element At Component type select the data element that your are interested in In our example 82 Appendix G Test cases 6 Analysis Performance Sequency Diagram Analysis Pattern diagram Flexible equivalent S
66. for instance 0 20 and the probability that the token is consumed by transition B is 0 80 e Number of observations The number of recorded values of each time related KPI 3 4 Other Process Performance Functionality Next to the functionality available in process monitoring tools also functionality that may assist in PPA and is contained in other tools was found Furthermore some completely new functionality that supports PPA was discovered In this section we will describe this additional functionality 3 4 1 Sequence diagrams IBM Web Services Navigator WSN as described in I is a tool that was created to foster a better understanding of service oriented architecture SOA applications These days systems often consist of components e g servers which communicate with each other WSN is a tool that mines communication between servers from logs and displays this communication within sequence diagrams see appendix B for more information about sequence diagrams In these sequence diagrams the components that communicate services are put along the horizontal axis as classifier roles Along the vertical axis time is measured increasing from top to bottom The time can be displayed as conventional time seconds minutes etc or as logic steps In a sequence diagram a set of transactions or scenarios which are separate flows of connected arrows and vertical blocks is displayed An arrow in a transaction represents commu
67. h as for example Microsoft Excel or StatGraphics for further analysis 6 Adjusting Settings When you press the Change Settings button below the waiting time legend a window will appear which looks quite similar to the Initial Settings window Here you can set adjust the following options The timesort i e time unit in which metrics are displayed milliseconds seconds minutes hours days weeks months years e The accuracy of the displayed metrics i e the number of decimal places that a metric can have at most e The values of the bounds between the waiting time levels e The colors corresponding to the waiting time levels If a place was selected when you pressed the Change Settings button then you can also choose to apply the filled in bound values and selected level colors to all places in the Petri net or to apply them only to the selected place Furthermore you can press the Advanced Settings button after which you can specify in the window that pop ups how the plug in should deal with non fitting traces 96 Appendix H User manuals H 2 User manual Performance Sequence Diagram Analysis Introduction The purpose of the performance sequence diagram plug in is to provide you with a means to assess the performance of your processes This plug in can especially be of help if you want to know what behavior in your processes is common what behavior is rare and what behavior may r
68. h value for the average waiting time of task Sell when it is performed by originator John The user can select the corresponding bar the highest one in the diagram of figure 3 2 and choose to view the statistical distribution instead of the average value for this KPI After this the user can select the cases for which the waiting time when task Sell is performed by John is the highest and view the routings of these cases If the user compares these routings with the routings that cases with normal values for this KPI follow then he can easily see if odd behavior has occurred e g strange routings followed and perhaps prevent such behavior from occurring in the future 3 2 2 Process model In some tools such as ILOG JViews 9 performance information concerning a business process can be accessed through a user defined process model that represents the business process In such a process model the user can select objects i e a specific task in the model or the whole model After selection of an object performance information that is related to that particular object is displayed In the example in figure 3 4 the user has selected the task 1st Level Support The shown information is the change in the number of pending work items over time for this task displayed in a graph Using the process model as a user interface makes it easy to spot bottlenecks within a business process and to detect the causes thereof A bottlenecks is a po
69. have the WF net of figure 3 5 and during replay of a case in this net transition D Complete first fires then transition E Complete and after that transition G Schedule For p the place that is located between D Complete and G Schedule the sojourn time in this example is the time between the firing of D Complete and the firing of G Schedule The synchronization time of place p is then the time between the firing of D Complete and the firing of E Complete The waiting time of place p is the time between the firing of E Complete and the firing of G S chedule In EMiT the user can summon the values of the KPIs that are related to a place by selecting the place within the Petri net In the example in figure the user has selected the place between D Complete and G Schedule The KPIs are displayed below the workflow net Next to the already described KPIs also the following KPIs are available in the EMiT tool e Process flow time Throughput time of the process i e average minimum maxi 16 Chapter 8 Process Performance Analysis mum variance in time that cases spend in the process e Probabilities At each outgoing arc of XOR splits i e places that are input to more than one transition the probability that a token in the XOR split is consumed by the transition to which the arc leads Suppose for instance that place p is input to two transitions A and B The probability that a token in place p is consumed by transition A is
70. he place At least three waiting time levels should be distinguished low medium and high The user should have the possibility to change the bounds of these levels as well as the color that corresponds to each of these levels Coloring places can help the user to discover bottlenecks because bottlenecks are typically locations in the process where a lot of time is spend Next to this at each outgoing arc of places that are XOR splits the probability should be displayed that a token in the place is consumed by the transition to which the arc leads 31 Chapter 4 Requirements This can help to determine the causes of certain behavior because the probabilities give an indication of the routings that cases follow within the process model The Petri net should also be used as a means to access the values of KPIs It should be possible to access the values of the place related KPIs of a certain place by selecting the place within the Petri net The time between two transitions should be accessible by selecting the transitions in the Petri net When a transition is selected the activity related KPIs of the activities to which related event classes of the transition refer should be displayed Furthermore the user should also have the possibility to select and deselect process instances All KPIs should be based only on the selected process instances 4 3 Sequence Diagram functionality The sequence and pattern diagrams as des
71. hich behavior is rare What is more these diagrams and the additional information e g average throughput time can assist in detecting the causes of rare or unwanted behavior This information may be used to prevent such behavior from occurring in the future Dotted charts can give the user an idea on how data elements are related and can therefore help to discover the causes of certain extreme behavior Furthermore this functionality is unique it is not present in any other tool However these charts can be quite useless at times as well and this functionality of course also requires some effort to implement That is why this functionality will not be implemented 59 Appendix D User Requirements In this appendix the original requirements placed upon the Process Performance Analysis functionality are listed along with their priority Priorities have been assigned to requirements based on the relevance of the functionality The following priority levels exist 1 the highest priority level requirements of this priority level must be met 2 the middle priority level requirements of this priority level should be met 3 the lowest priority level requirements of this priority level may be met Functional requirements URCAR Performance Analysis with process model functionality Number Requirement Priority URCAR 1 The input Petri net must be used as a user interface 1 URCA
72. hich choice is dependent can be summoned for each XOR split this can be of great use when trying to detect the causes of extreme values of certain KPIs e g all orders bigger than a thousand pieces follow a certain route in the process model which always results in a high throughput time In case Petri nets are used as process models as in EMiT a feature may be added that allows users to select two transitions in the Petri net The time average minimum maximum and variance that elapses between the firing of the one transition and the firing of the other transition can then be displayed along with the number of observations i e the number of times both transitions fire The user can then derive time related KPIs himself For instance in the situation where in the Petri net a task is represented by a start transition and a complete transition the user can select these two transitions and the time between the firing of these two transitions will be given This time in fact is the service time of the corresponding task where the service time of a task is the time in which the task is actually being executed Another option here is to allow the user to fill in bound values for the time that may elapse at most and the time that should elapse at least between the firing of the two transitions After filling in these bound values the number of cases the average and the min max var time between the transitions and the distinguishing case pro
73. his example are Customer Bank Account Ledger and Account After the request of the customer to the bank the bank becomes active The bank requests the account of the customer from the account ledger which becomes active and sends the account to the bank The bank then gets the balance from the account which it then sends to the customer who can examine it After having handled the request the bank stops being active 59 Appendix B Sequence diagrams Figure B 1 Sequence diagram of a process at a bank 56 Appendix C Motivations It is not possible for us to implement all the functionality as described in chapter 3 of this document That is why decisions had to be made about which functionality to implement and which not These decisions were made based on a contemplation between the Pros and the Cons of each type of functionality In the table below the functionality has been listed guided with a motivation why it should or should not be implemented Functionality Motivation Key Performance The most relevant KPIs which can be distilled from most event logs Indicators will be implemented Adding more KPIs will make the functionality more useful but will also result in an increase in the amount of effort needed Furthermore extra restrictions on the log may be required to be able to derive the values of these additional KPIs Using Petri A process model gives a clear overview of the process The user can net
74. hod extends the LogReplayMethod with performance analysis functionality The results of the logReplay method are obtained as a PerformanceLogReplayResult which provides access to the data that has been collected As can be seen in figure the PerformanceLogReplayResult consists of an extension of the input log and an extension of the input Petri net with performance information Methods that start with getValues return the statistical values of the time related KPI in question Note that activity related KPIs can only be accessed through an ExtendedTransition Each ExtendedTransition is related to zero or more Activity objects A visualization of the Petri net is available through the ExtendedPetriNet class and can be displayed in the GUI of the plug in 42 Chapter 5 Design and Implementation Processinstance reset hasNext next LogReader getPlace reset getPNEdge hasNext getTransition next ProM framework ExtendedPNEdge ExtendedProcessinstance ee O ee geiFirstFiringTimes i hasCompleted hasFailed getValuesWaitingTime setThroughputTime 7 getValuesSojournTime get ThroughputTime ab getValuesSynchronizationTime getArrivalTime ExtendedLogReader PerformanceLogReplayResult getValues ThroughputTime getValuesExecutionTime getValuesSojournTime getBoundValuesWaltingTime getBoundValuesExecutionTime getBoundValuesSojournTime Performance Analysis with
75. ible equivalent Strict equivalent Show diagram _ Fitter options Figure H 5 Screenshot of the Performance Sequence Diagram Analysis plug in at start up On the left hand side of this screen there are a number of options that you can set At the top you can choose the data element that you wish to use Note that the types available here are dependent on the information contained in the log Clearly if departments are not mentioned in the log then the department data element type will not be available The second option available is the time sort here you can select the unit in which the available times should be displayed Time sorts available here are milliseconds seconds minutes hours days months 30 days and years 365 days The last setting that is available is the pattern type You can choose between using the flexible equivalent or the strict equivalent pattern type The difference between the both is that when using the strict equivalent pattern type sequences are defined to be different if data element blocks do not start and end in exactly the same order When using the 98 Appendix H User manuals flexible equivalent pattern type this is not needed Instead for each block in the pattern the requirement is set that all blocks that begin after the block ends should begin after the block ends in the compared sequence as well Let us look at an example figure H 6 Figure H 6 Example of
76. iling transitions At least the following options should be available 30 Chapter 4 Requirements Event types in log Exact values for Upper bound values for schedule start complete sojourn time waiting time execution time schedule start waiting time sojourn time execution time schedule complete sojourn time start complete execution time sojourn time waiting time schedule sojourn time start sojourn time execution time waiting time complete sojourn time Table 4 1 Values of activity related KPIs e All measurements are used for calculation of values of KPIs e Only measurements taken in fitting cases are used for calculation of the values of KPIs A case fits if no transition fails execution during log replay of the case e All measurements taken in fitting cases as well as all measurements taken for non fitting cases that are taken before a transition fails execution during replay of the case are used for calculation of the values of KPIs 4 2 7 User interface User friendliness is of high importance for this plug in The plug in should help the user to spot problems within processes in an intuitive manner One way in which this is established is by using the input Petri net as an interface The plug in should extend the input Petri net by giving the places in the net a color that corresponds to the average waiting time of t
77. in turn is a list that contains the audit trail entries that belong to the process instance An audit trail entry is a data structure in which information about one event is stored This data structure encompasses e WorkflowModelElement the name of the task that was involved in the event TaskID e Event type for each task different types of events can occur schedule occurs when the task is scheduled i e is ready to be executed re assign occurs when the task is re assigned to a resource start occurs when the task actually starts execution complete occurs when the task is completed i e the execution of the task ends suspend occurs when the task is suspended i e execution is halted 7 Chapter 2 Process Mining resume occurs when a suspended task starts executing again i e is resumed skip occurs when a task is skipped manually or automatically withdraw occurs when a task is withdrawn i e decision was made that the task will not be executed abort occurs when a task is aborted i e execution of the task is stopped and will not be continued normal Some information systems do not keep track of the type of event that occurs so when the event type is unknown it is said to be normal e Originator optional the person or resource that handled the event e Timestamp optional the moment at which the event occurred e Other data optional here any other
78. ing way First open the log file in MXML format Then import the process model via choosing File Open process format file With log from the menu This process model must either be available in some Petri net format a tpn or pnml file or in another modeling paradigm that can be read by ProM such as an EPC or a YAWL model which can then be converted into a Petri net by dedicated conversion plug ins within the framework Another possibility is of course to mine a process model from the event log using one of the mining plug ins When a Petri net and an associated log have been provided and the plug in has been started you will first see a screen where you can specify the initial settings In the example below the user has chosen to have all values of the time related KPIs to be given in days and to have the KPIs given with an accuracy of up to five decimal places Instead of days the user could have selected milliseconds seconds minutes weeks months 30 days and years 365 days The user has also specified to use manual bottleneck settings A bottleneck in a process is a place with a high average waiting time Three bottleneck levels are distinguished low medium and high and each of these levels has its own settings After calculation the used Petri net will be visualized with places colored according to their waiting time level At low level and medium level you can specify the upper bound of the levels I
79. int of congestion in a system that occurs when workloads arrive at that point more quickly than the point can handle them The inefficiencies brought about by the bottleneck often create a queue and a longer overall cycle time 7 In JViews a task is colored red if it is considered to be a bottleneck else it is colored yellow perhaps a bottleneck blue probably not a bottleneck or white certainly not a bottleneck In case the user selects a task that is in fact a bottleneck the shown information that belongs to the task may well assist in detecting what causes the task to be a bottleneck 3 2 3 Views Many people are involved in monitoring the performance of business processes this includes e Senior management 13 Chapter 8 Process Performance Analysis 2 Workflow Cockpit Demo lol feed Workflow Cockpit Table View lol x File View Help name a yi Customer Request BR S a amp l AT Automatic Processing Hist Level Support A Dispatching Guru Support www ilog com Java Applet Window H Workflow Cockpit Chart iew Number of Pending Work Items o Ast Level Support 2 4 5 6 7 8 9 10 14 12 13 14 15 16 17 18 19 20 21 Time Ticks lava Applet Window B activity Automatic Processing o 2nd Tier Support Customer Request 1stLevel Support ava Applet Window Figure 3 4 Example of ILOG JViews in which a process model is used as UI e Middle managem
80. ion Pattern Diagram a pattern diagram is a diagram in which a set of patterns P is displayed Each pattern is displayed separately from the other patterns The pattern with the highest frequency is displayed at the top of the diagram and the one with the lowest frequency at the bottom A pattern diagram has two dimensions rT the set of all data element instances in P i Ip p P The data element instances that occur in this set are displayed as lifelines along the horizontal axis the time dimension For each pattern p the time starts at 0 and ends at MAXvep Tena b MINgep Tregin b 39 Chapter 4 Requirements The user should have the possibility to view the pattern diagram which contains all different patterns that occur in the sequence diagram Furthermore for each pattern the following information should be available e Throughput time of pattern p The time between the begin and end of sequences that belong to p The following statistical values are calculated for the throughput time of pattern p AVGscs T s MINses T s MAXses T s the maximum throughput time STDEVses T s the standard deviation in throughput time the average throughput time the minimum throughput time For each block b Bp the following information should be available e Time in block b average duration of block b Tena b a Tegin b For each arrow ar AR the following information should be av
81. ion of the selected period 85 Appendix H User manuals This appendix contains the user manuals of the Performance Analysis with Petri net plug in and the Performance Sequence Diagram Analysis plug in H 1 User manual Performance Analysis with Petri net 1 Introduction The purpose of the Performance Analysis with Petri net plug in is to provide you with a means to assess the performance of your processes Main focus here is to provide you with Key Performance Indicators KPIs which can be summoned in an intuitive way This is done with help of an event log and a process model in form of a Petri net of the process under consideration The process model that is used for this can be user defined and imported into the ProM framework or obtained by process mining By replaying the log traces of the event log in the process model the plug in retrieves the values of various KPIs When the log traces have been replayed the values of these can be accessed through the process model 2 Performance Information The log replay method also used in the conformance checker plug in is used to derive performance information from a log with help of a process model The log replay method simulates the process instances in the input log in the Petri net To be able to do this events in the log have to be associated with transitions in the Petri net During log replay every time an event occurs in a log trace one of its corresponding transiti
82. isplayed It is depicted on the right side of the window Information about each sequence throughput time of the sequence name of the process instance that is represented by the sequence can be viewed by moving the mouse cursor over the sequence The information will then be displayed in a tooltip Specific information about the arrows duration of arrow and the periods blocks duration of block of a sequence can be summoned by moving the mouse cursor over the object arrow or block of interest Analysis Performance Sequence Diagram Pattern Diagram Options Sales Production Purchase Component type Department Department Department Department w Equivalence type Fexible equivalence O Strict equivalence Show diagram Figure F 3 Prototype Performance Analysis Sequence Diagram 72 Appendix F Prototypes Analysis Performance Pattem Diagram Sequence Diagram Performance info Options Sales Production Purchase Pattern 0 Component type Department Department Department frequency 187x Department w Equivalence type Fexible equivalence O Strict equivalence Show diagram 4 22222 days 23 430 Od Figure F 4 Prototype Performance Analysis Pattern Diagram In figure an example of the GUI is shown where the pattern diagram has been selected The pattern diagram is displayed in the center of the window On the right side of the pattern diagram the statistical values of the throughpu
83. ken is added to or removed from the place after a transition fails then the waiting time of that token should not be used for the calculation of the average waiting time because it may have been influenced by the deviation that occurred before Note that selecting another option here does not influence the remainder of this test case but only the results Proceed by pressing the Apply changes button The previous window figure G 1h is then visible again In this window press the Start analysis button to continue Plug in provides a visualization of the input Petri net but enhanced in such a way that places are colored according to their average waiting time In the center of the window that appears similar to the one depicted in G 3 a visual ization of the Petri net is shown In this Petri net visualization the places are colored according to the waiting time levels Below the Petri net a legend is shown which shows which color corresponds to which level Bounds and colors of the levels can be adjusted by pressing the Settings button below the legend A window very similar to the settings window figure G 1p will then appear G 1 2 View activity related KPIs 1 User starts the plug in Start the plug in by first selecting Analysis from the menu bar of the ProM framework and then selecting the Performance Analysis with Petri net item from the menu that appears Plug in requests the user to set the non conforma
84. l be described Throughput time of the process The throughput time of a case is the total time that the case spends in the process Given replay output L and case c the definition of the throughput time T c of case c is equal to MAX wex n case x c time s MIN wer a case x c time s For the throughput time of the process the following statistical values should be calculated AVGiec T c the average throughput time MINcec T c the minimum throughput time MAX ec T c the maximum throughput time STDEV cc T c the standard deviation in throughput time AVGiow x T c AV GecCiou T c where Ciow x is defined by deC T d lt T c a oe Ho lt T00 Clow x contains the 7 cases in C that have the lowest throughput times c E Clow x The user should have the possibility to set the value of x where x lt 100 AV Ghrighly T C AV Geecyign y T c where Chign y is defined by c CE Chignly PHS Pera lt oy Chignh y contains the y cases in C that have the highest throughput times The user should have the possibility to set the value of y where y lt 100 zx z AV Gnormal T Y T c AV GecCnormarlz y De where Cnormal T y is defined by Cranial z y C Clow x U Chigh y Chormal X y contains the cases in C that are fast nor slow 26 Chapter 4 Requirements Furthermore the user should have the possibility to export the values in T c c C toa
85. lish that the goals that were set at the beginning of the project have been largely achieved 49 Bibliography 1 11 M de Pauw W and van Lei E Pring L Villard M Arnold and J F Morar Web services navigator Visualizing the execution of web services IBM Systems Journal 2005 J Desel and J Esparza Free choice petri nets Cambridge Tracts in Theoretical Computer Science 40 1995 S D P Flapper L Fortuin and P P M Stoop Towards consistent performance man agement systems International Journal of Operations And Production Management 16 27 37 1996 Nonprofit Good Practice Guide Performance http www npgoodpractice org CompleteGlossary aspx ID 1 amp curLetter P 18 8 2006 C W Gunther and W M P van der Aalst A generic import framework for process event logs 2006 M Hammer and J Champy Reengineering the Corporation A Manifesto for Business Revolution Harper Collins Publishers 1993 Investopedia com Bottleneck http www investopedia com terms b bottleneck 22 8 2006 J Jeston and J Nelis Business Process Management Practical Guidelines to Successful Implementations Elsevier Ist edition 2006 ILOG JViews Business process management visualization with ilog jviews technical whitepaper 2004 R S Kaplan and D P Norton Translating Strategy into Action The Balanced Scorecard Harvard Business School Press Boston 1996 M J Lebas Performance meas
86. lishes an instance ordering for each process instance in the log In 27 the authors describe how to obtain instance orderings Therefore we will not dig into this here What is important for us to know is that for a process instance o where g is a chronologically ordered trail of events e1 n an instance ordering gt on the domain of o Do 1 n can be obtained This ordering is irreflexive Vaep a o asymmetric Vao a1 D 40 o a1 gt a1 o Ag and acyclic Vag aic D 40 o a1 gt Ja aae D ai gt a2 gt 32 Chapter 4 Requirements gt Gy gt ag and defines how events in are interrelated This information can be used to determine how work is transferred between data element instances in When there are two events eg and ep in o and in the instance ordering a gt b appears then ea is said to be a closest causal predecessor of ep An event can only appear in a process instance after all its closest causal predecessors have appeared When determining how work is transferred between data element instances in a process in stance each event in the process instance should be compared against its closest causal predecessors When event e is compared against closest causal predecessor a a gt o b two situations can occur e ea and e refer to the same data element instance If this is the case we know that the data element instance to which both refer is active at least from the tim
87. lug ins which typically implement some property analysis on some mining result For example for Petri nets there is a plug in which constructs place invariants transition invariants and a coverability graph However there are also analysis plug ins that can compare a log and a model i e conformance testing or a log and an LTL formula Import plug ins which implement functionality that makes it possible to import open objects of a certain data format e g load instance EPCs from ARIS PPM Export plug ins which implement functionality that makes it possible to export save objects to a certain data format e g save EPCs Petri nets spreadsheets etc Conversion plug ins which implement conversions between different data formats e g from EPCs to Petri nets The log files that can be accessed through the ProM framework need to be stored in the standard Mining XML MXML log format This format is a generic XML format that has been developed for the ProM framework It was based on a thorough comparison of the required input of various existing ad hoc process mining tools e g EMiT MinSoN Little Thumb and the information typically contained in the event logs of complex information systems e g ERP or WFM systems 26 In appendix A this MXML log format is described in more detail At this moment what is important to know is that a log in MXML format is a list of process instances i e cases Each process instance
88. main der of this section the functionality that was found during research of these tools will be addressed 3 2 1 Graphs and tables All investigated commercial monitoring tools can monitor the values of process related KPIs often in real time and these tools support many different KPIs over one hundred in some Some process monitoring tools even allow the user to create his own KPIs 10 Chapter 8 Process Performance Analysis In most commercial process monitoring tools the values of the monitored KPIs can be viewed in tables and graphs In its simplest form a graph contains only the current value of a KPI compared against planned values In figure 3 1 for instance the average process cycle time of the business process under investigation is displayed in a speedometer In the example the average process cycle time is in the critical area i e it is above a certain bound value More detailed analysis is needed to derive the causes of such an extreme value DO Preserve re fas irens tee Cate memoro iror pete Figure 3 1 Speedometer containing the average process cycle time I9 One way to perform a more detailed analysis is by using dimensions or perspectives against which KPIs can be measured Possible dimensions include time task originator department country and sold to party but there are many more In this manner it is possible to view a KPI such as the total number of cases in the process in a gra
89. mance Analysis with Petri net plug in are discussed In section F 2 prototypes for the Performance Sequence Diagram Analysis plug in are discussed F 1 Performance Analysis with Petri net Before the values of the various KPIs can be obtained by means of the log replay method some options need to be set Therefore at start up of the plug in a window should appear in which these options can be set In figure a prototype of such a window is shown At the top of the window one of the non conformance options must be selected The non conformance options define what the plug in should do with measurements taken during replay of cases that do not fit in the Petri net Furthermore waiting time level settings can Options Non conformance options Use all cases Use only fitting cases Use measurements taken before transition fails Waiting time levels Use standard settings Use manual settings High level include all above 5 days Medium level L include above 2 days Lowie C Figure F 1 Prototype of options screen 70 Appendix F Prototypes Analysis Performance Process info ig ae ES als Total of cases 150 cases Arrival rate 0 05 cases per day Throughput time 1 02 j aj 2 8 8 318 8 8 8 A Performance info on selected object s Selected measurements 105 measurements Place d3 Arrival rate 0 035 cases per day Waiting time Wali tine T Synchroni
90. n 92 Appendix H User manuals e All measurements taken measurements taken in fitting as well as in non fitting traces are used when calculating activity metrics All measurements taken in each trace before the trace fails all measurements belonging to the activity that are taken in fitting traces are used when calculating activity metrics and next to these of each non fitting trace the measurements that are taken during replay of the trace before the moment at which an event occurs in the log that does not fit in the Petri net One can expect that these measurements taken before a failure occurs are not influenced by the non conformance of the trace in which they were taken That is why this is the standard setting All measurements taken where no transition corresponding to the activity fails all measurements taken in traces where no transition which corresponds to the activity has to be artificially enabled For example when a complete transition of an activity fails not only the sojourn time and execution time measurements are discarded but also the waiting time measurement even though the schedule and the start transition do not fail e All measurements taken where no transition corresponding to a metric of the activity fails the same as the last option though now the measurements of the metrics that are not influenced by failing transitions are taken into account In the example of the previou
91. n heus1 AGO2 Voorbereiden AGO8 GBA afnemer AG14 Control T T T requency 14 Throughput time 174049 42714 12664 029 274703 99 121819 22165 Pattern 0 Throughput ime 12463 005 12463 005 12463 005 Pattern 2 4 F L Figure H 8 Example of a pattern diagram in the plug in e The pattern number of the pattern e The mean throughput time of the pattern If the cursor points to an arrow e The name of the data element from which the arrow originates i e the data element that is the source of the arrow e The name of the data element in which the arrow ends i e the data element that is the destination of the arrow e The mean time spend between the moment at which the arrow begins and the moment at which the arrow ends If the cursor points to a block e The name of the data element to which the block belongs e The mean time spend in the block If the results still contain many patterns and you only wish to view a limited number you can click on the Filter Options button You will then see the screen below figure H 9 Here you can select the process instances that you wish to use by selecting them in the table on the left side of the screen You can also select instances by selecting one of the following options and pressing the Update button afterwards The following option is always available 101 Appendix H User manuals F Fitter Options Select e
92. n the example the user has specified the upper bound at low level to be 2 days which means that every place that has an average waiting time associated to it of 2 days or less is considered to be of low level and will be colored blue you can change this color by clicking on the Change button and choosing the color of your liking At medium level the user has set the upper bound to 5 days so every place with an average waiting time between 2 and 5 days including 5 excluding 2 is considered to be of medium level and will be colored yellow or the color 89 Appendix H User manuals F Analysis Performance Analysis with Petri net o Settings General settings Times measuredin days v Decimal number 5 Use auto bottleneck settings let program choose bound values Use manual bottleneck settings specify below Bottleneck settings Low level settings Upper bound 20 days Color iz Change Medium level settings Upper bound 5 0 days Color pe Change High level settings Color ey Change Plugin documentation Start analysis Figure H 1 Screenshot of the options screen of the plug in you selected and all places with an average waiting time above 5 days will be colored purple or the color you selected If you have no clue on which upper bound values you want to use you are advised to select the use auto bottleneck settings option instead Values for upper bounds will then
93. n the right side of the Petri net information concerning the whole process is displayed such as the arrival rate of cases and the average throughput time Places and transitions can be selected in the selection boxes on the lower right side of the window but can also be selected within the Petri net itself In the example in figure F 2 the place d3 is selected and below the Petri net performance information e g waiting time 71 Appendix F Prototypes synchronization time sojourn time of the selected place is shown When two transitions are selected instead the time cases spend in between these two transitions is shown When one transition is selected the performance information of the activity that is referred to by the event class that is related to the transition is shown F 2 Performance Sequence Diagram Analysis A prototype of the GUI of the Performance Sequence Diagram Analysis plug in is displayed in figure F 3 On the left side of this window the initial options can be set and or changed The options available here are the component type i e data element on which the plug in should focus and the type of equivalence that should be used when deriving patterns After the options have been set the user can press the Show diagram button The sequence and pattern diagram are derived then and can be viewed by selecting the corresponding tab at the top of the window In figure F 3 the sequence diagram is d
94. nce option that defines how the plug in should deal with failing transitions At start up of the plug in the Settings window as depicted in figure G 1p is created To view the non conformance settings press the Advanced settings button The window in figure appears TT Appendix G Test cases Analysis Performance Analysis with Petri net o i i Log Traces gt Process information Total number selected 2500 cases Number fitting 2500 cases Arrival rate 9 9E 4 cases per second Throughput time seconds avg 4297 61 min 800 0 max 17580 0 stdev 3499 5 fast 25 2116 42 slow 2 9709 82 norma 2682 1 g Moo o foo o i nooo no 1002 1003 n004 moos noor AO 100 Hon n02 1013 nos n05 4 Change Export Update Percentages Time Metrics Invert Selection Zoom 102 4 ill gt Waiting time Selected OB High C Medium E Low Figure G 3 Screenshot Main window 3 User sets non conformance option At Activity metrics select a non conformance option In this test case we will use the All measurements taken in fitting traces option Only the measurements for activity related KPIs that are taken during replay of cases that fit i e no transition fails during log replay are then used for calculation of the values of these KPIs Proceed by pressing the Apply changes button The previous window figure G 1h
95. ncy Diagram Analysis Pattern 0 a Joep constructionteam3 constructionteam4 constructionte T T Frequency 153 Throughputtime 75 61438 Pattern 0 l 4 31 0 265 0 64 30971 l l I l l l I l l Flexible equivalent Strict equivalent Frequency 152 Pattern 1 Show diagram __ _Throughputtime 74 67105 OO 31 0 276 0 59 10265 I Fitter options Pattern 2 Pattern 2 Frequency 152 Throughput time 64 36184 31 0 219 0 49 47065 Patten 3 Pattern 3 Zoom 86 00 I l Frequency 150 i l 1 Throughput time avg 8 min 4l Figure G 9 Screenshot pattern diagram G 2 4 View average time in period The first five steps of this use case are equal to the first five steps of the previous use case 6 Plug in shows the pattern diagram and a list of the statistical values of the throughput time of each pattern The window will look similar to the one depicted in figure 7 User selects a block i e a period in the pattern diagram Move the mouse cursor over the block that you are interested in 8 Plug in shows the average duration of the selected period In the tooltip information about the selected block will appear including the average time spend in the block i e the average durat
96. nication between services i e a web service request or respond and a block represents a period of activity of a certain service Each transaction in a sequence diagram represents a communication sequence and is drawn in a unique color Next to the basic sequence diagrams WSN also provides the user with a Flow Patterns view which classifies similar transaction flows into flow patterns i e transactions that represent similar behavior are classified in the same pattern By doing such the information that the user must digest is decreased dramatically In figure an example of a flow patterns view is shown In this view the flow patterns are displayed sorted descendingly on the frequency of their appearance in the sequence diagram WSN helps to solve problems ranging from 17 Chapter 8 Process Performance Analysis Transaction Flows Service Topology Flow Patterns z Haalen o Figure 3 6 IBM Websight Flow Patterns view business logic misunderstandings to performance bottlenecks to semantic and syntactic er rors by visualizing how applications really execute This enables businesses to understand how their applications actually behave WSN led to the idea to use sequence and pattern diagrams for process performance analysis in ProM An event log in MXML format is taken as starting point of this type of performance analysis Each event log consists of a number of process instances or cases Let us suppose ea
97. normal 54 Appendix B Sequence diagrams A sequence diagram is a Unified Modeling Language UML diagram UML is the standard notation for modeling software intensive systems 18 UML sequence diagrams are used to model the dynamic behavior of systems i e the behavior of a system over time A sequence diagram shows the sequence of messages which are exchanged between the roles that implement the behavior of a system arranged in time It visualizes the flow of control across the objects that collaborate in the context of a scenario A sequence diagram is a chart which has two dimensions The vertical dimension is the time dimension time increases from top to bottom and the horizontal dimension shows the classifier roles that represent individual objects in the collaboration Classifier roles include classes components and nodes Each classifier role is shown as a lifeline i e a vertical line that represents the role over time During the time an object exists the lifeline is depicted as a dashed line During the time an object is active the lifeline is drawn as a double line A message between two classifier roles is shown as an arrow from the lifeline of the source object to that of the destination object In figure an example of a sequence diagram is shown This sequence diagram displays a scenario of possible behavior of a bank system after the request of a customer to get the balance of his bank account Classifier roles in t
98. nput for both 1 performance analysis plug ins URCOR 3 A Petri net is required as input for performance analysis with 1 Petri net URCOR 4 Values of time related KPIs can be exported to comma separated 3 Quality requirements Number Requirement Priority URCOR 5 The source code should conform to the coding standards in use at 1 the ProM group URCOR 6 The plug ins should be accompanied by suitable user guides 1 integrated in the Plug in Help System of ProM 62 Appendix E Use cases In this appendix the use cases that were developed for the two plug ins will be described These use cases have been used to set up tests for the plug ins E 1 Performance Analysis with Petri net In figure a diagram containing the use cases that were defined for the Performance Analysis with Petri net plug in is depicted Only one very general actor is defined user This can be anyone that uses the plug in e g managers process analysts students In the remainder of this section the defined use cases will be described in detail View activity related KPIs User View time between two transitions for one case only Figure E 1 Use cases Performance Analysis with Petri net 63 Appendix E Use cases E 1 1 Find bottleneck locations Brief Description The plug in can assist in finding bottleneck locations within the process A bottleneck location in a Petri net is a place with a relatively high averag
99. nside of having many patterns in a diagram is that the user still has to search through a lot of information to determine the causes of certain behavior To diminish the load of information that the user has to consider filtering options were added to the plug in Filtering allows the user to examine only those sequences and or patterns that are relevant to him The filtering options that were introduced use process instance selection The sequence and the pattern diagram are derived from the selected process instances only The following options to select process instances were made available to the user e Select process instances by selecting their names in a table e Select all process instances that are represented by sequences that have a throughput time above below a certain bound value e Select all process instances that are represented by sequences that follow one of the patterns that have an average throughput time above below a certain bound value e Select all process instances that are represented by sequences that follow a certain pattern or one of group of patterns by selecting the pattern or a group of patterns 47 Chapter 6 Evaluation Throughout this graduation project the emphasis of our research has been on the analysis of the performance of business processes In chapter 1 of this document two research questions were defined which aim was to guide the research in the right direction During our research
100. ntents 4 3 Sequence Diagram functionality 0 2 000002 eae 4 3 1 Deriving Sequence Diagrams 0 00002 eae 4 3 2 Deriving Pattern Diagrams 002004 5 Design and Implementation 5 1 Introduction 5 2 Performance Analysis with Petrinet 200 5 3 Performance Sequence Diagram Analysis 2004 Evaluation B Sequence diagrams C Motivations D User Requirements gt D z B 0 a z m o 5 5 5 E Use cases kde eee heats oe 2 oe ee foe oS oe rA a tnd dg ed 43 eee Oh bu he bee Mh boa oe ees ee ee ere 5h okie a ne oe Siip hag eo ph deed A E 2 3 Find longest pattern E 2 4 View average time in period 020 00004 F Prototypes F 1 Performance Analysis with Petri net 200 F 2 Performance Sequence Diagram Analysis 2 0 G 1 Performance Analysis with Petri net 22004 G 1 1 Find bottleneck locations o oo ee e hep ek ton od Cad eh oo G 1 3 View time between two transitions for one case only T Ginko haa oh ied E agen npees a a ded ea a a eae Vie otnatad taeaddmeeaens G 2 3 Find longest pattern iv 32 32 36 Al 41 41 45 48 53 55 57 60 63 63 64 64 65 66 67 67 68 68 69 70 70 72 74 74 74 TT 79 80 82 82 83 84 Contents G 2 4 View average time in period H User manual
101. nts taken in each trace before the trace fails lv Activity metrics based on All measurements taken in each trace before the trace fails v Saving time Log replay can be a very time consuming method especially when the used Petri net contains invisible transitions Use the time saving option below if it costs too much Note however that some performance metrics e g place metrics are calculated based on the structure of the used Petri net and might be influenced by this option Read the help file For more detail _ Use time saving option Figure H 2 Screenshot of the advanced settings screen of the plug in tokens and then fired anyway This has its consequences for the measurements taken There are many ways to deal with this non conformance which may be appropriate in different situations It is therefore possible to set up how you want the plug in to deal with this behavior These advanced settings are related to the different metric types Process metrics can be based on e All measurements taken the process instances that fit in the Petri net as well as the ones that do not are used when calculating the values of process metrics Beware however that using non fitting traces may lead to strange results For example in case the log contains incomplete traces then the calculated average throughput time may be much lower than expected e All measurements taken in fitting traces only those process instances that fit
102. nts will be present for each execution of an activity In some event logs for instance only complete events are registered In such case the waiting time execution time sojourn time and arrival rate of the activities cannot be calculated However the Petri net defines how transitions and thus also activities are interrelated That is why an upper bound for the value of the sojourn time can be calculated in this case This bound is equal to the time between the enabling and the firing of the transition that is related to the complete event of the activity In table 4 1 is depicted which values should be calculated for an activity dependent on the event types that are registered in the log for that activity When a bound value has been calculated for a certain KPI then it should be made clear to the user that this is a bound value and not an exact value 4 2 6 Failing transitions When during log replay a transition is not and cannot be enabled when it should fire we say it fails execution When this occurs tokens are created artificially to enable the transition and log replay continues The creation timestamp of the artificially created tokens is set to the same timestamp as the firing timestamp of the transition that consumes the tokens This clearly has its effect on the measurements taken To cope with this the user should be presented with options with which s he can set and adjust the manner in which the plug in deals with these fa
103. o tei Since we have the relation 1 gt 2 we compare the department of the first event with the department of the second event These are the same so we now know that the production department is active at least from te to te This overlaps with b however so we combine these periods by setting Teng bo to te We then move on to the third event For this event the following relation can be found 2 gt 3 The third event does not refer to the production department but to the sales department The sales department is thus active for at least a moment at te Therefore we create block b where di b1 Sales Thegin Oi tes Tenalb tez Because the second and third event refer to different departments also an arrow arg is created where we set Tepe ara te Tena aro tee bsrc aro bo baest aro by If we continue with the other events in this same manner the result will be the sequence 34 Chapter 4 Requirements 1 00 2 00 3 00 4 00 5 00 7 Figure 4 2 The derived sequence as depicted in figure This sequence contains four periods blocks in which one of the departments is active and four arrows which represent a transfer of work from one department to another In figure the sequence is displayed in a sequence diagram We define Definition Performance Sequence Diagram a sequence diagram is a diagram in which a set of sequences S is displayed which r
104. o comma separated text files in a similar manner E 2 Performance Sequence Diagrams Analysis In figure a diagram containing the use cases that were defined for the Performance Sequence Diagram Analysis plug in is depicted Again only one very general actor is defined user In the remainder of this section the defined use cases will be described in detail iew throughput time sequence Find most frequent pattem ind longest User pattern iew average time in period Figure E 2 Use cases Performance Sequence Diagram Analysis E 2 1 View throughput time sequence Brief Description The throughput time of all derived sequences can be viewed Preconditions A log in which events are registered is accessable through the ProM framework Main Path 1 User starts the plug in 2 Plug in provides a list of the available data elements that appear on the audit trail level in the log and requests the user to choose one 3 User selects a data element 4 Plug in shows the sequence diagram 5 User selects the sequence s he is interested in 6 Plug in displays the throughput time of the selected sequence Postcondition The throughput time of the selected sequence is displayed 67 Appendix E Use cases E 2 2 Find most frequent pattern Brief Description The most frequent pattern is the pattern that represents the most sequences i e the pattern that represents the most common behavior Preconditions
105. ocess instance only one measurement is taken namely the time between the first firing of the one transition during log replay and the first firing of the other transition The values of metrics belonging to an activity can be viewed by selecting a transition that is related to the activity Each normal visible transition is associated with an activity 95 Appendix H User manuals Just like with the throughput time of each time metric time between activity and place time metrics the average minimum maximum standard deviation and the average time of the fastest measurements initially 25 of the total number of visits to the place i e measurements with the lowest times the average time of the slowest measurements also 25 initially ie measurements with the highest times and the average time of all other visits obviously 50 initially is calculated The time metrics are displayed in the table on the panel below the Petri net You can adjust the percentages of slow fast and normal measurements by pressing the Change Percentages button below the table and filling in the percentages that you want to use in the dialogs that will appear Next to this you also have the possibility to export the times of all measurements of a time metric to a comma separated text file csv or txt by pressing the Export Time Metrics button and filling in or selecting a filename Comma separated files can be loaded into other tools suc
106. of data elements in a sequence diagram URCAR 25 The user can set the sequence diagrams to real time or to logic 3 steps URCAR 26 The user can view the throughput time of each sequence 1 URCAR 27 The user can view the time that is spend in blocks i e in 2 periods of activity of a data element instance for each sequence 61 Appendix D User Requirements Number Requirement Priority URCAR 28 The user can view the time that is spend in arrows i e in 2 periods in which there is a transfer of work for each sequence URCAR 29 The user can view all patterns existing in a sequence diagram 1 in a separate diagram URCAR 30 The user can view the number of times that each pattern occurs 1 within a sequence diagram URCAR 31 The user can view the average minimum and maximum 1 throughput time for each pattern and the standard deviation URCAR 32 The user can view the average time that is spend in blocks i e 2 periods of activity of a data element instance for each pattern URCAR 33 The user can view the average time that is spend in arrows i e 2 periods in which there is a transfer of work for each pattern Restriction requirements URCOR Interface requirements Number Requirement Priority URCOR 1 The Process Performance Analysis functionality must be added 1 to the ProM framework in form of plug ins URCOR 2 A timed event log in MXML format is required as i
107. on Total number selected 2500 cases Number fitting 2500 cases Arrival rate 9 9E 4 cases per second J Throughput time seconds 4297 61 1800 0 17580 0 3499 5 2116 42 9709 82 2682 1 Change p Percentages Zoom 142 4 ill E anatys Mo o oo 1000 noo 1002 1003 1004 1005 1007 M o o o o oo 1010 fom 1012 1013 1014 ooo 105 Export Time Metrics eK T M gt Performance information of activity Ax Waiting time Selected E High E Medium E Low Settings Arrival rate 9 9E 4 visits per second Waiting time seconds Execution time secon Sojourn time seconds Transition Ax complete 0 0 1468 82 1468 82 avg min 0 0 max 00 stdev 0 0 fast 25 0 0 0 1200 0 ee Te 172 82 1245 5 1200 0 1740 0 172 82 1245 5 Figure G 4 Screenshot Activity selected in main window In a similar manner the values of place related KPIs can be viewed by selecting the place in the selectionbox or in the Petri net and the time between two transitions by selecting the transitions in the two selectionboxes or in the Petri net The values of these KPIs are displayed below the Petri net as well G 1 3 View time between two transitions for one case only The first two steps of this use case are the same as the first two steps of the previous use case
108. on s in the Petri net is fired and measurements are taken In this way performance information can be obtained like for instance routing probabilities at XOR splits and the time that cases represented by tokens spend at certain places within the process We distinguish four groups of performance information e Process metrics e Place metrics 86 Appendix H User manuals e Two transition metrics e Activity metrics Process metrics The values of the following process metrics i e process related KPIs are derived by this plug in e Total number selected the total number of process instances e Number fitting the number of process instances that complete properly and suc cessfully i e the number of instances that can be replayed in the Petri net without any problems For details about fitness see the Conformance Checker help file and documentation e Arrival rate the number of arrivals of cases to the process per time unit e Throughput time the statistical values average minimum maximum etc of the time cases spend in the process Place metrics The values of the following place related KPIs are derived e Frequency the number of visits of tokens to the place during replay of the process instances in the Petri net e Arrival rate the rate at which tokens arrive to the place per time unit e Place time metrics Sojourn time the statistical values of the total time tokens spend in the place
109. oose between Use auto bottleneck settings and Use manualbottleneck settings When the latter option is chosen the window will show options to define waiting time levels see figure G 1p 3 User sets bounds and colors for low medium and high waiting time level and in this manner defines when a place is considered to be a bottleneck 74 Appendix G Test cases E Analysis Performance Analysis with Petri net LEE E Analysis Performance Analysis with Petri net Settings Settings General settings General settings Times measured in seconds Times measured in seconds Decimal number Decimal number Advanced Settings Advanced Settings Use auto bottleneck settings let program choose bound values Use auto bottleneck settings let program choose bound values Use manual bottleneck settings specify below Use manual bottleneck settings specify below Bottleneck settings Low level settings Upper bound 0 0 seconds Color Change Medium level settings Upper bound 0 0 seconds Color is Change High level settings Color Change Plugin documentation Plugin documentation gt Start analysis Figure G 1 a Settings window standard b Settings window manual bottleneck settings Below the General settings select Use manual settings At Low level settings set the upper boun
110. oughput time 96763 972 Time in data element block 86155 053 Process instance 05307ESWM392257 oughput time 96736 965 Time in data element block 86001 037 Figure H 7 Example of a sequence diagram in the plug in e The name of the process instance of which the sequence is a visualization e The throughput time of the sequence If the cursor points to an arrow e The name of the data element from which the arrow originates i e the data element that is the source of the arrow e The name of the data element in which the arrow ends i e the destination of the arrow e The time spend between the moment at which the arrow begins and the moment at which the arrow ends If the cursor points to a block as in the example e The name of the data element to which the block belongs e The time spend in the block You can switch between the sequence diagram and the pattern diagram by selecting the corresponding tab at the top of the screen An example of what a pattern diagram can look like is shown below figure H 8 As you can see information about each pattern such as throughput time and frequency is shown on the right hand side of the screen You can also see information belonging to the pattern in a tooltip when you move the cursor of your mouse over the pattern The information in the tooltip encompasses 100 Appendix H User manuals F Analysis Performance Sequency Diagram Analysis ern 0 Domai
111. ould be implemented the user could do a more than excellent analysis Unfortunately however the ProM group does not have the resources available to complete such a large task Therefore decisions had to be made about which functionality should be added to the ProM framework The chosen approach was to start by implementing the functionality that is the most essential for analyzing the performance of the business processes of an organization Such functionality should help to detect possible problems within the performance of business processes and should assist in finding the causes of these problems In this chapter the selected functionality and requirements placed thereon will be described The functionality has been divided over two plug ins In section 4 2 the functionality that makes use of a process model next to an event log in the standard MXML format is described In section the other part that consists of the functionality that only makes use of an event log is described Both types of functionality require the input log to meet at least requirements i ii iii and v as described in section 2 2 Motivations for choosing to implement or dropping certain functionality are given in appendix Explicit requirements are listed in appendix D 23 Chapter 4 Requirements 4 2 Performance analysis with process model The main purpose of the performance analysis with process model functionality is to provid
112. pattern diagram will be redrawn 102
113. perties e g size of all cases in the range is bigger than 1000 of those cases that fall in the range can be displayed 3 4 3 Dotted charts Another type of functionality that is not contained in any other monitoring tool is a chart similar to a Gantt chart This dotted chart has two dimensions time and component The time should be measured along the horizontal axis of the chart Along the vertical axis in boxes components should be put The components that we will focus on are again the 21 Chapter 8 Process Performance Analysis instances of data elements that can be found at the audit trail entry level of logs The user should select a data element e g Task Originator Department and an event type e g complete schedule normal A marking i e a dot should then be drawn in the chart for each event in the log that refers to an instance of the selected data element and to the selected event type This marking should be positioned at the data element instance to which the event refers at the timestamp relative to the first event of the process instance to which the event refers Suppose for instance that the user has selected event type complete and data element task For every event that refers to the completion of a task a dot should then be drawn in the chart positioned at the task in question and at the relative timestamp to which the event refers For a certain log this may result in the dotted chart as depicted
114. ph or table measured against a dimension If for example the time dimension is used it is possible to analyze the trend of the KPI is it going up or down and to see if and when extreme values have occurred Such information can help with identifying the causes of deviations of certain KPIs Suppose for instance that there is an upward trend for the total number of cases in the process The increase in cases to handle may well be one of the causes of an increase in the average process cycle time Many monitoring tools also allow the user to measure a KPI against more than one dimension at the same time In the example in figure 3 2 for instance the KPI average waiting time is displayed in a 3D graph measured against the dimensions task and originator From such a graph various sorts of information can be derived e g the user can quickly see when ie under the influence of which values for the dimensions extreme values have occurred but also information about relations between dimensions can be derived In the example in 11 Chapter 8 Process Performance Analysis figure an extreme value for the average waiting time occurs in those process instances were originator John is involved in task Sell At the same time one can see that there are no process instances in which the task Reject order is performed by originator Clara since there is no relation between these elements visible in the graph Figure 3 2 3D graph of aver
115. process instances that fit in the Petri net as well as the ones that do not are used when calculating the time between two transitions Note that in all traces at most one measurement is taken for the time in between is taken e All measurements taken in each trace before the trace fails all measurements belonging to the two transitions that are taken in fitting traces are used when calculating time between metrics and next to these of each non fitting trace the measurements that are taken during replay of the trace before the moment at which an event occurs in the log that does fit with the Petri net This means that when a transition fails before the last of the two transitions fires during log replay the measurement is not used One can expect that these measurements taken before a failure occurs are not influenced by the non conformance of the trace in which they were taken That is why this is the standard setting e All measurements taken in each trace where both transitions do not fail all time in between measurements taken in traces where both the selected transitions fire regularly so do not fail Note that this does not exclude the measurements taken when a tran sition fails which is located in between these two transitions which can be of influence to the results of calculations e All measurements taken in fitting traces only those process instances that fit are used in calculation Activity metrics can be based o
116. processes In case large deviations occur between the planned and actual values of these KPIs process monitoring tools help in finding the causes of these deviations Reviewing the actual values of KPIs against planned values of the same indicators may then lead to corrective actions in order to increase the likelihood of achieving the goals However it may also lead to challenging and adjusting these goals PPA enhances processes by creating better feedback loops This continuous assessment helps to identify and eliminate problems before they grow Therefore PPA can be useful in risk analysis prediction of outcomes of various scenarios and making plans to overcome potential problems 29 In the remainder of this chapter first the PPA functionality present in commercial process monitoring tools will be discussed in section After that in section the academic process monitoring tool EMiT will be discussed and finally in section functionality derived from other not process monitoring tools and completely new functionality that supports PPA will be discussed 3 2 Commercial Process Monitoring Tools During our research the following commercial process monitoring tools and or their docu mentation have been examined e Aris Process Performance Manager Aris PPM e Business Objects e HP Business Process Cockpit HP BPC e Hyperion System 9 e ILOG JViews These tools are among the market leaders in the area of process monitoring In the re
117. raised with time y time y The execution time without suspended time of the execution is defined as time x time x Teusp 2 29 Chapter 4 Requirements Statistical values should be calculated for the execution time of a based on the execution times of all registered executions of a Next to these values the number of execution time measurements for activity a should also be available to the user Sojourn time of activity a The time between schedule and complete events of activity a For each activity a the sojourn time of all registered executions of a should be calculated This is done as follows Each element x L where act x a and type x schedule is compared with the nearest following element x L for which act x a and type x complete When in list L in between x and 2 no element x L is registered for which act x a and type x schedule holds then the sojourn time of the execution is defined to be time x time x Statistical values should be calculated for the sojourn time of a based on the sojourn times of all registered executions of a Next to these vales the number of sojourn time measurements for activity a should be also available to the user Arrival rate of a The number of times activity a is scheduled on average per time unit should be calculated Upper bound values Not in all event logs schedule start and complete eve
118. ransition E in the Petri net This transition is not fully enabled in this example because place p5 does not contain a token E can be enabled however by firing the invisible black transition We re firing for T fer to the set of transitions in t T e t R that are enabled or can be enabled by a sequence of invisible transitions as T One of the following three propositions holds 7 0 none of the related transitions is or can be enabled When this occurs a random transition t is chosen out of the set t T e t R The chosen transition t is said to fail execution and is artificially enabled i e a token is added to each of its input places that did not contain a token yet After this t is fired 7 1 one of the related transitions is or can be enabled If this transition t is enabled it is fired Else the transitions that are in the shortest sequence of invisible transitions that can enable t are fired first In I7 the author describes how to find this shortest sequence of invisible transitions After the transitions in the sequence have fired t is enabled and is fired 7 gt 1 more than one of the related transitions is or can be enabled The most suitable candidate t T is selected In I7 the author describes how the most suitable candidate is selected If t is enabled it is fired Else the transitions that are in the shortest sequence of invisible transitions that enabl
119. re Information Systems PAIS Typical PAIS include Workflow Management WFM systems Enterprise Resource Planning ERP systems Customer Relationship Management CRM systems Software Configuration Management SCM systems and Business to Busi ness B2B systems One of the main problems with these information systems however is that they require the explicit design of process models that represent the business processes of the organization 24 Correctly designing such process models is far from easy because it requires much knowledge of the business processes Gathering knowledge of business pro cesses can cost much time to discuss with the participants in the business process and money Moreover in practice these process models often turn out to be based more on what managers think that should be done rather than on what actually is done This is where process mining first came into play The goal of process mining is to extract Chapter 2 Process Mining information about processes i e a functional model from event logs Event logs are lists in which information about processes is stored as they are actually executed Most PAIS provide a log in which the executions of activities are recorded In 28 different methods to distil process models from event logs are described We will not discuss these methods here Instead we will give an example that illustrates how a process model can be derived from an event log Suppose
120. rklflow and Business Process Management PNCWB at the ICATPN 2005 2005 Wikipedia org Business performance management http en wikipedia org wiki Business_Performance_Management 18 8 2006 29 52 Appendix A Mining XML format This appendix describes the Mining XML MXML format as introduced in 26 to which logs that serve as input to the ProM framework have to comply In figure a a visual description of the MXML format is given The root element in this format is the WorkflowLog element which indicates that the log is a workflow log Each WorkflowLog element contains an optional Data element which can be used to store arbitrary textual data an optional Source element which can be used to store information about the source of this log the information reassign CS schedule assign a OX YS j d resume N i Ne So manualskip 1 on withdraw sJ suspend 7 2 S7 oi ma T timestamp Y Originator a Process log XML format Figure A 1 The Mining XML MXML format a and transactional model b 53 Appendix A Mining XML format system that created it and a number of Process elements at least one in which information about a specific process is stored A Process element contains a number of Process Instance elements and an optional Data element which again can be used to store arbitrary textual data One Process Instance contains one case i e one execution of the Process
121. s H 1 User manual Performance Analysis with Petri net H 2 User manual Performance Sequence Diagram Analysis 85 86 86 97 Chapter 1 Research Assignment 1 1 Assignment Background This assignment forms the graduation project of my study Computer Science at Eindhoven University of Technology This assignment was issued by the Business Process Management BPM group of the section Information Systems IS TM of the department of Technology Management at Eindhoven University of Technology 1 2 Problem Area Process Mining also known as workflow mining is an important though relatively young research area The starting point of process mining is registered information about tasks that are performed in a process e g the handling of an insurance claim the treatments of a patient in a hospital requesting a building permit This registered information is typically called an event log The goal of process mining is deducing knowledge a process model average throughput times etc from such event logs The knowledge obtained this way can increase understanding of the processes within an organization and can help with redesigning them if needed The ProM framework is a framework that was developed to support process mining This framework is used by master Technology Management students to analyze comprehensive event logs of large organizations e g hospitals government So far emphasis within this framework has been on
122. s are actually part of the plug in The classes in the upper part of the diagrams are not part of the plug in but can be accessed by the plug in through the ProM framework 41 Chapter 5 Design and Implementation interface AnalysisPlugin getName LogReplayMethod LogReplayResult ps lr a fs getinputitems analyeod A TA jimplements ProM framework PerformanceAnalysisPlugin PerformanceLogReplayMethod eS E 1 PerformanceAnalysisGU ae PerformanceLogReplayResult getResulis Performance Analysis with Petri net Figure 5 1 Class diagram Performance Analysis with Petri net plug in All analysis plug ins of the ProM framework need to implement the AnalysisPlugin in terface As can be seen in the PerformanceAnalysisPlugin class implements this in terface for the Performance Analysis with Petri net plug in The interface requires four methods one specifying the name of the plug in one which returns the user documen tation in HTML one which specifies the required input items and finally the analyse method which is called as soon as the plug in is invoked When the analyse method of the PerformanceAnalysisPlugin is called first the PerformanceAnalysisGUI i e the GUI of the plug in is created Through this GUI the user can set certain options before starting the actual analysis When the analysis is started the logReplay method of the PerformanceLogReplayMethod is called The PerformanceLogReplayMet
123. s of the waiting time levels After the settings have been adjusted the visualization is updated e In the visualization that results from following the main path also probabilities at XOR splits are shown E 1 2 View activity related KPIs Brief Description The values of the activity related KPIs of each activity can be viewed The activity related KPIs are defined in section 4 2 5 64 Appendix E Use cases Preconditions A Petri net a log and a relation between event classes in the log and transitions in the Petri net can be accessed through the ProM framework Main Path 1 User starts the plug in 2 Plug in requests the user to set the non conformance option that defines how the plug in should deal with failing transitions 3 User sets the non conformance option 4 Plug in provides a way to select activities 5 User selects the activity s he is interested in 6 Plug in displays the activity related KPIs of the selected activity Postcondition Activity related KPIs of the selected activity are displayed Alternative Paths e Next to the values of activity related KPIs also the values of process related KPIs defined in 4 2 2 place related KPIs defined in 4 2 3 and the time between two transitions defined in 4 2 4 can be viewed The values of process related KPIs are provided after step 3 in the main path at the same time as the plug in provides a way to select activities Values of pla
124. s option the waiting time measurement is used e All measurements taken in fitting traces only those process instances that fit are used in calculation Time saving option When the Petri net that is used as input to this plug in is large and contains many invisible transitions the replaying of the log may take much time due to the method used This method builds a coverability graph every time that a transition is not immediately enabled i e not all input places contain a token This is done to discover if there is a sequence of invisible transitions that does enable the transition When there are many invisible transitions in the Petri net or the Petri net is very complex then this may take much time If time is not on your side you can select the use time saving option This will result in no coverability graphs being build during log replay Note however that this may have its affect on the values of the varying performance metrics since process instances that would normally conform to the Petri net can now become non fitting 5 Starting the analysis When all options have been set you can press the Start Analysis button the program will then start the performance analysis using the specified settings As you can see in figure 93 Appendix H User manuals E Analysis Performance Analysis with Petri net Log Traces 4 4 Process information Raisin_cleaniag3 Total number selected 500 cases
125. s the derived Sequences which consist of SequenceBlocks and SequenceArrows The PatternDiagram contains Patterns which represent a number of Sequences Patterns consist of PatternBlocks and PatternArrows Based on the design the plug in was implemented and added to the ProM framework after thorough testing In figure 5 6 the GUI of the plug in is shown In the depicted example the pattern diagram is shown For more details on how to use the plug in see the user manual as presented in appendix H 2 During the tests the use cases that were defined in E 2 were performed with the plug in A description of these test cases can be found in appendix G 2 The test cases were performed with various logs The plug in performed well during the tests and results were obtained 46 Chapter 5 Design and Implementation F Analysis Performance Sequency Diagram Analysis requency 14 Soil Throughput time 74049 42714 Doman heus1 AGO2 Voorbereiden AGO8 GBA afnemer A014 Control f T T T Pattern 0 12664 029 274703 99 121819 22165 Throughput time Throughput time Patter 1 _ 12463 005 12463 005 a 12463 005 00 Figure 5 6 Screenshot of Performance Sequence Analysis plug in Pattern diagram view within reasonable time One of the things that we learned from the tests was that some of the pattern diagrams that were derived contained many different patterns A dow
126. s to a comma separated text file csv or txt by pressing the Export Time Metrics button and filling in or selecting a filename Such files can be loaded into Microsoft Excel or StatGraphics for further analysis The values of metrics belonging to places can be viewed by selecting the place that you want to study more closely You can select it by selecting its name in the upper selection box on the lower right side of sae screen or by clicking on the place within the Petri net In the example below figure the user has selected the possible bottleneck place with name p2 F Analysis Performance Analysis with Petri net a Process information Total number selected 500 cases Number fitting 336 cases Arrival rate 0 06 cases per second Throughput time seconds avg 17392 32 min 13000 0 max 43740 0 stdev 10747 43 4 i Performance information of the selected place Frequency 322 visits Arrival rate 0 05 visits per second Waiting time seconuts Synchronization time sec Sojourn time seconds and avg 19777 45 19777 45 E z min 10 0 10 a To 0 46080 0 46080 0 Selected Place p2 Figure H 4 Screenshot of the Performance Analysis with Petri net plug in where a place is selected In a similar manner two transitions can be selected after which the statistical values of the absolute time tokens spend in between these two transitions is displayed Note that per pr
127. sitions To be able to get results from the plug in within reasonable time in all cases an option was added which 43 Chapter 5 Design and Implementation aa iE Process information Raisin_cleaning Total number selected 0 49 I e Y n 500 cases start Tg o Number fitting 6 Analysis Performance Analysis with Petri net ls 336 cases Knead O Arrival rate complete 0 06 cases per second Throughput time sec avg 17392 32 min 3000 0 437 40 0 stdev 10747 43 Zoom 124 J l Frequency 322 visits Arrival rate 0 05 visits per second Waiting time Selected Waiting time seconds Synchronization ti Sojourn time seco GS High Place p2 avg 19777 45 0 0 19777 45 E Medium min 10 0 E Low and max 46080 0 stdev 13417 13 Settings fast 25 00 81 3004 44 3004 44 aimus ba au FE_ANW 04197000 TA 270M TA Figure 5 3 Screenshot of the Performance Analysis with Petri net plug in where a place is selected if enabled makes sure that the log replay method does not build coverability graphs Instead all the transitions to fire that are not immediately enabled fail execution This clearly has its effect on the obtained results though because cases that actually fit in the process model can become non fitting Especially the values of the KPIs that are calculated based on the structure of the Petri net such as place relat
128. sk B and C are executed in parallel so it can happen that at one time task B begins execution first and the other time task C begins execution first It is a valid option then to place these sequences in the same pattern Therefore two different ways of comparing sequences should be distinguished one using strict equivalence and one using flexible equivalence The user should have the possibility to select which of both equivalence types should be used Definition Sequence Equivalence Two sequences s and s are equivalent i e follow the same pattern if and only if the following requirements are met ls T o i The sequences contain the same data element instances Ts ii Equivalent blocks are present in the sequences This is dependent on the type of equiv alence strict or flexible used though For both the equivalence types the data element instance of each block in the one sequence must be equal to the data element instance of the block in the other sequence that has the same index The following function evaluates to true if this requirement is met 37 Chapter 4 Requirements a Bso H B Voces di b di sim 6 sq Strict equivalence furthermore requires the blocks in both sequences to begin and end in exactly the same order The following function evaluates to true if this requirement is met b Voo b1 Bsg Lbegin 00 Thegin b1 Thegin stm bo 80 Thegin sim b1 80 A Lena
129. sses absence of anomalies like deadlocks and livelocks For the practical usability of the framework however functionality which supports analysis of the performance of processes quantitative analysis is also required The goal of this master s project is to extend the ProM framework with plug ins that support process performance analysis In this thesis the research that was done during the project will be discussed and the plug ins that have been implemented will be described Preface This master s thesis concludes my study Computer Science at Eindhoven University of Tech nology It describes my graduation project which was issued by and carried out at the Business Process Management group of the Department of Technology Management at Eind hoven University of Technology During my graduation project there have been many people who have supported and assisted me I would like to take this opportunity to thank them First of all I would like to thank my daily supervisor Ton Weijters for giving me the opportunity to work on a very interesting project Also all other researchers of the BPM group that helped me or presented me with their views and thoughts deserve my gratitude Furthermore I would like to thank my other supervisors Marc Voorhoeve and Paul de Bra for taking place in my assessment commission Finally I would like to thank my parents brother sister and friends for supporting me throughout this project and o
130. t ARs C Bs x Bs a set of arrows that connect the blocks and represent the transfer of work between data element instances Each arrow ar ARs has associated with it a begin timestamp Tyegin ar an end timestamp Tena ar a source block bsrc ar from which it originates and a destination block bges ar in which it ends Let us consider an example that illustrates how a sequence can be derived from a process instance by means of an instance ordering Suppose we have the process instance o and the 33 Chapter 4 Requirements TaskID Department Timestamp Instance Ordering A Production 1 00 1 gt 2 B Production 2 00 2 gt 3 C Sales 3 00 2 gt o 4 D Purchase 3 00 3 gt 05 E Sales 4 00 4 gt 56 F Purchase 4 00 5 gt o7 G Sales 5 00 6 gt 07 H Production 5 00 T gt o 8 Table 4 2 Process instance o and its instance ordering instance ordering of this process instance as displayed in table Suppose also that we want to study how work is transferred between departments We go through the events in the process instance in chronological order starting with the first event event 1 This event does not have a closest causal predecessor However from this event we can deduce that the production department is active at least for a moment at te where te is the timestamp associated with the first event in this case timestamp 1 00 Therefore a block bo is created and we set di bo Production Thegin bo te Tena b
131. t time of each pattern and the frequency of sequences that follow this pattern are shown Specific information of the arrows average duration of arrow and the periods blocks average duration of block of a pattern can be summoned by moving the mouse cursor over the object of interest 73 Appendix G Test cases In this appendix test cases will be described In these test cases we will verify whether the use cases which were defined in appendix E can be performed with the implemented plug ins G 1 Performance Analysis with Petri net All use cases of the Performance Analysis with Petri net plug in have the same precondition A Petri net a log and a relation between event classes in the log and transitions in the Petri net can be accessed through the ProM framework We assume the precondition is met at the start of each test case In the remainder of this section we will address the use cases of appendix E 1 one by one to verify whether they can be performed with the Performance Analysis with Petri net plug in G 1 1 Find bottleneck locations 1 User starts the plug in Start the plug in by first selecting Analysis from the menu bar of the ProM framework and then selecting the Performance Analysis with Petri net item from the menu that appears 2 Plug in asks the user to set bounds and colors for the low medium and high waiting time level of places The window as depicted in figure G 1p appears In this window you can ch
132. that both transitions fire at least once during replay of the case Alternative Paths e At step 7 instead of one process instance the user can also select multiple process instances All KPIs are based only on the selected process instance s e Activity related KPIs process related KPIs and place related KPIs can also be based on one or multiple of the registered cases in the log E 1 4 Export throughput times Brief Description The throughput times of all cases can be exported to a comma separated text file Preconditions A Petri net a log and a relation between event classes in the log and transitions in the Petri net can be accessed through the ProM framework Main Path 1 User starts the plug in 2 Plug in requests the user to set the non conformance option that defines how the plug in should deal with failing transitions User sets the non conformance option free Plug in shows process related KPIs including the statistical values for the throughput time User requests to export the throughput times Plug in asks for a filename The user provides filename COP EA Oe ON Plug in creates a comma separated file with the specified name which contains the throughput times of all cases Postcondition A comma separated file exists which contains the throughput times of all cases Alternative Paths 66 Appendix E Use cases e The values of all time related KPIs can be exported t
133. the XOR split The second research question is Question 2 Since we have a process model at our disposal within the ProM framework which additional process mining functionality would qualify for implementation into the ProM framework For the answering of the first question existing tools and functionality in the area of process performance analysis should be examined as well as earlier master theses within the process mining research area For answering the second question discussion with other researchers within the BPM group is eminent After answering the questions 1 and 2 decisions are to be made based on practical arguments such as usability implementation effort etc as to which functionality is to be added to the ProM framework using Java The implementation of the chosen functionality is part of the graduation assignment User friendliness and ease of use are of great importance here because the functionality is to be used by Technology Management students to analyze practical situations 1 4 Outline The remainder of this thesis is structured as follows In chapter 2 the concept of process mining and the research in this area will be discussed Thereafter in chapter 3 the concept of process performance analysis will be addressed Also the functionality that exists in the process performance analysis area and possible additions to this functionality will be discussed in chapter Chapter 4 describes the requirements of
134. the discovery and validation of process models workflows Herein the ProM framework is practically unique 1 3 Assignment definition The analysis functionality contained in the ProM framework at this moment mainly consists of qualitative analysis Qualitative analysis focuses on the logical correctness of processes Chapter 1 Research Assignment absence of anomalies like deadlocks and livelocks In practical situations a quantitative analysis is needed as well to determine how parts of business processes are actually per forming For obtaining relatively superficial process characteristics average throughput time execution time etc there are commercial tools available commonly known as process moni toring tools Extending the ProM framework with such functionality is a first step to enhance the practical usability of ProM The first research question therefore is Question 1 Which process mining functionality is available in commercially developed process monitoring tools and which parts of this functionality would be enhancements to the ProM framework The ProM framework can be used to derive an explicit process model from an event log It seems possible to combine this process model with the information that is stored in the event log For instance when the process model contains a XOR split it could be examined whether there are case properties with which it is possible to predict the branch that is chosen at
135. the functionality that was chosen for implementation and in chapter 5 the design and implementation of the plug ins that contain this functionality will be discussed Finally in chapter 6 conclusions will be presented and the results evaluated Chapter 2 Process Mining 2 1 Introduction Since the emergence of disciplines as Business Process Management BPM and Business Process Reengineering BPR at the end of the 80 s organizations have been focusing more and more on their business processes in order to improve results and reduce costs A business process can be defined as a related group of tasks that together create value for a internal or external customer 6 Examples of business processes are the processing of orders in a factory or the handling of insurance claims at an insurance company Before BPR and BPM hardly any attention was given to these business processes Because organizations often were and are divided into functional areas e g accounting production sales the main focus was on improving and optimizing these individual functions The relationships between such functions did not get as much attention With the emergence of BPR and BPM however organizations started to realize that to strengthen their position and to improve their performance they would have to improve their cross functional business processes 8 To improve and support business processes many organizations have now installed Process Awa
136. timestamp instead of a duration attached to it Tokens thus spend time at places in between transitions instead of at the transitions themselves In EMiT time related KPIs are therefore associated with places instead of with transitions Of every place in the Petri net EMiT calculates the Sojourn time The sojourn time of place p is the average minimum maximum and variance in time tokens spend in p between the moment they are created and the moment they are consumed EMiT further divides the sojourn time in Synchronization time Time that passes from the partial enabling of a transition i e at least one input place marked until full enabling i e all input places are marked In other words the time that a token spends in a place between the moment of its creation and the moment at which the transition that is going to consume the token is fully enabled i e has a token in each input place Synchronization times are only greater than 0 when the transition to fire is an AND join 15 Chapter 8 Process Performance Analysis ei EMIT relations inspector Figure 3 5 Screenshot of the EMiT tool Waiting time Time that passes from the enabling of a transition until its firing i e the time between the moment at which the transition that is going to consume the token is enabled and the moment at which this transition actually fires and consumes the token To clarify this let us look at an example Suppose we
137. trict equivalent Show diagram Filter options Zoom 100 0 Figure G 7 Screenshot Main window Performance Sequence Diagram Analysis we will use the Originator data element which is present in most logs After selecting the data element press the Show Diagrams button 4 Plug in shows the sequence diagram On the right side of the window the derived sequence diagram appears See for instance figure 5 User selects the sequence s he is interested in Move the mouse cursor over the sequence in which you are interested 6 Plug in displays the throughput time of the selected sequence As can be seen in figure in a tooltip information about the sequence will appear This information includes the throughput time of the sequence G 2 2 Find most frequent pattern The first two steps of this use case are equal to the first two steps of the previous use case 3 User selects a data element At Component type select the data element that your are interested in 4 Plug in asks which equivalence type to use flexible equivalence or strict equivalence On the left side of the window as depicted in figure G 7 a Pattern type can be selected 83 Appendix G Test cases 6 Analysis Performance Sequency Diagram Analysis 7 Full diagram t Pattern diagram Options Component type Originator Time sort hours Pattern type Flexible equivalent
138. ulted from the design effort is shown The classes in the lower part of the diagram are actually part of the Performance Sequence Diagram Analysis plug in The classes in the upper part of the diagram are not part of the plug in but can be accessed by the plug in through the ProM framework The PerformanceSequenceAnalysisPlugin class is the implementation of the AnalysisPlugin interface for this analysis plug in When the analyse method of this plug in is invoked the GUI of the plug in PerformanceSequenceAnalysisGUL is built Through this GUI initial options data element equivalence type time unit can be set When the deriveDiagrams method is called a sequence and a pattern diagram are derived from the input event log using the options and the InstanceMiner The InstanceMiner is a plug in of the ProM framework which can derive log relations from a log These log relations can 45 Chapter 5 Design and Implementation AniahPiugin analyse getHTMLDescription O getLogRelations ProM framework compareToPattems getPattern getVisualization getValues Throughput getArrow getBlock getBegin Time getEndTime V getBeginBlock getEndBlock DataEltInstance Performance Sequence Diagram Analysis Figure 5 5 Class diagram Performance Sequence Diagram plug in be used to obtain instance orderings which are needed to be able to derive sequences The resulting SequenceDiagram contain
139. urement and performance management International Journal of Production Economics 41 23 35 1995 50 Bibliography 12 13 14 15 16 17 18 19 20 21 22 25 T Murata Petri nets Properties analysis and applications Proceedings of the IEEE 77 4 541 580 1989 Processmining org Process mining website http www processmining org 5 12 2006 A Ramesh Process mining in peoplesoft Master s thesis 2006 F J Reh Management You can t manage what you don t measure http management about com cs generalmanagement a keyperfindic htm 22 8 2006 W Reizig and G Rosenberg Lectures on petri nets 1 basic models Lecture notes in Computer Science 1491 1998 A Rozinat and W M P van der Aalst Conformance testing Measuring the alignment between event logs and process models BETA Working Paper Series WP 144 2005 J Rumbaugh I Jacobson and G Booch The Unified Modeling Language Reference Manual Addison Wesley 1999 IDS Scheer Aris process performance manager measure analyze and optimize your business process performance whitepaper IDS Scheer 2002 W F M van der Aalst and K M van Hee Workflow Management Models Methods and Systems The MIT Press Cambridge 2002 W M P van der Aalst A K Alves de Medeiros and A J M M Weijters Genetic process mining 26th International Conference on Applications and Theory of Petri N
140. values of the throughput time of each pattern Postcondition The pattern with the highest average throughput time is the longest pattern This can be more than one pattern 68 Appendix E Use cases E 2 4 View average time in period Brief Description View the average time length of a period of activity of a data element instance in a pattern Preconditions A log in which events are registered is accessable through the ProM framework Main Path 1 User starts the plug in 2 Plug in provides a list of the available data elements that appear on the audit trail level in the log and requests the user to choose one User selects a data element Plug in asks which equivalence type to use flexible equivalence or strict equivalence User selects an equivalence type Plug in shows the pattern diagram User selects a block i e a period in the pattern diagram ON PSR eae Plug in shows the average duration of the selected period Postcondition The average duration of the selected period of activity is shown Alternative Paths e At step 7 instead of selecting a block an arrow could have been selected after which the source data element instance the destination data element instance and the average time between beginning and end of the arrow is shown 69 Appendix F Prototypes In this appendix prototypes for the Performance Analysis plug ins are described In section prototypes for the Perfor
141. x For a block b B we refer to its peer i e block with the same index in another sequence s as sim b so In a similar manner the arrows should be sorted First on the name of the data element instance in which the source block is active then on the name of the destination then on the begin timestamp and finally on the end timestamp However sometimes it can occur that arrows have the same values for all these attributes If this is the case a random order is chosen for these arrows This can be done because the arrows are equal anyway After the sorting each arrow in AR has a unique index which should be used to compare the arrows with the same index of different sequences For an arrow ar AR we refer to its peer i e arrow with the same index in sequence so as sim ar S 36 Chapter 4 Requirements Figure 4 3 Example of sequences When the orderings of the blocks and the arrows are known it is possible to compare se quences There are still different ways to do this however Suppose we have the sequences as depicted in figure 4 3 If we compare these two sequences in a strict manner these sequences are different because in the left sequence block bo begins and ends before block co while in the right sequence block co begins and ends before block b However suppose that what these sequences represent is the transfer of work between tasks It is easy to imagine that after task A ta
142. xible equivalent but not strict equivalent Using sequence equivalence all sequences derived from the input log should be grouped in patterns i e groups of sequences that follow similar behavior This should be done by examining the sequences one by one and comparing each sequence with the patterns that were found thus far If the sequence is sequence equivalent with the sequences that follow a certain pattern then the sequence should be added to the list of sequences that follow that pattern If the sequence is not sequence equivalent with any of the sequences in the existing patterns a new pattern should be created of which the sequence becomes the initial sequence Each pattern thus consists of at least one initial sequence and possibly of a group of sequences that follow the same behavior as this first sequence Furthermore each sequence is part of exactly one pattern Definition Pattern A pattern p consists of a set of sequences Sp that are sequence equiv alent A pattern just like a sequence represents the transfer of work between a set of 38 Chapter 4 Requirements data element instances I i Is So E Sp and consists of a set of blocks Bp and a set of arrows AR where By a set of blocks or periods in which the data element instances in J are active Bp contains the same blocks as each of the sequences in Sp For each block b Bp sim b s refers to the block in sequence s Sp that h
143. ynchronization times of all tokens that are added to place p during log replay Waiting time of place p The waiting time of a token is the time between the moment at which the transition that consumes the token is fully enabled and the moment at which the token is actually consumed Statistical values should be calculated for the waiting time of place p based on the waiting times of all tokens that are added to place p during log replay Arrival rate of tokens to place p The average arrival rate of tokens to place p should be calculated based on the creation times of the tokens that are added to p The arrival rate can then be calculated in a similar manner as the arrival rate of cases Frequency of tokens in place p The number of tokens that are added to place p during log replay should be available to the user Probabilities at XOR splits A special KPI should be depicted at places with multiple outgoing arcs more commonly known as XOR splits Each outgoing arc of a place p ends in a transition t We define the probability for each outgoing arc p t of XOR split p as tA weL trans x t where A is the set of all arcs in the Petri net weL trans z e t p t A 4 2 4 Time between two transitions Another KPI that should be calculated is the time between two transitions where the user should have the possibility to select the two transitions Given replay output L and case c the definition of the absolute tim
144. zation ine Sopu ime EE High a E Medium None C Low POEPEN ow Metrics stdev 2 12345 days 0 42143 TA Change Settings Sh er f a a aa j Maa j j a M aM NMI OO Ot amp wl pm case_ case_ case_ case_ case_ case_ case_ case_ case_1 case_ case_ case_ case_ case_ case_ case_ case_ _ as eo Figure F 2 Prototype of main GUI be set in this window The user can choose between using standard waiting time level settings or to manually specify these settings When the latter option is chosen the user can set the bound values and the colors of each waiting time level In figure F 2 a prototype is shown of the GUI that may appear after log replay is completed In this GUI the input Petri net is displayed but with the places colored according to their average waiting time level and probabilities shown at XOR splits Below the Petri net a legend is depicted which describes which color belongs to which level The settings for these waiting time levels can be changed by clicking on the Change Settings button below the legend after which a window similar to the initial options window pops up in which the current settings can be adjusted Left of the Petri net the identifier names of the process instances that are contained in the input log are shown in a table These process instances can be selected or deselected The displayed performance information should be based only on the selected process instances O
Download Pdf Manuals
Related Search
Related Contents
Manual de Instrucciones testo 816 Medidor de ruido Operators Manual / Guide d'Utilisation Kompernass Bubbling Footbath KH 5557 Operating Instructions Stop video preview MANUEL D`UTILISATION Copyright © All rights reserved.
Failed to retrieve file