Home

Interuniversity Master in Statistics and Operations Research

image

Contents

1. FFI1 Direct connection to data sources A 2 2 4 50 00 T FFI2 BigData sources 2 1 2 50 00 T FFI3 Apache Hadoop A 2 1 2 50 00 1 FFI4 Microsoft Access 2 2 4 50 00 1 FFIS Excel files A 3 3 9 75 00 T From an excel file import all sheets FFI6 at the same time A 2 6 75 00 1 FFI7 Cross tabs A 2 8 100 00 FFIS Plain text 3 75 00 yt Connecting to different data sources FFI9 at the same time 3 2 6 75 00 1 Easy integration of many data FFI10 sources A 3 2 6 75 0096 1 FFI11 Visualizing data before the loading 3 2 6 75 00 1 FFI12 Determining data format A 3 2 6 75 0096 1 FFI13 Determining data type A 3 2 6 75 0096 1 Allowing column filtering before the FFI14 loading A 3 2 6 75 0096 1 Allowing row filtering before the FFI15 loading 3 2 6 75 00 ii FFI16 Automatic measures creation A 3 2 9 75 00 FFI17 Allow renaming datasets 3 2 6 75 00 1 FFI18 Allow renaming fields A 3 3 g 75 00 1 FFI19 Data cleansing A 3 2 6 75 00 1 FFD1 Data model is done automatically A 0 2 0 0 00 0 The done data model is the correct FFD2 one 1 2 25 0096 0 FFD3 Data model can be visualized A 3 3 9 75 0096 1 FFF1 Alerting about circular references 4 3 il 100 00 1 FFF2 Skiping with circular references 4 3 12 100 00 1 A same table can be used several FFF3 times A 3 2 6 75 00 1 Creating new measures based on FFA1 previous measures A 3 3 9 75 0096 1 Creating new measures based on FFA2 dimension
2. FFI1 Direct connection to data sources A 3 al 6 75 00 1 FFI2 BigData sources 3 1 8 75 00 1 FFI3 Apache Hadoop A 3 1 3 75 00 1 FFI4 Microsoft Access A 3 2 6 75 00 1 FFIS Excel files 3 B B 75 00 1 From an excel file import all sheets FFI6 at the same time 3 B 6 75 0096 1 FFI7 Cross tabs 2 2 4 50 0096 FFI8 Plain text 75 00 1 Connecting to different data sources FFI9 at the same time 3 2 6 75 00 1 Easy integration of many data FFI10 sources A 3 2 6 75 00 1 FFI11 Visualizing data before the loading A 3 2 6 75 00 1 FFI12 Determining data format 13 2 6 75 00 1 FFI13 Determining data type 3 2 6 75 00 1 Allowing column filtering before the FFI14 loading A 3 2 6 75 00 1 Allowing row filtering before the FFI15 loading A 0 2 0 0 00 O FFI16 Automatic measures creation 3 2 9 75 00 1 FFI17 Allow renaming datasets 3 2 6 75 00 1 FFI18 Allow renaming fields A 3 3 9 75 00 1 FFI19 Data cleansing 0 2 0 0 00 O FFD1 Data model is done automatically 4 2 8 100 00 1 The done data model is the correct FFD2 one 3 2 6 75 00 1 FFD3 Data model can be visualized A 3 3 9 75 00 1 FFF1 Alerting about circular references A 3 3 B 75 00 1 FFF2 Skiping with circular references A O 3 0 0 00 O A same table can be used several FFF3 times A 3 2 6 75 00 1 Creating new measures based on FFA1 previous measures A 3 3 9 75 00 1 Creating new meas
3. SAP Lumira Desktop Standard Edtion used in the project is a 30 day free trial This edition is the personal edition of SAP Lumira Server It can be connected to the same type of databases than the server one and it can also open projects done by other users SAP Lumira Desktop cannot be connected to a server with other users and it has not security devices Tableau Desktop used in this project is a 14 day free trial This edition is the personal edition of Tableau Server It can be connected to the same type of databases than the server one and it can also open projects done by other users Although it cannot be connected to a Server with other users and it has not security devices Once time the user has the evaluation sheets and he has already used the corresponding applications with the database 20141220 Initial test he has to score the metrics Scoring the metrics is the key step in order to get results about each of the applications in each of the three categories Functionality Usability and Efficiency Remember that the results could not be considered as concluding because more user opinions should be considered If there are more evaluators the four 4 sheets the four 4 applications and the database 20141220 Initial test must be offered to each of them 54 It must be consider that the database used in the evaluation is in a excel file and therefore we don t have the experience to connect tools to databases Therefore the
4. There exist an icon to refresh data If there is any change in the data source user can refreshes data by it because it is not done automatically FFA S User can create a time hierarchy Year level Quarter level Month level and Day level from a date field just in one click Also geographical and personalized hierarchies can be built in order to be ploted in maps FFA7 FFAS User can filter data in a visualization filtering by any dimension FFA selecting data points in a chart to filter or exclude them and displaying only the top or the bottom ranking data points for dimension or measure But for example there is no the possibility to filter by a specific requirement of a measure by a formula or for a specific value of a dimension FFA10 Other tools offer much better filtering capabilities User can interact with a chart filtering data by the legend or selecting data points But a filter done in a chart does not interact with other charts It means the interactions is in a chart and not between charts FFA 2 And moreover there is not exist the possibility to create data sets FFA9 Null data are considered by default as a one more value of the field Null values does not intervene in calculated measures User can choose if the null value is represented in a visualization or not FFA 14 There is not any problem with null data When data is loaded from a file null data must be represented as an empty space FFA13 Moreo
5. 52 6 Evaluation Results cere ada 53 6 1 Results uei reiecit e Free tote pet erp ei RN 55 7 EonclusionSsesscecott tete etate de t 61 8 Bibliography ict eto ette tee ts n tee ees td ci oe eee tese dub ads 62 O A nere te re Edu OH OH 64 10 Tables index oct uite tt at bt teta 65 Annex 1 Scripts for 20141220 Initial test database ccccncncccocncncnnconononnnnnnnnnnonananononcnncnnnnnnns 66 Annex 2 Questionnalires o bct aeu elita te n ip ier Ede exe d tenda es 76 Annex 3 QlickView evaluation esee esee enne enne 85 Annex 4 SAP Lumira evaluation sessi nennen nennen tenente nnns nnne nnne 94 Annex 5 MicroStrategy Analytics evaluation cococccconncncnonnnonaonnnnnnononanonannnnnnnncnnnnnnnonnnncnnannnns 101 Annex 6 Tableau evaluation tert ettet a age annue 108 Annex 7 Reporting examples 113 2 Introduction This study belongs to the business sector In particular the Business Intelligence sector where I participated doing this Master s degree thesis Business Intelligence BI is the name associated to the set of tools and techniques for the transformation of raw data into meaningful and useful information for business analysis purposes BI technologies are capable of handling large amounts of unstructured data to help identify develop and otherwise create new strategic business opportunities And the main goal of
6. A n gt gt 3 p 6 Hm 3 2 6 1 3 2 6 1 0 2 0 0 3 3 9 T 3 3 9 m 3 3 9 Hm Average learning time Consistency between icons in the toolbars and their actions 75 0096 Displaying right click menus 75 0096 Ease of understanding the terminology EASE OF User guide quality 3 2 6 75 00 UNDERSTANDING User guide adquisition 3 2 6 75 00 AND SAIN On line documentation 75 00 100 00 Availability of tailor made training courses 2 6 75 00 1 2 2 25 00 0 2 6 75 00 1 2 6 75 00 1 2 8 100 00 1 Community 2 6 75 00 1 2 6 m 2 6 di 2 6 1 2 4 1 Phone technical support On line support Availability of consulting services Free formation gt gt gt gt GRAPHICAL UGW1 Editing elements by double clicking 75 0096 INTERFACE UGW2 Dragging and dropping elements 75 0096 100 00 CHARACTERISTIC UGD1 Editing the screen layout 75 00 OPERABILITY UOVi Automatic update 2 50 00 50 0096 3 3L 3 3 4 3 3 3 3 gt gt N gt EEC1 Compilation Speed CPU processor type 100 00 EXECUTION PERFORMANCE Minimum RAM 100 00 N DIN pje Hard disk space required 75 00 100 00 Additional software requirements 100 00 100 00 84 Annex 3 QlickView evaluation This chapter explains how does QlikView meet or not the metrics evaluated in this project
7. In a client table John Smith Male 30 11 1975 would represent one client entity e Every table has a set of attributes that taken together as a key uniquely identifies each entity e g in a client table ClientID would uniquely identify each client no two clients would have the same clientID Certain fields are designated as keys which means that searches for specific values of that field will uniquely identify each entity There are many types of keys however quite possibly the two most important are the primary key and the foreign key The primary key is what uniquely identifies each entity within a table The foreign key is a primary key of one table that is also present into another table Where fields in two different tables take values from the same set a join operation can be performed to select related records in the two tables by matching values in those fields Usually but not always the fields will have the same name in both tables Ultimately the use of foreign keys is the heart of the relational database model This linkage that the foreign key provides is what allows to link data together In the relational data model there are two important rules that help to ensure data integrity They are e Every tuple is unique This means that for every record in a table there is something that uniquely identifies it from any other tuple the primary key e Table names in the database must be unique and attribute names i
8. In the web of SAP Lumira there is no information about tailor made training courses in the web SAP Lumira offers a phone technical support but its schedule is not showed But user can also ask questions in the web to an expertise This option is available at full time UES UES2 UES3 SAP Lumira also offers in its web consulting solutions classified in Sales Management with Sales Team Performance and Sales Quota amp Comission Marketing with Campaign Analysis and Segment Analysis Financial Plannig with Financial Statement Analysis and Corporate Project Monitoring and finally Human Resources with SuccessFAcots Compensation Analysis UES4 It offers free webinars and free video and interactive tutorials in the web of SAP Lumira as free formation Morover user can register to events by free UES5 And there is also a coumminity UES6 Graphical interface User cannot edit elements by double clicking UGW but dragging and dropping method can be used UGW2 Screen layout can be modified by the user UGD Operability SAP Lumira can check automatically dayly weekly monthly for updates connecting by itself to SAP Public Portal or SAP Pulic Support UOV 99 Execution performance Its compliation speed can be considered fast Server Edition requires 3 7 GB of disk space EER3 and 4 GB of RAM EER2 And it can only be installed in x64 bit processor EER 1 100 Annex 5 MicroStrategy Analytics evaluation Data l
9. Metrics are grouped in sub characteristics and metric s codes appear in the text when they are mentioned Data loading User can extract data from files table files data files and web files FFI8 or can connect to databases by ODBC Open Database Conectivity and OLEDB Object Linking and Embedding Database Examples of connections are Oracle Microsoft Access FFI4 or Microsoft SQL Server Moreover it can also be connected to BigData sources like Teradata FFI2 QlikView has not integrated connectors ODBC or OLEDB to database FFI but QV Source which is a Qlik s partner offers a variety of API connectors for QlikView These API connectors allow the connection to different social and business APIs without requiring any technical knowledge Some examples are Twitter Facebook Google Analytics Google Docs Calendar Mashape Additionally QVSource also offers developing connectors for other sources that may not have an ODBC driver such as NoSQL type databases as MongoDB or Hadoop FFT3 The loading data is done by the Editor Script where user has to write a specific code in SQL like language There also exist the option to click on tabs and the code is written on the Editor Script by the machine and finally user just executes the code in order to load data Additionally user can load data easily from spreadsheets e g Excel files by an assistance Wizard Assistance FFI5 Unfortunately if user wants to import data from m
10. according to the rules established by SQMO And in order to get differences between QlikView and Tableau we did the second evaluation being more restrictive in the satisfaction limit And the results were that Tableau got an advanced quality level and QlikView was rejected according to the rules established by SQMO As a further analysis it would be adequate to run this evaluation by more users of different profiles Farmers Explorers and Tourist in order to get proper results And it would be advisable to try different satisfaction levels depending on the final purpose of the company 61 8 Bibliography Bass L Clements P amp Kazzman R 1998 Software Architecture in Practice SEI Series in Software Engineering Massachussetts Addison Wesley Longman Inc Callaos N C 1993 Intention action and design The 5th International Conversation on Comprehensive Systems Design pp 36 47 Monterey California Asilomar Callaos N amp Callaos B 1993 Intention action and design The 5th International Conversation on Comprehensive Systems Design pp 36 47 Monterey California Asilomar Callaos N amp Callaos B 1996 Design with a systematic total quality pp ISAS 96 15 23 Proceeding of the International conference on information Systems Analysis and Syntesis Callaos N amp Callaos B 1996 Designing with a systemic total quality Proceeding of the INternational Conference on Information Systems A
11. represents a Dimension tables which is connected to another Dimension tables It means that a parent dimension table offers information about other Dimension table When the model is only composed by a Fact table and several Dimension tables without parents it is called star schema Which is a simpler snow flake schema Finally there are two types of relationships between tables in relational data models e A i n relationships is typically modelled using a simple foreign key Where one column in table A references a similar column in table B typically the primary key Since the primary key uniquely identifies exactly one row this row can be referenced by many rows in table B but each row in table B can only reference one row in table A The name of this type of relationships means 1 to many e An m or n n relationship occurs when each row in table A can reference many rows in table B and each row in table B can reference many rows in table A The name of this type of relationships means many to many Once the reader has been introduced in the relational data modelling it is time to present the database created in this project 20141220 Initial test 5 2 20141220 Initial test The created database used in order to evaluate the applications is called 20141220 Initial test It is composed by nine 9 tables forming a relational database and particularly a snow flake schema In Fig 7 it is showed the relational data mode
12. sets groups and hierarchies Advanced capabilities include semantic auto discovery intelligent joins intelligent profiling hierarchy generation data lineage and data blending on varied data sources including multi structured data e Internal Platform Integration A common look and feel install query engine shared metadata promo ability across all platform components e BI Platform Administration Capabilities that enable securing and administering users scaling the platform optimizing performance and ensuring high availability and disaster recovery e Metadata Management Tools for enabling users to leverage the same systems of record semantic model and metadata They should provide a robust and centralized way for administrators to search capture store reuse and publish metadata objects such as dimensions hierarchies measures performance metrics KPIs and report layout objects e Cloud Deployment Platform as a service and analytic application as a service capabilities for building deploying and managing analytics in the cloud e Development and Integration The platform should provide a set of programmatic and visual tools and a development workbench for building reports dashboards queries and analysis Produce e Free Form Interactive Exploration Enables the exploration of data via the manipulation of chart images with the colour brightness size shape and motion of visual objects representing aspects of the dataset
13. some of this tools offers mobile applications to visualize and interact with dashboards on tablets or mobiles Finally most of these tools offers the option of Server Editions This type of editions can be implemented in corporative servers which usually are more powerful than users systems and several users can profit of this power Moreover these editions allows to define and manage security devices with password protection and user permissions Finally Server Editions ease the sharing of projects Additionally to Server Editions some tools offers the SaaS version Software as a Service SaaS business intelligence is a delivery model in which applications are typically deployed outside of a company s firewall at a hosted location and accessed by an end user which only requires a secure Internet connection and a browser That is SaaS BI lets companies use BI tools without having to install a program on a computer independently of the Operative System requirements which sometimes might suppose a problem There are different type of Self Service BI users which are explained in the following sub chapter 3 3 As every assessment must be based on the users opinion it is important to know the type of users of Self Service BI tools 2 3 Bl users A rigorous evaluation should be done by several users in order to get trustworthy results In particular Self Service BI tools as data systems usually have different user profiles and several u
14. 113 Fig 30 Example of a Network chart built by MicroStrategy 114 Fig 31 Example of a Map chart built by MicroStrategy 114 Fig 32 Example of a Map chart built by MicroStrategy 115 Fig 33 Example of a k means classification plot done by the previous connection of MicroStrategy Analytics to R serena a a iaa aa a E A AT A CET A aaa 115 Fig 34 Example of a Heat Map built by QlikView o occccconononnnnnononocanonannnnnononncnnonannnnnnnnos 116 Fig 35 Example of a Radar Map built by QlikView ccococonoooonncnonocanonannnnnononnnananancnnnnnnos 116 Fig 36 Example of a Radar Map built by QlikView esses eren 117 Fig 37 Example of a forecasting built by SAP Lumira esses 118 Fig 38 Example of a Funnel map built by SAP Lumira cccccccnnnoononnnnnnocanonnonnnonononanonannnnnnnnos 118 Fig 39 Pie charts built by Tableau sc eri theta 119 64 10 Tables index Tab 1 Characteristics for Product sub model eene 15 Tab 2 Characteristics for Process sub model esee 16 Tab 3 Quality levels for the Product Software eese enne 17 Tab 4 Quality levels for Development eee eene 18 Tab 5 Systemic quality
15. Elo 1G Sinisters table tede et detect e tete etd terre Tat ege eere ae ta 53 Fig 17 Results for Functionality category ooccccccnononoononcnnnnnonannnnnoncnnonannnononnnnnnnnnnnonenonnnnnnnnnnnnns 55 Fig 18 Functinality characteristics results for MicroStrategy Analytics 56 Fig 19 Category results metet tette 57 Fig 20 Efficiency characteristics results for SAP Lumira cccconococconnnncnonononannnnnoncnanannnnnnnnnononnnns 57 Fig 21 Execution Performance sub characteristics results for SAP 58 Fig 22 Quality levels depending on satisfied categories The particular 58 Fig 23 Category results for a second evaluation eese nnn 59 Fig 24 Characteristic results for a second evaluation cccconocconncnnononononannnnnonncanananonnnnncnnnnnnns 59 Fig 25 Heat Map chart built in MicroStrategy cccoconooconcnncnnnnonononancnnnnananonanonnnnnnnnnonononnnnncanonnnns 88 Fig 26 Block chart built in ennemi 89 Fig 27 Block Chart with background color assigned to an expression built in OlikView 90 Fig 28 Example of a Heat Map build by MicroStrategy Analyitics 113 Fig 29 Example of a Network chart built by MicroStrategy
16. FID It evaluates the capability of a tool to export data txt 2 Exportation in CSV FID2 It evaluates the capability of a tool to export data in CSV format 3 Exportation in HTML FID3 It evaluates the capability of a tool to export data in HTML format 4 Exportation in Excel file FID4 It evaluates the capability of a tool to export data in Excel files 3 Security characteristic This characteristic is composed by a unique sub characteristic which groups metrics about the security process i Security devices This sub characteristic is composed by two metrics related with the protection of data 1 Password protection FSS1 It evaluates the capability to protect projects with password 2 Permissions FSS2 It evaluates the capability to assign different permissions to different users 3 4 2 Usability category Ease of understanding and learning characteristic This characteristic includes different sub characteristics i Learning time This sub characteristic includes only one metric 1 Average learning time UEL1 This metric measures the time spent by the user in learning the functionality of the tool 32 11 111 1v Browsing facilities This sub characteristic evaluates how the user can browse inside the tool 1 Consistency between icons in the toolbars and their actions UEBI This metric measures the capability of the tool to be consistent with its icons 2 Displaying right click menus UEB2 This metric m
17. Module Anyway creating macros is not an easy process The whole Project cannot be shared by the Personal Edition but they can be shared using the Server edition Reporting There exist the option to create a static report from a dashboard adding object sheets images and text but the options to design reports are poor comparing with other tools There exist a corporative complement called NPrinting that is a report generation and it seems to have more options in the report designing But as QlikView offers options to do reports although they are 90 poor the integrated reporting is the evaluated feature FFR3 Reports can be printed or saved as XPS file FFR1 And there are not templates to do that FFR2 Languages The interface language can change between Chinese Dutch English French German Italian Japanese Korean Polish Portuguese Russian Spanish Swedish and Turkish depending on the user demands Moreover it offers the option to use a different language for the Help assistance FILI Portability OlikView Personal Edition and QlikView Server are compatible with Windows XP Windows Vista Windows Server 2003 and Windows Server 2008 for 32 bit and 64 bit There 1s not the option to install it in Mac FIP1 QlikView does not directly provide a SaaS service offering However many QlikView partners offer QlikView built applications as a SaaS offering to their own customers FIP2 QlikView on Mobile is p
18. because the special attention was focused on the evaluation of the software quality features observed on its execution But if anyone ever considers appropriately to include the sub model Process or the Efficiency dimension due to his owns interests there exists the option to do so by following the steps explained above Hence Fig 2 reflects the adapted model used in the current evaluation for BI tools PRODUCT Level 0 Dimensions Effectiveness Level 1 Level 2 Level 3 Level 4 Metrics Fig 2 Diagram of the adapted Systemic Quality Model Rincon Alvarez Perez amp Hernandez 2005 Besides the Functionality category we chose the Usability because this type of tools are focused on non technical users and the difficulty of the product must be minimum Moreover it must be an attractive product because the success of the tool depends on the user s satisfaction Finally the Efficiency category was chosen because the processor type the hard disk space and the minimum RAM required are factors that play an important role in making the deployment 19 of the tool a successful one Self Service BI tools are popular thanks to its working in memory Then it is important to evaluate the minimum amount of memory required 3 3 1 Scales of measurement In the current evaluation all the evaluated metrics are ordinal variables because they have more than two categories and they can be ordered or ranked Recalling that metri
19. being analysed e Analytic Dashboards and Content The ability to create highly interactive dashboards and content with visual exploration and embedded advanced and geospatial analytics to be consumed by others e IT Developed Reporting and Dashboards Provides the ability to create highly formatted print ready and interactive reports with or without parameters This includes the ability to publish multi object linked reports and parameters with intuitive and interactive displays e Traditional Styles of Analysis Ad hoc query enables users to ask their own questions of the data without relying on IT to create a report In particular the tools must have a reusable semantic layer to enable users to navigate available data sources predefined metrics hierarchies and so on 37 Consume Mobile Enables organizations to develop and deliver content to mobile devices in a publishing and or interactive mode Collaboration and Social Integration Enables users to share and discuss information analysis analytic content and decisions via discussion threads chat annotations and storytelling Embedded BI Capabilities for creating and modifying analytic content visualizations and applications and embedding them into a business process and or an application or portal Moreover platforms had met other non technical criteria Generating at least 20 million in total BI related software license revenue annually or at least 17 million in to
20. built by OlikView 117 Sinisters in Alava by Quarter and its forecast for 3 years All Measures Count SinisteriD Forecast Count SinisterID Count SinisterlD amp Forecast Count 0 00 v PY IET RAR ORAL DIL ATE trot WC FREE PARTE mc TIT PEARSE LATER Tyr 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 Year Quarter Sinisters in Granada by quarter and its forecast for 3 years All Measures Count SinisterlD Forecast Count SinisterlD Count SinisterlD amp Forecast Count QQQ p r ea 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 Year Quarter Fig 37 Example of a forecasting built by SAP Lumira Amount of sinisters assigned to each Guaratee Guarantees Bl health assistance 8 windows driver insurance D theft B fire BB claims 8 travelling 8 total loss 8 Fig 38 Example of a Funnel map built by SAP Lumira 118 Region 97 Guadalajara Fig 39 Pie charts built by Tableau 119 Month of Year of S W January W February M March April May B June B July Wi August M September 8 October B November B December
21. chapter 3 3 2 this issue is discussed 4 Measuring the quality level of the process The quality levels related with the satisfied categories are Basic level It is the minimum required level Categories Customer Supplier and Engineering are satisfied 17 Medium level In addition to the basic level categories satisfied categories Support and Management are satisfied Advanced level All categories are satisfied Qua eve atego ed Basic O y Ll Medium Engineering Advanced Support Management Organizational Tab 4 Quality levels for Development Process Finally there must be a join between the product quality measuring and the process quality measuring in order to obtain the systematic quality measuring The systemic quality levels are proposed in Tab 5 Product quality level Process quality level Systemic quality level Advanced Basic Medium Basic Medium Basic Medium Medium Medium Advanced Medium Medium Basic Advanced Medium Medium Advanced Medium Advanced Advanced Advanced Tab 5 Systemic quality levels This method of measurement is responsible for maintaining a balance between the sub models when they are both included in the model 3 3 Adoption of the systemic quality model SQMO SQMO was adopted as reference because it is a complete work influenced by many other models First of all it respects the concept of Total Quality Systemic from Callaos amp Callaos Designing with a s
22. connection with R is not very well documented and it is not very popular The integrated Data mining functions in MicroStrategy requires a connection to R FFA5 The R scripts of the functions are already built and user only needs connect to R to execute the code As stated above when data have type Time Date or DateTime Analytics Desktop create derived attributes with additional time related information Similar things happen when an attribute has assigned a geo role To sum up when a field is containing time or geographic information it is considered different FFA7 FFAS When user has a very large set of data in a dashboard it can be easier to work with that data grouping by in into logical subsets and viewing only one of the subsets at a time Adding an attribute to the Page by panel user can click an attribute elements to use to group data When user group data in a dashboard the grouping is applied to all visualizations on the current layout tab Each layout in a dashboard is grouped separately without affecting the contents of the other layouts in the dashboard FFA9 Moreover filter can be added in visualizations in order to display only filtered data And filters can be on dimensions and also on metrics 103 FFAIO FFA11 On the other hand when user is creating multiple visualizations in a same dashboard he can filter data selecting values in one visualization the source to automatically update the data displayed in anot
23. data 2 The done data model is the correct one FFD2 This metric evaluates the capability of applications to get relations between tables by as the user wants it In 28 our particular case of 20141220 Initial test data the model is showed in Fig 4 If user builds the data model manually getting the desired model should be easy While if the model is done automatically it can be more difficult depending on if the automatic model is the right one or if there exist the possibility to modify the model by the user Data model can be visualized FFD3 This metric evaluates if a tool lets seeing the data model during the analysis Visualizing the model during the analysis lets the user to check at all time the relations between fields 111 Field relations This sub characteristic includes several metrics related to the connections between fields when the data source is relational In order to clarify some of the proposed metrics the database 20141220 Initial test is used with examples 1 Alerting about circular references FFFI A circular reference exists when there are at least 3 tables related between them In the particular case of the 20141220 Initial test data the schema in Fig 5 synthesizes the concept For example user can desire to visualize Tab 8 it means particular policies and the regions where the policies have had an accident Policy table is related to Region table by the field Code which refers to the code identific
24. example if the column is assigned the Date data type Analytics Desktop can automatically generate separate attributes for year and month information just with a click User can also assign geo role or shape key to the data column to enable data to be displayed on a map based visualization When a data column has been assigned a geo role Analytics can also 101 automatically generate separated attributes containing higher level of geographical data If the data column contains city data that MicroStrategy Analytics recognize the tool automatically generates the State attribute which contains the state each city is located in Moreover for each dataset the measure Row Count is automatically created It gives information about how many rows the dataset has During the loading user can also choose not to import a column of data and rename columns while data is visualized In contrast user cannot filter the rows Finally user loads the datasets renaming it FFI16 Analytics Desktop allows user to combine data from different sources in order to analyze relations between them FFT9 The integration of different data sources is easy because fields are grouped in datasources and user can distinguish them FF110 Data Model MicroStrategy gives users a powerful option to model their data or not to model By default when user imports a new dataset directly into a dashboard that contains at least one dataset the new dataset is automatical
25. higher for the 32 bit version and Intel Core 2 Duo or higer for 64 bit version But it is not a requisite Fort he 32 bit versions the minimim RAM is 1GB but depending on the volum of data more memory can be required Fort he 64 bit version the reccomended is 2 GB The Hard Disk space required is 200MB for the 32 bit and 250MB for the 64 bit versi n Fort he QlikView Server Edition it is recommended to use Intel Core Duo ir higher for the 32 bit and Multi core for the 64 bit For the 32 bit IGB of RAM memory is required much memory can be required depending on the data volum while for the 64 bit the minimum of RAM memory is 4B EERI EER2 EER3 Software requirements QlikView Server requires a browser and it can be Internet Explorer 7 or Firefox 3 EES 93 Annex 4 SAP Lumira evaluation This chapter explains how does SAP Lumira meet or not the metrics evaluated in this project Metrics are grouped in sub characteristics and metric s codes appear in the text when they are mentioned Data loading With SAP Lumira user can load an Excel worksheet FFI5 and a text file csv txt log pm tsv FFIS as a dataset and can also copy data from clipboard As a part of the enormous platform SAP Lumira can obviously connect directly to SAP HANA and can also load data downloaded by SAP HANA without requiring a connection SAP Lumira can be also connected to SAP Business Objects Universe and to SAP Business Warehouse to lo
26. in Geo Business Intelligence but it is a component and it is not integrated in OlikView versions FFA7 On the other hand user can create expressions or fields based on time functions For example user can use the year function as Year sinister date when sinisterdate has the corresponding date format Some tools create some of the fields Year Month and Quarter of a date automatically based on date But it is not the case of QlikView in which user must create them by himself FFAS QlikView distinguish between nulls and empty spaces When data comes from a database and there are nulls they traspass automatically to QlikView as nulls But when data comes from files white spaces are considered missing values and not nulls Calculation are made although some operands or function parameters are null or missing values In charts missing values are considered as other values while null values are special and user can decide to show them in a chart or not User can transform missing values to null values by functions in QlikView Editor FFA 14 Then the only requisite to treat null values is that when they are coming from files they must be a white space FFA 13 In order to display data QlikView offers a variety of objects they are List box Statistic box Multibox Table box Chart Input Box Current Selections Box Button Text Objects Line Arrow objects Slider Calendar objects Bookmark object Search object Container and Custom o
27. is composed by the following fields RiskArea and RiskAreadesc They are described in Tab 12 Categorical fields Field name Description Values RiskArea Risk area 1 2 3 4 5 identification RiskAreadessc Risk area description gold silver master plus and regular Tab 12 RiskArea table description Values for RiskArea were extracted from freMPL6 dataset but they were modified in order to reduce diversity We select the values for our 26 000 clients from the risk area field of freMPL6 But only distinct values are stored in RiskArea field from the current table In fact the 26 000 registers for the risk area are stored in the field RiskArea from the Policy table And by this way RsikArea table and Policy table will be related by the common field RiskArea We can say that RiskArea table is a parent table of Policy table because extends the information about the risk area 47 At the last point a name was assigned to each risk area category by ourselves in the RiskAreadesc field Finally the resulting Risk Area table Fig 11 has 5 registers and 2 columns RiskArea RiskAreadesc 1 gold 2 silver 3 master 4 plus 5 regular Fig 11 RiskArea table 5 3 5 Guarantees table Guarantees table corresponds to a dimension table including information about guarantees offered for the insurance company and it is composed by the following fields Guarantees and Base They are described in Tab
28. latest release notes and product manuals Moreover user can also ask questions find solutions and share insights about MicroStrategy implementation UES 106 Windows and mouse interface Elements cannot be edited by double clicking UGW But elements can be moved by drag and drop UG W2 Display The screen layout can be modified by user changing the background color adding bar tools UGD1 Versatility By default Analytics automatically checks for updates downloads the update file and then notifies you to install an update when it is available User can also prevent Analytics from automatically checking for updates UOV Compilation Speed The Desktop Edition which is the used in this evaluation compiled slower than the other tools EECI Resources utilization Server Edition requires 4GB memory RAM EER2 and 1GB of Hard disk space required EER3 Analytics can be installed in processors of x86 and x64 bits EER Additionally it requires a web browser and an Adobe Reader and Flash Player Web Browser EESI 107 Annex 6 Tableau evaluation Data loading Tableau supports a wide variety of data source types It is compatible with files of type Tableau Data Extract Microsoft Access files Microsoft Excel files Text files and importing files from Tableau workbook Moreover Tableau can also be connected directly to some types of databases as Tableau Server Microsoft Access Actian VectorWi
29. must provide an attribute whose values include the names of each area in the map s base map The base map is an ESRI map that contains the shape of each area that can be displayed in the visualization The base maps available in MicroStrategy Analytics includes maps from continents countries of the world United States counties United States regions United States state abbreviations United States state names United States ZIP codes and World administrative divisions Finally MicroStrategy Analytics is not the tool with a wider variety of functions but is the tool in which graphs can be more customized by user Moreover they can be totally modify by the user FFAI6 There has not been limitation to show many data dots during the testing FFA 7 There is an option to refresh data but it is not done automatically user must order that clicking on an icon And user can choose between overwrite existing data update existing data as well as add new data and keep existing data and add new data FFA S Dashboard User can export the whole MicroStrategy file to keep the interaction in visualizations When user exports a dashboard the entire dashboard including visualizations filters and so on as well as the associated dataset are exported And then a other user can import it and works on it FFD1 Unfortunately there are not templates integrated in dashboards FFD2 And user has totally freedom to build them adding text images link
30. skills technologies and practices for continuous iterative exploration and investigation of past business performance to gain insight and drive business planning based on data and statistical methods To gain future vision of the business predictive modelling takes an important role It helps to get different scenarios depending on different possible business paths The implementation of predictive modelling can be considered the biggest difference between Business Intelligence and Business Analysis Although predictive techniques are not in the pure definition of Business Intelligence offering predictive techniques will be positively evaluated on this thesis because it is considered that those tools must also evolve with the needs and interests of the companies 2 1 Approach To carry out an assessment a series of steps must be followed First of all the responsible of preparing the assessment known as the evaluator must know the area of use Then the evaluator has to fix a methodology and adapt it to the particular area of use The adaption implies decide the interesting metrics which will be evaluated Users can advice to the evaluator about the interesting metrics and the evaluator have to design a questionnaire to enclose the interesting metrics Additionally the evaluator has to fix the area of application in order to not misuse the methodology Following evaluator has to send a questionnaire to the users in order to get opinions from e
31. the Editor Script after the loading executing only the part of the script corresponding to the creation of the new field These fields are considered as loaded fields Morover in sheet objects user can create calculated expressions and or dimensions from loaded fields but they can only be used in the respective sheet object It means they are not considered fields FFA1 FFA2 OlikView offers a variety of functions to create new fields expressions from all type of imported fields and they are classified in Aggregation Color Conditional Counter functions Date and Time Exponential and Logarithmic Financial Formatting General Numeric Inter 86 record Logical Mapping Mathematical constants and Parameter Free Functions None Null Number interpretation Range Ranking String System and Trigonometric and Hyperbolic FFA3 With these functions a descripitive analysis can be done FFA4 but unfortunately it does not offer predictive functions FFA5 Anyway QlikView can be connected to R project which is an open source programming language and software environment for statistical computing and graphics R has its own language and for this reason user who wants to use its functions must know it The integration of R in QlikView is not very popular yet and for this reason there is not much information on the net and either on the website of QlikView FFA6 QlikView is famous because of its Visual Perspective Linking When a value or
32. to be considered as different fields and for this reason they were called different Although once the database was created the names were changed in order to build relations between tables by a common field Tab 18 shows the fields which take a different name during the simulation Variable Name in Simulation Table Data base field VehBrand_N Policy VehBrand RiskArea_N Policy RiskArea Code_O Policy Code SinistersX Year 2000 2010 PolicyID_S Sinisters PolicyID Guarantees_s Sinisters Guarantees Code_S Sinisters Code CLIENT TABLE LIBRARIES THHHHHHHHHHHHHHHHHEHEHHN library xts library zoo library CASdatasets data freMPL6 HPARAMETERS N 26000 HFIELDS ClientID lt c 1 N Gender lt freMPL6SGender ClientID MariStat lt freMPL6SMariStat ClientID CSP freMPL6SSocioCateg ClientID actual lt as Date 2011 01 01 66 LicAge lt freMPL6SLicAge ClientID LicBeg lt actual LicAge 30 DrivAge freMPL6SDrivAge ClientID DrivBeg actual DrivAge 365 HDATAFRAME df Client data frame ClientlD Gender MariStat CSP LicBeg DrivBeg View df Client HEXPORTATION library foreign write table df Client C Users jorcajo Desktop Tables Client txt sep t row names F HH AUTOTABLE HH HLIBRARIES library xts library zoo library CASdatasets data freMTPL2freq HFIELDS set seed 12342 autos sample 1 dim freMTPL2freq 1 N replace FALSE VehBrand N freMTPL2freqSVehBrand aut
33. to use some of the advanced features only comes with time and experience In my opinion it is one of the tools with a fast learning UEL Browsing facilites QlikView has many icons in toolbars but they are intuitive UEB Right menu can be displayed when the cursor is on the sheet or on a sheet object Then a menu appears in order to change properties from the sheet and the sheet object respectively UEB2 Terminology QlikView has an easy terminology For example all fields are considered as fields except when user is building a sheet object For example if user is building a bar chart one of the fields takes the name of dimension and the other field is called expression The terminology is understable for everybody and it does not require depth knowledge in Data Analysis UETZ Help and documentation Userguide which is for free in the web UEH2 is very detailed for a basic learning Advanced topics must be searched in the net UEH3 for example in QlikView s community where users can ask questions and an expert user will answer them UEH Moreover there is a option in QlikView web where user can ask questions to a group of experts in the nearest sales office and it is free Support training It is important to remark that OlikView personal edition does not have any type of support offered by QlikView Servers users has the option to assist to courses delivered in person at Public classrooms or privately where an i
34. translates the queries of the BI tool to queries in the particular database language Since now some of these tools were complex and technical users were the responsible to use them Currently Self Service BI tools have been appeared which have revolutionized the BI world The main targets of Self Service BI tools is to ease the analysis of data do fast analysis and get understandable information beneficial for the business In order to get easy analysis they generally have intuitive interfaces To get fast analysis they load data in RAM Random Assigned Memory memory and the tool works on it By this way they do not need to access data in the hard disk which implies spending more time Finally they offer a huge variety of graphs and many ways to report the results attractively Data analysis are done by creating new fields filtering data visualizing data in graphics or grids When a user creates data visualizations like graphics or grids in many of the Self Service BI tools the user can interact with them easing the discovery of the characteristics of data User creates graphics in interactive sheets called dashboards there can be many dashboards in the same project The graphics can be exported as static images or user can save the whole project to interact with the graphs Most of Self Service BI tools have the option to report the results creating a PDF file with graphics text added by the user links images etc Moreover
35. useful before load them 16 Automatic measures creation FFI16 The ability of the tool to automatically create some measures possibly useful from the already loaded data 17 Allow renaming datasets FFI17 It evaluates the capability to assign a name to datasets which should be loaded in the application 18 Allow renaming fields FFI18 It evaluates the capability to rename fields It can be useful when user has not named the fields in the database by himself and prefers to rename them with more appropriate names to the analysis Renaming fields before the loading is the best choice but in some applications it can be done after the loading and it is equal evaluated 19 Data cleansing FFI19 It evaluates the capability of the applications to allow user to clean data For example drop registers with null values or substitute particular values Data model This sub characteristic includes various sub metrics to evaluate the modelling process for each tool In order to clarify some of the proposed metrics the database 20141220 Initial test is used with examples 1 Data model is done automatically FFD1 It refers to the capability of the applications to relate automatically tables Some applications relate two tables if they have fields with same name and structure therefore these applications model data automatically Sinisters Guarantees IRiskAreaXGuarantees Fig 4 The correct data model for 20141220 Initial test
36. wants Others tools require to load the table as many time as relations it will have iv Analysis This sub characteristic includes several metrics about the capabilities of the analysis 1 10 11 12 Creating new measures based on previous measures FFA1 All the applications analyzed in that project must be able to create a measure based on already loaded measures This sub characteristic evaluates how easy is to build new measures based on loaded measures The creation of new measure based on dimensions FFA2 This sub characteristic evaluates how easy is to build new measures based on loaded dimensions Variety of functions FFA3 It measures the diversity of functions offered by the application to build a new field Applications can offer functions related with statistics economics mathematics and also with strings and logic functions Descriptive statistics FFA4 It refers to the possibility to analyze data statistically from a descriptive point of view All the applications analyzed in that project are able to do descriptive statistics Therefore this metric evaluates the complexity of the descriptive statistic allowed in each program Predictive Statistics FFA5 It measures the ability of getting indicators by predictive functions It is not a common feature in Self Service BI tools and because of that the presence of few predictive methods will be positively evaluated R Connection FFA6 It evaluates the capability
37. 0 dImenslorns oo ed a aR 13 3 1 2 Level 1 categories d eder date emiten taies ud 14 Sue owel2zcharacterislibsS ee oett be e eidem qe 14 344 Level 3 MeIII6GS oo AEn exit uto ee voip o RS 16 32 Algorithiri ii ie ES ripa I 16 3 2 1 Produci SOIWADE estela de odd aquo eine 16 Q2 Development POCOS Sii 17 3 3 Adoption of the systemic quality model SQMO 18 3 3 1 Scales of MOS init ida d pg iet dione des 20 3 3 2 JTheconcept or satisfaciendo 24 3 4 Sub characteristics and metrics for Self Service Bl tools evaluation 26 3 4 1 Functionality CACA AAA ERES 27 3 4 2 Usability category tt a 32 3 4 3 Efficiency CA do O 34 Software selection for the 35 4 1 AlgorithM goto ctv 35 4 2 The 4 evaluated E TT 40 Dr mM 41 5 1 Relatiorial data model en er 41 5 2 20141220 Initial test t ret teet t eee a Guanes leas 42 5 3 nt re A dese 44 5 3 1 AE A tea ee 45 532 Tabl e ctn potro ut toe tet Rente tete uae edite 46 A cH c cc T NM T 46 5 3 4 RiskArea table rtt m ett ees 47 5 35 Guarantees Table noter ere esta pe AS 48 5 3 6 RiskAreaXGuarantees table bi 49 5 3 7 E 50 538 eta dn edid eig ehe y qe rp Diego apes 51 5 3 9 59IDISIel Stable oret e ttam bete lia tabe ee
38. 00 1 1 fi lt as Date 2010 12 31 dates lt as Date inici fi origin 1970 1 1 The vector dates is modified in order to delete all the 29th February to prevent errors dates_sinisters lt dates for i in 1 length dates_sinisters if day dates sinisters i 2229 amp amp month dates_sinisters i 02 dates sinisters i dates sinisters i 1 HPROBABILITIES OF SINISTER every client has a probability of 0 2 to have a sinister as minimum p lt rep 0 2 N Some characteristics make this probability increases for iin 1 N if DrivAge i lt 24 p i lt p i 0 1 if LicAge i lt 12 p i lt p i 0 2 if DrivAge i gt 50 amp amp DrivAge i lt 65 amp amp Gender i Male p i lt p i 0 2 if DrivAge i gt 40 amp amp DrivAge i lt 45 amp amp Gender i Female p i lt p i 0 2 SINISTERDATE FIELD AND SINISTERXYEAR MATRIX Hit spend 20 minutes 71 Sinisterdate lt as Date 1970 1 1 SinistersXYear lt matrix data 0 nrow N ncol 11 set seed 12342 for i in 1 N nsinisters lt rep 0 11 for j in 1 11 prob p i s NULL for k in 1 3 HA maximum of 3 accidents per year s k sample c 0 1 1 replacez TRUE prob c 1 prob prob if s k 0 prob prob 0 1 HAfter a happening a sinister the probability to have a sinister decreases if sum s zO Sinisterdate 2 sample dates sinisters sum s replace FALSE year Sinisterdate_2 lt 2000 j 1 Sinis
39. 1 i Languages This sub characteristic is composed by a metric which evaluates the variety of languages displayable in the tool 1 Languages displayed FILI It evaluates the variety of displayed languages offered by the tool In particular it evaluates if the tool can be displayed in more than two languages or not 2 Portability This sub characteristic is composed by three metrics which evaluates the ability of a tool to be executed in different environments 3 Operating systems FIP1 This metric measures the variety of different operating systems compatible with the tool In particular it evaluates if the tool can work at least in two different operating systems 4 SaaS Web FIP2 The acronym SaaS means Software as a Service This metric evaluates if a tool offers access to projects via web browser for hosting their own deployments in the cloud 5 Mobile FIP3 It evaluates the possibility to have reports and dashboards available in the mobile device via a mobile app ii Use project by third parts This sub characteristic is composed by a unique metric and it measures the capability of sharing and modifying projects by other people 1 Using the project by third parts FIUI It evaluates the capability to share projects and modify them by other users iii Data exchange This sub characteristic is composed by metrics which evaluate the data exportation when they have already been manipulated in the tool 1 Exportation in txt
40. 100 00 12 100 00 100 00 gt r gt 3 2 3 2 3 2 3 2 3 3 3 3 3 3 MP UEL1 Average learning time Consistency between icons in the toolbars and their actions 9 75 00 Displaying right click menus 9 75 00 100 00 Ease of understanding the terminology 100 00 100 00 EASE OF User guide quality 2 6 75 0096 UNDERSTANDING User guide adquisition 2 75 0096 OS On line documentation 2 6 75 00 100 00 Availability of tailor made training courses 3 6 75 00 1 3 6 75 00 1 3 6 75 00 1 3 6 75 00 1 3 6 75 00 1 Community 3 6 75 00 1 0 0 0 3 6 1 3 6 1 2 4 al Phone technical support On line support Availability of consulting services Free formation gt gt gt gt gt l GRAPHICAL UGW1 Editing elements by double clicking 0 00 INTERFACE UGW2 Dragging and dropping elements 75 00 CHARACTERISTIC UGD1 Editing the screen layout 75 00 OPERABILITY UOV1 Automatic update 50 0096 100 00 100 00 gt 2 2 2 2 2 2 2 2 2 2 N gt EEC1 Compilation Speed CPU processor type 100 00 EXECUTION PERFORMANCE Minimum RAM 75 00 N DIN Dip gt Hard disk space required 100 00 100 00 Additional software requirements 100 00 100 00 80 SAP Lumira FIT TO PURPOSE
41. 13 Numeric fields Field name Description Min The cost which is 25 responsible the company Categorical fields Field name Description Values Guarantees Guarantees windows travelling driver insurance claims fire theft total loss health assistance Tab 13 Guarantees table description Both fields were created by ourselves Finally the resulting Guarantees table Fig 12 has 8 registers and 2 columns Guarantees Base windows 50 travelling 500 driver insurance 100 claims 25 fire 1000 theft 2000 total loss 3000 health assistance 300 Fig 12 Guarantees table 48 5 3 6 RiskAreaXGuarantees table RiskAreaXGuarantees table corresponds to a dimension table which provides the relation between RiskArea table and Guarantees table It shows which guarantees are offered by each risk area And it is composed by the following fields RiskArea and Guarantees They are described in Tab 14 Categorical fields Field name Description Values RiskArea Risk area 1 2 3 45 identification Guarantees Guarantees windows travelling driver insurance claims fire theft total loss health assistance Tab 14 RiskXGuarantees table description We created the RiskArea field repeating the value of each risk area as many times as guarantees it offers And Guarantees is the field with the respective guarantees offered by eac
42. 2 Y L x Q LLI E x Y lt gt Interuniversity Master in Statistics and Operations Research Title Business Intelligence s Self Service tools evaluation Author Jordina Orcajo Advisor Pau Fonseca Department Statistics and Operative Research University UPC UB Academic year 2015 BARCELONATEC Facultat de Matem tiques i Estad stica UNIVERSITAT DE BARCELONA Facultat de Matematiques i Estad stica Universitat Polit cnica de Catalunya Master s degree thesis Business Intelligence s Self Service tools evaluation Jordina Orcajo Hern ndez Director Pau Fonseca Department of Statistics and Operational Research 1 Abstract This project proposes a comparison analysis between four different tools called Self Service tools from the Business Intelligence area The comparison was done adapting a Systemic Quality Model already formalized and using a database simulated with R In order to assess the quality of this tipe of software seven 7 characteristics and eighty two 82 metrics were considered Index 1 2 Dirige EE 5 Introduction i c oco ettet tere ehe t e ettet tt ete a Aa 8 2 1 APPO a aaa 9 2 2 Introduction TO Blsystems its 9 2 3 BUSES o dente tette derer Dedit deseen A Et iut ore Sah etu 11 Methodology ete ete is 13 3 1 The systemic quality model SQMO cconcoccccnnocnccnononanonanonnncnonnncnnonnonnnononnncnnnnnnnnnnns 13 3 1 1 Level
43. 6 75 00 1 FFI5 Excel files 3 3 9 75 00 1 From an excel file import all sheets FFI6 at the same time A 0 2 0 00 FFI7 Cross tabs A 25 00 FFI8 Plain text 75 00 Connecting to different data sources FFI9 at the same time A 3 2 6 75 0096 1 Easy integration of many data FFI10 sources 3 2 6 75 00 1 11 Visualizing data before the loading 3 a 6 75 00 1 FFI12 Determining data format 3 2 6 75 00 1 FFI13 Determining data type A 3 2 6 75 00 1 Allowing column filtering before the FFI14 loading 3 2 6 75 00 a Allowing row filtering before the FFI15 loading A O 2 0 0 00 0 FFI16 Automatic measures creation A 3 3 9 75 0096 T FFI17 Allow renaming datasets 3 2 6 75 00 FFI18 Allow renaming fields 3 3 9 75 0096 1 FFI19 Data cleansing A 2 2 4 50 00 1 FFD1 Data model is done automatically 4 2 8 100 00 1 The done data model is the correct FFD2 one 3 2 6 75 00 Hm FFD3 Data model can be visualized 1 3 3 25 0096 0 FFF1 Alerting about circular references 0 3 0 0 00 0 FFF2 Skiping with circular references A 0 3 0 0 00 0 A same table can be used several FFF3 times 0 2 0 0 00 0 Creating new measures based on FFA1 previous measures A 3 Bi 75 00 1 Creating new measures based on FFA2 dimensions A 2 3 6 50 0096 a FFA3 Variety of functions 4 3 12 100 00 1 FFA4 Descriptive statistics A B 2 6 75 00 al FFA5 Preduction functions A E 2 4 50 0096 d FFA6 R connection 3 2 6 75 0096 1 FFA7 G
44. Area_N i 5 RiskArea_N i 11 RiskArea N i 3 if RiskArea_N i 9 RiskArea_N i 6 RiskArea N i 4 if RiskArea_N i 10 RiskArea_N i 7 RiskArea N i 5 RiskArea lt levels factor RiskArea_N RISKAREADESC RiskAreadesc lt c gold silver master DATAFRAME df_RiskArea lt data frame RiskArea RiskAreadesc View df RiskArea HEXPORTATION library foreign write table df RiskArea C Users jorcajo Desktop Tables RiskArea txt sep t row names F plus regular HH RISKAREAXGUARANTEES HFIELDS RiskArea guarantees c rep 1 8 rep 2 7 rep 3 6 rep 4 5 rep 5 3 g1 c windows travelling driver insurance claims fire theft total loss health assistance g2 c windows travelling driver insurance claims fire theft health assistance g3 c windows driver insurance claims fire theft health assistance 69 g4 lt c windows driver insurance fire theft health assistance g5 c windows driver insurance health assistance Guarantees c g1 g2 g3 g4 g5 HDATAFRAME df_RiskAreaXGuarantees lt data frame RiskArea_guarantees Guarantees View df RiskAreaXGuarantees HEXPORTATION library foreign write table df RiskAreaXGuarantees C Users jorcajo Desktop Tables RiskAreaXGuarantees txt sep Xt row names F HH POLICY TABLE HFIELDS PolicyID c 1 N prob Population sum
45. BI is to allow the easy interpretation of these large volumes of data In particular the Self Service BI aims to boost that the company is able to get useful information from their own data The idea behind deploying self service software is to empower business people to analyze and understand data without specialized expertise There are many benefits that can be derived through the implementation of a self service BI system Functional workers can make faster better decisions because they no longer have to wait during long reporting backlogs At the same time technical teams will be freed from the burden of satisfying end user report requests so they can focus their efforts on more strategic IT initiatives There are many Self Service BI tools in the market and before recommending a particular one a depth analysis of the available tools on the market must be done according to own requirements And because of this the aim of this thesis is to build a comparative assessment of Self Service BI tools adapting a Systemic Quality Model SQMO and apply this methodology in the evaluation of four 4 particular tools In order to accomplish this first of all we had to learn how to use Self Service BI tools in order to know its operation what they can do and understand how useful they are for the BI sector Knowing with a minimum level of depth tools in order to evaluate them demands spending much time in addition to technical and functional kn
46. Code This field corresponds to the code of the region where the policy is registered On the other hand Sinisters table is also connected to Region table by the field Code But this time it corresponds to the code of the region where the accident had happened Both relations have different meaning but they are related to the same table We added these circular references in order to know how the Self Service applications managed them Skipping circular references can be done easily by duplicating tables In fact we have loaded two tables identically equal to Region table one is related to Policy table and the other one to Sinisters table But this action implies the use of more memory and it is not a recommended The fact that we decided to simulate a car s insurance company database is due to it is a common case of use in consultancy like INDRA Moreover we were lucky to know an actuarial expert who offered us some information about the car s insurance area RiskArea RiskAreaXGuarantees nm Fig 7 Data model for 20141220 Initial test 43 The main point of the thesis was not to do an accurate analysis of data For this reason the simulation was just a way of getting data and they cannot be considered as real data because the process to get them is just a roughly approximation We obtained some data from two existing datasets of R In order to not have to invent all data although some fields were invented by ourselv
47. Fig 15 SinistersXYear table 5 3 9 Sinisters table This table corresponds to a fact table Error No se encuentra el origen de la referencia including information about the accidents and it is composed by the following fields PolicyID RiskArea Guarantees Sinisterdate and Code They are described in Numeric fields Field name Description Min Max PolicyID Policy identification 1 26 000 Date fields Field name Description Min Max Sinisterdate Policy starts date 2000 01 01 2010 12 31 Categoric fields Field name Description Values RiskArea Risk Area included 172 34 5 in the policy Guarantees Guarantees windows travelling driver insurance claims fire theft total loss health assistance Region identification 1 2 3 52 different values Tab 17 Sinisters table description This table is based on SinistersXYears table because SinistersXYears table fixes the amount of sinister for each policy Data for Sinisterdate are random dates with the year fixed for the SinistersXYears table If a policy does not have any accident along the 11 years it also appears in the table but with 9999 01 01 as Sinisterdate It is not usual to add a policy without any accident in that type of tables but it is useful to analyse how the applications manage null values To create Code field random numbers referring to the code of the region from 1 to 52 were ass
48. Population Hprobabilities of each region depending on its population set seed 12342 Code O sample c 1 52 replacez TRUE prob prob set seed 12342 inici as Date 2000 1 1 fi as Date 2010 12 31 dates lt as Date inici fi origin 1970 1 1 RecordBeg sample dates N replace T RecordEnd lt rep 0 N RecordEnd as Date RecordEnd origin 1970 1 1 Supposing that the 70 of the clients keep Htheir policy during the following 10 years for i in 1 N x lt sample c 0 1 1 replaceZ TRUE 0 7 0 3 0 means that there is no an end date for the policy if x 0 RecordEnd i lt 9999 01 01 else Supposing that policies keep one year as minimum in the company inici lt RecordBeg i 365 dates lt as Date inici fi origin 1970 1 1 RecordEnd i lt sample dates 1 VehAge lt freMTPL2freqSVehAge autos actual lt as Date 2011 01 01 VehBeg lt actual VehAge 365 BonusMalus freMPLeSBonusMalus ClientID 70 HDATAFRAME df Policy data frame PolicyID ClientID RecordBeg RecordEnd VehBeg VehBrand BonusMalus RiskArea N Code O View df Policy HRiskArea N is created in RiskArea script HEXPORTATION library foreign write table df Policy C Users jorcajo Desktop Tables Policy txt sep t row names F HH SINISTERS TABLE AND SINISTERXYEAR TABLE HLIBRARIES library lubridate lib loc C Program Files R R 3 1 1 library inici lt as Date 20
49. QlikView application without re running loading script These dynamic updates must be written as macros in the Edit Module There is no any bottom to do that automatically If there is not many data and the update is punctual user can reload the script from the Editor Script In order to get updated charts But if there are many data and the updates are usual is recommended to write a macro and the update will be automatic FFA18 Finally QlikView Desktop has not present any problem displaying big amount of data with the 20141220 Initial test database 17 Dashboards QlikView offers plain freedom to design dashboards In QlikView dashboards are composed by sheets OlikView offers sheet objects of type List box Statistic box Multibox Table box Chart Input Box Current Selections Box Button Text Objects Line Arrow objects Slider Calendar objects Bookmark object Searchobject Container and Custom object User can also change the background color of the sheet FFD3 There are many examples in the net and user can take them as reference but there are not templates included in QlikView Althought user can save settings for particular charts in order to use its settings more times FFD2 User can save sheets individually in PDF file and print them Moreover images can also be copied to clipboard or saved On the other hand user can export sheets directly to email word file and Excel files using the macros in the Editor
50. ad datasets Finally it also can run freehand SQL on a database to load datasets Some examples are Microsoft Access FFI4 Apache Hive FFI3 Cloudera IBM DB2 IBM Netezza Microsoft SQL Server Oracle Sybase Mongodb Teradata Then SAP Lumira can be connected to many different types of Big Data sources FFI2 In order to connect a database to SAP Lumira the requirement is to be provided by a JDBC connector to access to data from foreign applications because the connectors are not integrated in the application FF As the way to download data is by SQL code user can create a query with its requisites for example loading only columns that he wants or adding filters it means user can clean data before loading them When user is loading data from Excel files there is the option to append all sheets and then all worksheets in the workbook are added to the dataset appending the common columns and adding as new columns the different ones FFI6 With excel files user can also choose which columns and which rows he wants to import There is also the option to load cross tables just selecting the option FFI7 The connection to many data sources at same time is possible and easy just repeating the same process to acquire for each different database FFI9 The integration works well because a graph must come from a unique dataset Then datasets are always separated and user just have to select in which dataset he wants to work Moreover use
51. aims of the organization was built The particular area of application is Business Intelligence and there are many platforms specialized in this area in the market Therefore we focus on which have been mentioned in the report Magic Quadrant for Business Intelligence and Analytics Platforms February 2015 from Gartner Gartner is an information technology research and advisory company which presents every year different market research reports on IT products Magic Quadrant is an annual report that reflects the innovations and changes that are driving the BI market and shows the relative position of each competitor in the business analytics space They consider all tools in the market and if these tools met the inclusion criteria they are included in the evaluation In this first step we used Gartner as a data source of all Business Intelligence and Analytics platforms in the market Each year it edits an updates reports and also their inclusion criteria changes depending on how the market changes so it is a reference company to have knowledge of BI tools By this way all the tools mentioned in the Magic Quadrant report of February 2015 although Gartner finally have not evaluated them composed our long list of 63 different platforms which is the following 35 1 Adaptive Insights 2 Advizor Solutions 3 AFS Technologies 4 Alteryx 5 Antivia 6 Arcplan 7 Automated Insgihts 8 BeyondCore 9 Birst 10 Bitam 11 Board Internatio
52. allam R L Hostmann B Schlegel K Tapadinhas J Parenteau J amp Oestreich T W 2015 Magic Quadrant for Business Intelligence and Analytics Platforms wikipedia n d Provincias y ciudades aut nomas de Espa a Retrieved from http es wikipedia org wiki Anexo Provincias y ciudades aut96C396B3nomas de Esp a C3 B1a 63 9 Figures index Fig 1 Diagram of the systemic quality model SQMO Callaos amp Callaos 1996 13 Fig 2 Diagram of the adapted Systemic Quality Model Rincon Alvarez Perez amp Hernandez 2005 AAA eve AA E tera Beca 19 Fig 3 Characteristic schema for each category according to Mendoza P rez amp Grim n 2005 PA Ex RE 26 Fig 4 The correct data model for 20141220 Initial test data 28 Fig 5 Circular Reference oc oet RE RR RE e OR Ed EE 29 Fig 6 Schema for the selection process ener nnne enne nn 40 Fig 7 Data model for 20141220 Initial test nennen 43 Eig 9 Ch nt table tr Re t HEP ated otra eae pee bee aires 45 Big 9 Auto tables ideada 46 Eig 10 Region table oiii reet ter AE E EATE EH du banana 47 Eis T1 RiskArea table ener ree ep e ete e ER RI ct data 48 Fis 12 Guarantees m epe e at e E ER aa 48 Fig 13 RiskAreaXGuarariees table neret npe e then erii tnn ihi cdta 49 Fig TA Policy tableu so 51 Fig 15 SinistersXYear table nro tete 52
53. an create a case with a particular question and with a couple of days an expert will answer it Moreover there is a community that can also solve problems Additionally Tableau also offers consulting services UES4 There are also a community where users can share their doubts UES Windows and mouse interface Elements can be edited by double clicking Moreover Tableau Works with dragging and dropping elements UGW2 Display User can set the screen layout UGD 111 Versatility The upgrades are not automatic and user must go to the web to upgrade the version UOV Compilation Speed During the test it is not the tool which is faster but it runs acceptably EEC Resources Utilization Tableau can be installed in processors x86 and also in x64 EER The minimum required RAM are 2GB EER2 and the Hard disk space required is 750MB EER3 Software requirements Non extra softwares are required excepto f a browser an adobe reader and a Flash Player EESTI 112 Annex 7 Reporting examples In this chapter there are some charts created by the 4 evaluated Self Service BI applications These charts are showed in order to introduce their functionality to the reader The current Heat Map shows the amount of sinisters and the population by region Block s color is based on the amount of sinisi sed on the regio Granada differs from other regions with the same population in the amount of sinisters It a
54. and he has choose a board or a report page he can select a dimension and in the board there will appear a input control with all the values of the selected dimension and user can filter by them FFR5 Languages SAP Lumira is displayed in several languages Deutsch English Spanish French Hungarian Polish Portuguese Japanese and simplified Chinese FFLI Portability SAP Lumira only works in Windows 7 8 with 64 or 32 bit Windows Server 2008 with 64 bit and Windows 2012 operative systems FIP1 SAP Lumira Cloud is the cloud computing SaaS of SAP and it is free But SAP Lumira Cloud has some limitations for example predictive charts are not supported FIP2 SAP Lumira Cloud can be also accessed from any mobile device supporting HTMLS but there is not any particular mobile application for SAP Lumira There exist the free mobile application SAP Business Object Explorer connected to the SAP Business Object Explorer that can visualize documents built in firstly in SAP Lumira and then published in SAP Business Object Explorer 2 Data exchange and Using the project by third parts User can share data charts and the full project Datasets can be exported to csv or Microsoft Excel file and published to SAP HANA FID1 FID4 Data cannot be exported in txt FID2 files or HTML files FID4 Charts can be shared via email and printing or saving it in PDF but they can t be saved in Excel files Moreover user can also publish proj
55. are computed using the weights assigned to the metrics The final score of a sub characteristic corresponds to the following formula n _ Wj SCOY E sub characteristici Y w jd Where vj is the value for the score assigned to metric j while is the weight for the corresponding metric And n corresponds to the number of metrics in the sub characteristic This adaption of the methodology is applied when the importance level of the metrics is not the same in all metrics By this way we got a score for each sub characteristic considering the weights of metrics Tab 6 shows the weights assigned to each metric in that case of use Recall that metrics are defined in the next sub chapter 3 4 METRIC WEIGHT 21 i 2 2 Determining data format Determining data type Alerting about circular references Creating sets of data Filtering data by expression Filtering data by dimension Visual Perspective Linking Data model can be visualized 3 3 Time hierarchy 3 3 3 2 3 2 3 2 2 2 3 3 2 3 2 2 2 2 2 2 2 No Null data specifications Considering nulls 3 22 23 Tab 6 Weights of metrics 2 2 2 2 2 2 2 2 2 2 3 3 2 The concept of satisfaction As it is said in sub chapters 3 2 1 and 3 2 2 the term satisfaction can vary depending on the case of use In fact the evaluator can assign a limit for example 50 and sentence that a feature is satisfied 1f its
56. arning time 50 00 50 00 Consistency between icons in the UEB1 toolbars and their actions A 75 00 1 0 00 Displaying right click menus Ease of understanding the terminology 9 75 0096 1 75 00 EASE OF UEH1 User guide quality 2 4 2 8 100 00 UNDERSTANDING UEH2 User guide adquisition 3 2 6 75 00 1 IS On line documentation 3 al 6 75 00 100 00 Availability of tailor made training 100 00 UES1 courses A 3 2 6 75 0096 dt UES2 Phone technical support 2 a 4 50 00 1 UES3 On line support 2 2 4 50 00 1 UES4 Availability of consulting services 3 2 6 75 00 1 UES5 Free formation 4 2 8 100 00 1 UES6 Community 3 2 6 75 00 a 100 00 INTERFACE UGW2 Dragging and dropping elements 3 2 6 75 00 H 50 00 AUS UGD1 Editing the screen layout A 3 2 6 75 00 75 00 100 00 100 00 N gt EEC1 Compilation Speed 25 00 0 25 00 EER1 CPU processor type 100 00 EXECUTION PERFORMANCE EER2 Minimum RAM 6 75 00 T MP 100 00 Hard disk space required 50 00 66 67 IN Additional software requirements 4 2 8 100 00 1 100 00 78 QlikView FIT TO PURPOSE FFI1 Direct connection to d
57. as Baleares Jaen Leon Lerida Lugo Madrid Malaga Murcia Navarra Orense Palencia Palmas Pontevedra La Rioja Salamanca Segovia Sevilla Soria Tarragona Teruel Tenerife Toledo Valencia Valladolid Vizacaia Zamora Zaragoza Melilla Ceuta Population c 319227 402318 1934127 702819 1081487 172704 693921 5529099 375657 415446 1243519 593121 604344 530175 805857 1147124 219138 756810 924550 256461 709607 521968 228361 1113114 670600 529799 442308 351350 6489680 1625827 1470069 642051 333257 171668 1096980 963511 322955 352986 164169 1928962 95223 811401 144607 1029789 707242 2578719 534874 1155772 191612 973325 78476 82376 HDATAFRAME df_Region lt data frame Code Region Population View df Region HEXPORTATION library foreign write table df Region C Users jorcajo Desktop Tables Region txt sep t row names F 68 HH RISKAREA TABLE HFIELDS RiskArea_N lt freMPL6SRiskArea ClientID Clusters RiskArea in 5 group depending on their frequency gt sort table freMPL6SRiskArea 113 12 23 4 8 511 9 6 10 7 13 25 57 345 471 895 1236 1535 3022 3828 3864 4620 6089 THHHHHHHHHHHHHHHHHHHHH HH H HERE for i in 1 N f RiskArea_N i 1 RiskArea_N i 13 RiskArea_N i 12 RiskArea N i 1 if RiskArea N i 22 RiskArea_N i 3 RiskArea_N i 4 RiskArea N i 2 if RiskArea_N i 8 Risk
58. ata analysis Because of its popularity among clients it was evaluated SAP Lumira was chosen because of the huge number of SAP implementations in management systems SAP had been incorporated recently in the Data discovery with SAP Lumira but its success in management systems and its huge number of implementations maked it a natural competitor to consider As many clients had implemented SAP in their management system they surely would opt to implement SAP Lumira thinking in a better integration with their system and an easier architecture because of a unique provider 40 5 Data In order to use and evaluate the applications we needed a set of data and we decided to simulate it The data set was simulated using the tool R and it was constructed replying an car insurance company database and using a relational structure 5 1 Relational data model A relational database is based on the relational model developed by E F Codd In such models data are organized into tables related one to each other by at least one common field The main important properties of relational data model are that e Data are presented as a collection of relations between tables e Each relation is defined by one or more column field in common between tables e Columns are attributes that belong to the entity modelled by the table ex In a client table you could have name gender birthday etc e Fach row also called tuple represents a single entity ex
59. ata sources A 2 2 4 50 00 1 FFI2 BigData sources A 2 1 2 50 00 1 FFI3 Apache Hadoop A 2 1 2 50 00 1 FFI4 Microsoft Access A 2 2 4 50 00 1 FFI5 Excel files A 3 3 9 75 00 1 From an excel file import all sheets FFI6 at the same time A 2 25 00 0 FFI7 Cross tabs A 4 8 100 00 1 FFI8 Plain text A 75 00 1 Connecting to different data sources FFI9 atthe same time A 3 2 6 75 0096 1 Easy integration of many data FFI10 sources A 3 2 6 75 0096 1 11 Visualizing data before the loading A 2 2 25 00 0 FFI12 Determining data format A 3 2 6 75 00 1 FFI13 Determining data type A 4 2 8 100 00 1 Allowing column filtering before the FFI14 loading A 3 2 6 75 00 1 Allowing row filtering before the FFI15 loading A 2 2 4 50 00 1 FFI16 Automatic measures creation A 0 3 0 0 00 O FFI17 Allow renaming datasets A B 2 6 75 0096 1 FFI18 Allow renaming fields A 3 3 9 75 00 1 FFI19 Data cleansing 2 7 4 50 00 1 FFD1 Data model is done automatically A 3 2 6 75 00 1 The done data model is the correct FFD2 one A 2 2 4 50 00 1 FFD3 Data model can be visualized A 4 3 12 100 00 1 FFF1 Alerting about circular references A 3 a 75 00 1 FFF2 Skiping with circular references A 3 3 75 00 1 A same table can be used several FFF3 times A 0 2 0 0 00 O Creating new measures based on FFA1 previous measures A 3 9 75 00 1 Creating new measures based on FFA2 dimensions A 5 3 B 75 00 1 FFA3 Variety o
60. ation for the region where the policy is registered Region table have other fields additionally to the Code as the name of the region On the other hand Sinisters table is also related to Region table by the field Code which refers to the code identification for the region where accidents occur Fig 5 Circular Reference Policy id Code of the Region Region Tab 8 Circular reference In that particular case some applications could show non correct values for Region because they could doubt about which way take in order to reach the Region table If it passes by Policy table then it shows regions where the policy is registered but if it passes by the Sinister table it shows regions where accidents ocure This metric evaluates the capability of a tool to realize about a circular reference and alert the user about it 29 2 Skipping circular references FFF2 This sub characteristic evaluates the capability of the software to omit circular references A same table can be used several times FFF3 It evaluates the capability of the application to use a table directly related to more than one table For example if there 1s a table with coordinates it can be related with more than one table for example with two tables where in the first one table there is a place of birth and in the second one there is a place of death Some tools allows to load just once the table and use it as many times as the user
61. ation of socio professional categories CSP was conceived by INSEE in 1954 The objective was to categorize individuals according to their professional situation taking account of several criteria their profession economic activity qualification hierarchical position and status 45 5 3 2 Auto table Auto table corresponds to a dimension table including vehicle information and it is composed by the following fields VehBrand VehPow and VehType Fields are described in Tab 10 Categoric fields Field name Description Values VehBrand Vehicle brand 1 gt 2 34 56 107 117 12 13 14 VehPow Vehicle Power CITI VehType Vehicle type familiar compact sport terrain Tab 10Auto table description Values for VehBrand and VehPow were extracted from freMTPL2freq dataset Data for VehBrand field were extracted selecting 26 000 random values of the vehicle brand field from freMTPL2freq But only the distinct values are stored in VehBrand of the current table In fact the 26 000 registers for the vehicle brand are stored in the field VehBrand from Policy table And by this way Auto table and Policy table will be related by the common field VehBrand We can say that Auto table is a parent table of Policy table because extends the information about the vehicle brand Moreover VehPow is the field with the corresponding powers for each one of the VehBrand values At the last point VehType values were assigned acco
62. ause MicroStrategy Anlytics does not offer the option to visualize the data model Analysis During the data analysis user can create new metrics called derived metrics based on attributes and metrics that have already been loaded to a dashboard FFA1 A derived metric performs a calculation on the fly with the data available on dashboard without re executing the dashboard against the data source Derived metrics are saved and displayed only in the specific 102 dashboard in which they are created MicroStrategy Analytics offers a plethora of functions to create new metrics e User can create easily a new metric based on an arithmetic calculation x from two metrics already in the dataset User can also create a new metric by aggregation functions as create a new metric to calculate a running total moving total assigning a numeric rank to each value in a metric displaying metric values as percentages of a cumulative total combining the values of two or more metrics e User can also create a metric based on a dimension FFA2 with the functions Average Count Maximum Minimum Standard Deviation Sum and Variance If the attribute is not numeric it seems to don t have sense But if user wants to use a dimension to create a new measure for example He wants to create a new metric with value 1 if gender is Male and 0 if the gender is Female he cannot do it directly because MicroStrategy formulas only support metr
63. ause of that the structure of that is more special than others It is composed by the following fields PolicyID Sinisters 2000 2001 2010 They are described in Tab 16 Qualifier field Field name Description ClientID Policy identification Attribute field 2001 Year 2002 Year 2010 Year Data field Field name Description Sinisters Amount of sinisters Tab 16 SinistersXYears table description In order to assign the amount of accidents per year to each policy we assigned a probability of 0 2 to have an accident in a year to every policy Depending on the characteristics of the client the probability could be increase e If client is younger than 24 the probability increases in 0 1 e If license is less than 12 months old the probability increases 0 2 e If client is between 50 and 65 years old and is a Male the probability increases in 0 2 e If client it is between 40 and 45 years old and is a woman the probability increases in 0 2 51 After one accident the probability of accident decreases on 0 1 And we supposed that a policy could not have more than 3 accidents in the same year The resulting SinistersXYear table Fig 15 SinistersXYear tableFig 15 has 26 000 registers and 12 columns The first column corresponds to the ClientID field and each one of the others corresponds to a year from 2000 to 2010 ClientiD 2000 2001 2002 2003 1 0 0 0 1 2 2 0 0 0 3 1 0 1 0 E 1 0 1 0
64. based on defined relationships These relationships enables users to retrieve and combine data from one or more tables with a single query In the relational model every row must have a unique identification or primary key on the data For example a social security account number SSAN can be a key that uniquely identifies each row The relations between tables are done by this primary key field which uniquely identifies rows In chapter 6 relational data model is explained in more detail Nowadays there are other ways of storing data besides the relational database model although it is still the most used The need for the data to be well structured actually has become a substantial burden with extremely large volumes with result the decline on performance as size gets bigger Thus relational DBMS is generally not thought of as a scalable solution to meet the needs of big data Other database infrastructure has appeared they are called NoSQL which represents a completely different framework of databases Unlike relational databases that are highly structured NoSQL databases are unstructured in nature trading off stringent consistency requirements for speed and agility Unstructured data may be stored across multiple processing nodes and often across multiple servers This distributed architecture allows NoSQL databases to be horizontally scalable as data continues to explode just add more hardware to keep up with no slowdown in per
65. bject And particularly the charts are Bar chart Line chart Radar chart Gauge chart Mekko chart Scatter chart Grid chart Pie chart Funnel chart and Block chart It is not the tool which offers more distinc charts but it has a good selection FFA 5 Charts offer a variety of customizing settings Dimension limits Sort Style Presentation Axes Colors Number format 87 Font Layout and Caption and user can modify them whenever he wants It is the tool of the evaluated which offers more flexibility in the design of charts FFA16 Although it is not the tool with a more variety of different graphs thanks to the graphs settings user can get similar graphs to graphs done with other tools For example OlikView does not offer a Heat map but it offers a Block Chart The difference between them is that in a Heat map two expressions can be displayed in addition to a dimension While with Block chart only one expression and one dimension can be displayed Heat maps can relates the color of the blocks on a expression and the size of the blocks to another expression By this way two expressions can be showed in a block chart and user can realize if there exist any relation between them or not Basically he can visualize if there exist any pattern related with both expressions The following image is an example of a heat map where blocks represents the values of the field Region sinister the size of blocks is based on the population of each re
66. cause not every software is appropriate for every area If the area of application is pre established the selected software will be according with it Secondly a new level of depth should be considered with more specifications about the tool functionality It should consider the features that make the tool useful for what we want to do And finally there is the identification of the required attributes based on the particular aims of the organization who will use the tool Some of these attributes must be mandatory and others must be non mandatory Mandatory attributes are those that must be met by the selected software while non mandatory are those that will be evaluated that are the metrics Therefore this aspect takes an important role in the selection and also in the evaluation 4 1 Algorithm In order to select the software the first step was to decide which tools could be evaluated with this model Nowadays there are many applications in the market related with Business Intelligence And because of that deciding which applications should be included in an evaluation is a laborious task In this stage we were inspired by the methodology for selecting software proposed by Le Blanc In the first place a long list of BI tools was elaborated Next step was to reduce this to a medium list containing popular tools which accomplish critical capabilities for business intelligence and analytics And finally a short list provided with particular
67. cs are explained in detail in the next sub chapter 4 9 There are different types of scale measurement depending on the metric e Type A of scale measurement The main part of metrics are measured by the following scale With this scale metrics are measured from 0 to 4 as follows 0 The application does not have the feature 1 The application matches the feature poorly or it does not matches strictly the feature but it can get similar results 2 The application has the feature and matches the expectations although it needs an extra corporative complement This mark should be also assigned when the feature implies a manual job e g typing code click a bottom and the metric is requiring an automatic job 3 The application has the feature and matches the expectations successfully without a complement 4 The application has the feature and moreover present advantages behind others Even so other metrics need to be measured in a special way e Sub type A 1 of scale measurement is assigned to binary metrics We assign O value if the application does not have the feature and 4 value if the application has it We chose these values in order to be consistent with the rest of the measurement scales e Sub type A 2 of scale measurement is assigned when the metric is measurable We assign 4 to the application with a better result and lower score to the others As there are 4 applications the scale is from 4 to 1 Although if s
68. d Personal Editions are single user editions usually free trials with the same operational characteristics except for the connection to a server And consequently they can have limitations in the projects sharing Moreover the connection of different users to a server usually implies the option of security devices assigning permissions and passwords to data or projects These options are not offered by Personal Editions In this project the evaluated applications are Server Editions although in order to evaluate their operation we used Personal editions which can be installed in local PC and they are offered in their respective corporative webs by free Particularly the used editions were QlikView View Personal Edition is the free trial for QlikView With QlikView Personal Edition user cannot open projects done by another Personal Edition s user and does not have security devices As we said QlikView Personal Edition cannot be connected to a server with other OlikView users On the other hand it can be connected to the same type of databases than QlikView MicroStrategy Analytics Desktop is the free edition for MicroStrategy Analytics Enterprise MicroStrategy Analytics Desktop does not have any problem using projects done by other free edition s users Moreover it can be connected to the same type of databases than the MicroStrategy Analytics But it cannot be connected to a server with other users and it has not security devices
69. d 12342 j lt 1 for i in 1 N if total i 0 for k in O total i 1 4 PolicyID s j k PolicyID i RiskArea_s j k lt RiskArea_N i if Code_O_sinisters j k 25 amp amp Code_O_sinisters j k 19 amp amp Code_O_sinisters j k 15 amp amp Code_O_sinisters j k 2 amp amp Code_O_sinisters j k 30 amp amp Code sinisters j k 24 amp amp month Sinisterdate j k in c 7 8 9 p probs 0 002 p 19 0 104 19 is the position for Granada 73 Code S j k sample Code 1 replace TRUE prob p if RiskArea_N i 1 Guarantees s jr k sample g1 1 if RiskArea N i 22 Guarantees s jr k sample g2 1 if RiskArea N i 23 Guarantees s j k sample g3 1 if RiskArea N i 24 Guarantees s j k sample g4 1 if RiskArea_N i 5 Guarantees_s j k lt sample g5 1 j lt j k 1 if total i 0 PolicyID s j PolicyID i RiskArea s j RiskArea N i Guarantees s j Code S j j lt j 1 Hlength Guarantees_s Hlength PolicyID s Hlength RiskArea s sinisterID rep 0 length sinisterdate2 min lt fi for j in 1 216856 for i in 1 216856 if sinisterdate2 i lt min min sinisterdate2 i sinisterlD i lt j DATAFRAME df Sinisters data frame PolicylD s RiskArea s Guarantees s Sinisterdate Code S View df Sinisters HEXPORTATION 74 library foreign write table df Sinisters C Users jorcajo Desktop Tables Sinisters txt sep t r
70. d protection 75 0096 SECURTIY FSS2 Permissions 75 0096 100 00 75 00 gt gt gt 3 2 0 2 3 2 0 2 3 3 3 3 3 3 MP Average learning time Consistency between icons in the toolbars and their actions 100 00 Displaying right click menus 0 0 00 Ease of understanding the terminology 100 00 EASE OF User guide quality 3 2 6 75 00 UNDERSTANDING User guide adquisition 3 2 6 75 00 ENS On line documentation 75 0096 100 00 Availability of tailor made training courses 2 0 0 00 0 2 6 75 00 m 2 6 75 00 1 a 0 0 00 0 2 6 75 00 1 Community 2 6 75 00 T 2 0 0 2 6 H 2 6 1 2 6 1 Phone technical support On line support Availability of consulting services Free formation gt gt gt gt gt gt gt GRAPHICAL UGW1 Editing elements by double clicking 0 00 INTERFACE UGW2 Dragging and dropping elements 75 00 A CHARAGIERISUIG UGD1 Editing the screen layout 75 0076 OPERABILITY UOV1 Automatic update 3 75 00 75 00 0 3 3 0 3 3 0 3 3 gt N gt EEC1 Compilation Speed CPU processor type 0 00 EXECUTION PERFORMANCE Minimum RAM 75 00 N DIN pje Hard disk space required 25 00 Additional software requirements 100 00 82 Tableau FIT TO PURPOSE
71. d them Determining data format FF112 It evaluates the capability to show data formats integer double date string of the fields before the data loading Some applications assign formats to fields automatically while some others lets the user to assign them before the loading Determining data formats before the loading is the best choice but in some applications it can be done after the loading and it is equal evaluated Determining da a type FFII3 It evaluates the capability to show data types dimension measure of the fields before the data loading Some applications assign Each metric has a corresponding code which is used to identify them In questionnaires from Annex 2 they are also identified by this code like in the detailed evaluations from Annex 3 27 types to fields automatically while some others lets the user to assign them before the loading Depending on the applications terminology data types can be attribute or dimension and measure Determining data types before the loading is the best choice but in some applications it can be done after the loading and it is equal evaluated 14 Allowing column filtering before the loading FFI14 It evaluates the capability to load only the columns that user wants 15 Allowing row filtering before the loading FFI5 It evaluates the capability to filter registers before loading them Sometimes user does not want to analyse the whole dataset and data filtering can be
72. e although it is loaded only one time It is useful when a table with coordinates is related to a dataset containing regions where people are born and it is also related to another dataset containing regions where people are dead Some tools needs to load coordinates data in two distinct tables one for each relation However Tableau can load only one time the table and then use it in many relations FFF3 Analysis During the analysis user can create new measures from dimensions and measures already loaded FFAI FFA2 There are many functions to create new measures although it is not the tool which offers more functions but they are enough for a typical descriptive analysis FFA3 FFA4 There are not many integrated data mining functions User can forecast quantitative time series data using exponential smoothing models integrated Tableau With exponential smoothing recent observations are given relatively more weight than older observations These models capture the evolving trend or seasonality of your data and extrapolate them into the future Forecasting is fully automatic yet configurable FFA5 Even so Tableau can be connected to R Project and from there user can create a data mining analysis for Tableau s data and visualize the results in Tableau FFAG6 Tableau automatically assign geographic roles to fields with common location names such as State Country and so on User can manually assign geographic roles to fields that don t u
73. e related if there exist two fields one in each one with same name case sensitive and matched values FFD1 Therefore during the loading process is important to pay attention to field names in order to get the right model Automatic modeling is an advantage because it saves time but sometimes user can visualize the model and realize that it is not the desired In that case user has the alternative of modifying the loading script in order to get the desired model FFD2 The visualization of the data model is key to understand the relations between fields and with QlikView user can visualize the data model every time FFD3 Field relations Unlikely because of the automatic relation by name there can appears circular references and OlikView doesn t support them A circular loop appears when there are two ways to get the same field by two different tables As a response of the circular reference QlikView alerts about that FFF1 and disconnects one of the tables in fact it disconnect the biggest one in order to display data FFF2 User can realize the disconecction visualizing the data model Circular references can be repaired duplicating tables but with QlikView it implies to load the table one time more in memory FFF3 Analysis Once time data are loaded user can create calculated fields to display in different objects as list boxes statistics boxes multi boxes table boxes and charts User can also create new fields in
74. e the tool and the same number of Explorer users should evaluate the tool After that a mean is done with the results In this project as an approximation an evaluation by one Explorer user is done but it is important to do not interpret the results as concluding results 12 3 Methodology Level 0 Dimensions ere Dee Pm Functionality Usability Maintainbility Customer Level 24 Characteristics cx A Cu A GAO Cz AS C FUNT D oo oo oo oo oo oo oo oy fo oy o o o o fo o oy fo oy fo oy oy fo o Vo 900 Vo o o o Fo Fo Vo Fo lo 9 o 9 Fig 1 Diagram of the systemic quality model SQMO Callaos amp Callaos 1996 3 1 The systemic quality model SQMO The Systemic Quality Model SQMO was developed in 2001 by the Universidad Sim n Bol var Venezuela Since then the University adopted the SQMO for software evaluations and provided successfully implementation examples Mendoza P rez amp Grim n 2005 Rincon Alvarez Perez amp Hernandez 2005 Until then it existed several models to evaluate the product software and other to evaluate the process software but any one with the capability to evaluate both aspects accurately As Humprey 1997 said having different models capable of measuring individually the product qua
75. each of the three evaluable characteristics there are listed the sub characteristics and their respective metrics Functionality Fit to purpose Interoperability Fig 3 Characteristic schema for each category according to Mendoza P rez Grim n 2005 Usability Efficiency Ease of understanding and learning Execution Performance Graphical Interface Operability 26 3 4 1 Functionality category 1 Fitto purpose characteristic This characteristic includes different metrics classified in 5 sub characteristics Data loading Data model Field relations Analysis Dashboards and Reporting i Data loading This sub characteristic includes various metrics to evaluate the loading process 1 Direct connection to data source FFI1 It measures the possibility of a direct 10 11 12 13 connection to data sources There are some applications which integrates connector drivers e g ODBC JDBC compatible with some databases and user does not need to install it in order to connect the application to the data source Big Data sources FFI2 It measures the capability to connect to any Big Data source different from Hadoop Apache Hadoop FFI3 It refers to the ability to connect to Hadoop infrastructure This technology is used to manage large volumes of structured or non structured data allowing fast access to data Hadoop simply becomes one more data source and it is the most common
76. easures if the tool offers a displaying menu by right clicking Terminology This sub characteristic evaluates 1f the terminology is consistent with the global business intelligence terminology 1 Ease of understanding the terminology UET1 This metric measures how easy is for the user to understand the terminology Help and documentation This sub characteristic is composed by metrics which measures the help offered by the tool to a user when he has doubts about the functionality or management of the tool User guide quality UEHI This metric evaluates if the user guide is understandable Highlighting that Self Service tools are also offered for non technical users 2 User guide acquisition UEH2 This metrics measures the process to get to the user manual For example if it is free if it is difficult to find in the web 3 On line help UEH3 It measures the offering of on line help Support and training This sub characteristic measures the quality and variety of the support offered by the tool Availability of tailor made training courses UESI It measures if the tool offers training courses adapted to organizations and it is positively measured if the course can be done in the organization 2 Phone technical support UES2 It measures if the tool offers a phone for technical support and the timetable of it 3 On line support UES3 It measures if the tool offers on line support and if it is in life or not 4 A
77. ects including data and charts to an SAP StreamWork activity to share with the community also publishing in SAP Lumira Cloud SAP Lumira Server to share with colleagues and publishing in SAP business Objects Intelligence platform 98 Security devices In SAP Lumira Desktop Standard Edition password is not requested to open a locally saved document Note In SAP Lumira Desktop Standard Edition only the creator of a document can open it again If the document is in SAP Lumira Cloud a user name and a password is required By the way it would be useful the possibility to require a password in the Desktop Standard Edition Moreover with SAP Lumira Server an administrator can assign passwords FSSI to projects and permissions to users 552 Ease of understanding and learning It is the second tool evaluated in this thesis which requires a shorter average learning time UEL1 Addtionally SAP Lumira is very intuitive and its icons and tabs are very consistency an let user easily learn the management It is the tool with the easiest interface UEB1 Right cliclking does not display any menu UEB2 Its terminology is agree with the BI terminology and it is not complicated UET There are many tutorials of different topics in web of SAP Lumira not only a user guide There are tutorials of more advanced actions All are understandable and actions are explained step by step There are also video tutorials in the web UESI UES2 UES3
78. ed Line chart with 2 Y Axes Waterfall Chart Box Plot Parallel Coordinates Chart e Geographical Geo Bubble Chart Geo Choropleth Chart Geo Pie Chart Geo Map But to create a Geo Map user must have an Esri ARcGis Online Account There exist Trellis to display a particular chart for each value of an additional dimension For example if user creates a bar chart that compares revenue by region and then he adds country to the trellis multiple charts will appear Each cart will display the revenue by region for one country In other tools is also possible but is not as comfortable to do User can always change the color of a legend but when the measure showed in the legend is continuous there is no the possibility to change the rank limits for each color Then although there is much variety between data user can not realize it In fact the data rank is always divided in 5 equal portions and for each portion a color is assigned User can change this color but not the rank limits An example is the following heat map in where it is not possible to realize about a pattern which in other tools can be visualized 96 SinisterlD and Population sinisters by Region sinisters SinisterlD 50 000 40 000 Madrid 30 000 20 000 10 000 0 It can be considered an important con when user is doing an analysis SAP Lumira is one of the tools with more distinct graphs to display data But they can not be very customized by user FFA16
79. ematic quality by the SQMO referenced in Mendoza L E P rez M A amp Grim n A C 2005 is the following explained First of all the Product Software is measured and then the Development Process 3 2 1 Product software The first measured category must be always Functionality If the category Functionality is satisfied the evaluation continues with other categories If the product does not meet the Functionality category the evaluation is ended It is because the functional category is the most 16 important in the quality measuring given that Functionality identifies the software capability to fit to purpose for it was built After that a sub model is adapted depending on the requirements Two categories from the five remaining must be selected which should be satisfied by the product and evaluated The algorithm recommends working with a maximum of three product characteristics including Functionality because if more than three product features are selected some might conflict In this sense Bass Clements amp Kazzman 1998 indicates that the satisfaction of quality attributes can have an effect sometimes positive and sometimes negative on meeting other quality attributes The definition of satisfaction can vary depending on the case of use and it is not fixed by the methodology In sub chapter 3 3 2 this issue is discussed Finally to measure the quality product of the software there is shown Tab 3 in w
80. eographic information A 3 2 6 75 00 1 FFA8 Time hierarchy A 3 3 9 75 0096 m FFA9 Creating sets of data A 3 2 6 75 0096 1 FFA10 Filtering data by expression A 3 3 9 75 00 1 FFA11 Filtering data by dimension A 3 3 g 75 00 al FFA12 Visual Perspective Linking 3 2 6 75 0096 m A FFA13 No Null data specifications 1 9 2 0 0 00 0 FFA14 Considering nulls 1 3 3 25 0096 0 FFA15 Variety of graphs A 3 3 9 75 0096 1 FFA16 Modify graphs A 3 B 9 75 0096 1 No limitations to display large A FFA17 amounts of data 1 3 2 6 75 0096 ut FFA18 Data refresh A 2 2 4 50 0096 1 FFD1 Dashboards Exportation A 3 3 9 75 0096 1 FFD2 Templates A 0 2 0 0 00 0 77 85 00 57 14 0 00 88 89 71 43 66 67 66 67 FFD3 Free design A 3 2 6 75 0096 1 FFR1 Reports Exportation A 1 g 3 25 00 0 FFR2 Templates 0 2 0 0 00 0 FFR3 Free design A 1 2 2 25 0096 0 0 00 A A FIP1 Operating Systems 1 0 00 FIP2 SaaS Web A 75 00 FIP3 Mobile A 2 2 4 50 00 m 60 00 INTEROPERABILITY FIU1 Using the project by third parts 75 00 75 00 100 00 A 3 2 6 1 FID1 Exportation in txt 3 2 6 75 00 1 FID2 Exportation in CSV 3 2 6 75 00 1 FID3 Exportation in HTML 0 2 0 0 00 0 FIDA Exportation in Excel file A 3 3 9 75 0096 1 77 78 551 Password protection A 3 3 9 75 00 1 SECURTIY FSS2 Permissions A 13 3 9 75 00 1 100 00 100 00 MP N w a Average le
81. er integer Date and time Date String and can also assign a Geographic Role FFI12 Data type is not determined in the data loading process FFI13 Once time data is loaded is when user can see the data type dimension or measure automatically assigned to fields and he can change it Moreover Tableau creates an automatic field for each loaded dataset in fact it is a measure which counts the number of registers of each dataset FFI16 Tableau has not the option to clean data before load them User can only rename the fields and the datasets FFI 9 Additionally Tableau let rename fields and datasets during the loading in particular during the creation of the data model FF 17 FF S Data model Data model can be done automatically relating fields with same name But user can also do that manually and decide by which fields are the relations done do not minding if they are not named equal Moreover user can also choose the type of join iner left and right FFD1 With this option 108 the final data model is tailored to user and then is easy to get the correct data model FFD2 Moreover user can visualize the data model at every time FFD3 Fields relations Although if user does the model by himself is more difficult to find circular references sometimes there can appear and then Tableau stops to display and remark it until user repairs the circular reference FFF 1 FFF2 A same table can be used more than one tim
82. ere are many graphs to display data and they offer many possible modifications to tailor it to user s requirements FFA15 FFAI6 109 Tableau does not present any problem to display huge amounts of data during the testing FA 17 In tableau by default the data refresh is done automatically although user can deselect this option FFA16 Dashboards In dashboards user has many options to design them He can change the format of lines rows columns text alignment shading borders lines FFD1 User can share dashboards publishing in Tableau Public a free service that lets the user publish interactive data to the web or Tableau Server Users of Personal Edition can only publish them in Tableau public while Professional Edition users can publish to Tableau Server In both cases user can embed passwords and authentification to visualize it User can also save the packaged workbook to use it for other persons using Tableau The packaged workbook contains the workbook with a copy of any local file data sources and background images From individual views user can export the image in the clipboard or save it or data in a Excel file or Access database It is a convenient way to share work with coworkers who do not have access to Tableau User can also print individual images or the whole worksheets to PDF choosing what elements of the dashboard user wants to include in the printed PDF User can save a workbook as the role of bookmark to
83. ering can not be done by menus and user must to type code to get this QlikView does not assign a data type to fields dimension or measure only when fields are displayed in charts they take the names of dimensions or expressions otherwise they are called just fields FFI12 On the other hand the data format can be agreed by user before loading the data On the top of the script there are the default settings for the data format and user can 85 modify them For example the following sentence can be written on the top of the script SET DateFormat DD MM Y Y YY It means that every data with the following format DD MM YYYY will be interpreted as a date by QlickView Therefore data formats can be changed by user typing the corresponding code during the loading FFI 3 Whether user loads data from files data are showed before the loading However when data are loaded from data bases they are not showed FFII1 Finally QlikView does not create automatically any measure from fields as other tools do FFI 6 With knowledge of SQL language user has more flexibility and gain more speed during the processes but not knowing SQL language is not an impediment to use Qlikview However there exist an option Syntax check that marks the code that is not right and it can be useful to learn and improve SQL language Data model QlikView creates automatically the data model taken as reference the names of the fields In fact two tables ar
84. erminology used a long the thesis 2 2 Introduction to Bl systems The objective of the following chapter is to introduce the Business Intelligence terminology in order to ease the interpretation of the thesis When a business needs to analyse its data in order to profit them and extract information and take advantage of this business intelligence takes the role Most of the companies generates data and these data are stored in databases A database is an organized collection of data where data are typically organized in a specific way to ease the queries In this area a query is a set of commands that the user types in a specific database language in order to get specific information about the information stored in the database For example a query can be How many male clients are in our database Queries are launched from a system responsible to access to the database These types of program are called Database Management Systems DBMS Apart from storing data they can also modify data We can say that it is the connector between the data storage and the user Some examples of DBMS are Oracle My SQL Microsoft Access There are a large number of database languages like SQL QUEL ISBL SPL XQuery The use of a specific database language depends on the target database The most common database language is SQL and it is used in relational databases Relational database is a type of database that organizes data into tables and links them
85. es because they were not in the existing datasets Some data were extracted from the CASdatasets package of R It is composed by several actuarial datasets originally for the Computational Actuarial Science book Particularly we extracted some data from freMPL6 and freMTPL2freq datasets Moreover in order to evaluate the analysis capabilities of the applications data were simulated forcing patterns In particular geographical and stationary patterns were imposed In 20141220 Initial test the amount of occurred car accidents in a region is proportional to the amount of population in it But in the months of July August and September in the region of Granada we force to have more accidents Additionally some people is forced to have more probability to have accidents than the main part of the population They are woman with ages between 40 and 45 man with ages between 50 and 65 young people under 24 and beginners The following chapter explains in more detail each one the tables composing the database 20141220 Initial test And the scripts of the R code is available in Annex 1 5 3 Tables Data from 20141220 Initial test come from a mix of two datasets freMPL6 and freMTPL2freq provided in CASdatasets package and additional data were invented by us We choose fields from freMPL6 and freMTPL2freq that were interesting for us and not all fields were added in our database In one hand freMPL comes from a private motor French i
86. et the color of an expression In particular colors are set depending on the fractiles 0 2 0 40 0 60 0 80 and 0 90 of the amount of sinisters in each region It corresponds to the Background color for the population expression which is the expression visualized in the chart For example If NumericCount If NumericCount rgb 255 153 152 If NumericCount rgb 255 102 102 If NumericCount rgb 255 51 51 If NumericCount rgb 255 0 0 If NumericCount And the result is Sinis Sinis Sinis Sinis Sinis Sinis terT terl terl terl terl terl D D D D D D lt 1574 gt 1574 gt 2082 gt 2630 gt 3532 gt 7134 8 rgb 255 204 204 8 and NumericCount SinisterID 2082 4 4 and NumericCount SinisterID 2630 and NumericCount SinisterID lt 3532 6 6 and NumericCount SinisterID 7134 4 4 rgb 204 0 0 89 Block Chart ax O Population_sinister Caceres Albac PS La Rioja Alava Fig 27 Block Chart with background color assigned to an expression built in OlikView In this chart the size is based on the population and the color is based on the amount of sinisters But as the amount of sinisters is not an expression of the chart there is not a legend to relate the color with the amount of sinisters Dynamic Updates allows adding new or modify existing data in the in memory data model of a
87. f functions A 3 3 9 75 00 1 FFA4 Descriptive statistics A 3 2 6 75 00 1 FFAS Preduction functions A 0 2 0 0 00 O FFA6 R connection A 3 2 6 75 0096 1 FFA7 Geographic information A 2 2 4 50 00 1 FFA8 Time hierarchy A 2 3 6 50 00 1 FFA9 Creating sets of data A 1 2 2 25 0096 0 FFA10 Filtering data by expression A 1 3 3 25 00 O FFA11 Filtering data by dimension A 1 3 3 25 00 O FFA12 Visual Perspective Linking 4 2 8 100 00 1 A FFA13 No Null data specifications 1 0 2 0 0 00 O FFA14 Considering nulls A 4 3 12 100 00 1 FFA15 Variety of graphs 3 3 9 75 00 1 FFA16 Modify graphs A 4 3 12 100 00 1 79 82 50 100 00 75 00 73 33 83 33 1 66 67 0 No limitations to display large amounts of data 100 00 50 00 75 00 0 00 100 00 75 00 0 00 Free design 25 00 FIL1 Languages displayed 4 2 8 100 00 1 100 00 Operating Systems 0 2 0 00 SaaS Web 1 25 00 Mobile 75 00 Data refresh Dashboards Exportation Templates Reports Exportation Templates n o o o o 5 o A 1 A A A Free design A A A A A 1 INTEROPERABILITY gt 75 00 75 00 75 00 Using the project by third parts 6 il 6 1 6 1 6 75 00 1 B 1 9 9 1 Exportation in txt Exportation in CSV Exportation in HTML Exportation in Excel file 75 00 100 00 551 Password protection 75 00 SECURTIY FSS2 Permissions 75 00
88. fficieny category but SAP Lumira does not Fig 19 shows the satisfaction score in each category 56 Tool QlikView SAP Lumira Tableau FUNCTIONALITY USUABILITY EFFICIENCY o 90 Fig 19 Category results SAP Lumira does not satisfy the Efficiency category In fact it does not satisfy the unique characteristic for Efficiency which is Execution Performance as it is shown in Fig 20 Characteristic Tool Bl ser Lumira EXECUTION PERFORMANCE Fig 20 Efficiency characteristics results for SAP Lumira This characteristic has a satisfaction score of 66 67 lower than the fixed limit 75 and because of that it is considered as not satisfied Only the 66 67 of the Execution Performance sub characteristics are satisfied In particular Fig 21 shows the satisfaction scores for the corresponding sub characteristics 57 Sub characteristics o Resource utilization Tool B sap Lumira Software requirements 096 1096 2096 3096 4096 5096 60 70 80 90 Fig 21 Execution Performance sub characteristics results for SAP Lumira Resources Utilization is not satisfied with a 33 33 of satisfaction score because it is the tool which requires more hard disk space EER3 and additionally SAP Lumira cannot be installed in processors of type x32 bits EER Finally according to table Tab 3 defined in sub chapter 3 2 1 the product quality levels of OlickView SAP Lumira and Tableau are Tool F
89. formance A Hadoop cluster is a special type of distributed architecture designed specifically for storing and analysing huge amounts of unstructured data in a distributed computing environment with fast processing Basically Hadoop is a distributed file system HDFS which lets storing large amount of data files on a cloud of machines On top of that distributed file system Hadoop provides an API Application Programming Interface for processing all stored data it is called Map Reduce The basic idea of Map Reduce is that each node processes the data stored on itself and by this way data are not transferred over the network and time is not wasted Moreover to access on data in Hadoop environment it exists Hive Hive is a data warehousing package infrastructure built on top of Hadoop It provides an SQL dialect called Hive Query Language HQL for querying data stored in a Hadoop cluster Hive adds extensions to provide better performance in the context of Hadoop and to integrate with custom extensions and even external programs After introducing the most common ways to store data it is time to introduce the tools to analyse business data Business intelligence tools are a type of application software designed to retrieve analyse transform and report data for business intelligence In order to get data they 10 can be connected to data sources or can import data from files The connection to a data source must be done using an API which
90. gion and the color of the blocks is based on the number of sinisters happened there Heat Map Valencia Murcia Asturias Tenerife Zaragoza Pontevedr Granada Cordoba a na Alicante Gerona Caste Cant Valla Ciud Huel Len 7 lon dolid ad R va eal With QlikView s block chart color cannot be represented by an expression Only the size can represent an expression and in this particular example the expression is the population Guipuzcoa Barcelona La Coruna Toledo Sevilla Almer a Islas Baleares AAA Badajoz Navarra Jaen Rectangle Size Rectangle Color Percent of Count Sinisterid 0 20 40 60 80 78476 lt Population sinister 6489680 Fig 25 Heat Map chart built in MicroStrategy 88 Block Chart Fig 26 Block chart built in OlikView ax o Population_sinister m e Caceres La Rioja Even so there exist an alternative to relate color with expressions Each built expression in a chart has a backgroud color and user can set it to fix a color pattern for the values of the expression by default this expression is empty It requires type code and in my opinion it is difficult because there are not many information on the userguide about that However the option to set the background color is very useful and interesting although implementing that can be not trivial The following function is an exemple of how user can s
91. h risk area Recall that it has a composed primary key because any of the fields is capable to identify uniquely the registers but both together form a composed primary key Finally the resulting RiskXGuarantees table showed in Tab 14 has 29 registers risk area 1 offers 8 guarantees risk area 2 offers 7 guarantees risk area 3 offers 6 guarantees risk area 4offers 5 guarantees and risk area 5 offers 3 guarantees and 2 columns RiskArea guarantees Guarantees 1 windows 1 travelling 1 driver insurance 1 claims 1 fire 1 theft 1 total loss 1 health assistance 2 windows 2 travelling 2 driver insurance 2 claims 2 fire 2 theft 2 health assistance 3 windows 3 driver insurance 3 claims 3 fire 3 theft 3 health assistance Fig 13 RiskAreaXGuarantees table 49 5 8 7 Policy table Policy table corresponds to a dimension table which includes information about the policies of the 26 000 clients and it is composed by the following fields PolicyID ClientID RecordBeg RecordEnd VehBeg VehBrand BonusMalus RiskArea and Code Fields are described in Tab 15 Numeric fields Field name Description Min Max PolicyID Policy identification 1 26 000 ClientID Client identification 1 26 000 Date fields Field name Description Min Max RecordBeg Policy starts date 2000 01 01 2010 12 31 RecordEnd Policy ends date 2000 01 01 2010 12 31 VehBeg Date when vehicle 1911 01 26 2011 01 01 was build Categorical fields VehicleB
92. her visualization the target In the settings menu of the source visualization user can choose the option to use as filter and can choose in which target visualizations the filter should be applied Then MicroStrategy allows the interaction between graphs but it is not automatic FFA 2 There are some assumptions about NULL data in files from where data is loaded For Excel csv and text files user should leave cells of data empty to represent NULL values rather than using the text NULL FFA13 On the other hand on measures null data are not considered In fact they are omitted for new calculated measures but there exist functions to create a new calculate measure related with null data of a measure For example Null function determines how many nulls are displayed In contrast null data for dimensions are considered as a one more value but when this attribute is also in other dataset Analytical Engine only use the non null form value from the other dataset instead of the null form FFA 4 MicroStrategy Analytics offers many types of visualizations FFA 5 There are graph visualization where user can choose between lines bar area scatter bubble grid and pie graph Moreover there is also the option to add double axis There are also grid visualizations network visualization heat map visualizations map visualization density map visualizations and map with areas visualization Particularly for a map with areas visualization user
93. hich there are the quality levels related with the satisfied categories Functionality Second category Third category Quality level Satisfied No satisfied No satisfied Basic Satisfied Satisfied No satisfied Medium Satisfied No satisfied Satisfied Medium Satisfied Satisfied Satisfied Advanced Tab 3 Quality levels for the Product Software Once the evaluation of the product software has ended recalling that only if the quality level is at least basic the Development Process evaluation may start 3 2 2 Development Process In order to evaluate the Development Process there are 4 steps to follow The algorithm used in the Development Process evaluation is fixed unlike the Product Software evaluation The steps are the followings 1 Determining the percentage of N A Not applying answers in the questionnaire for each category If this percentage is greater than 11 appliance of the measuring instrument must be analysed and the algorithm stops Otherwise the step two is the next 2 Determining the percentage of N K Not knowing answers in the questionnaire for each category If this percentage is greater than 15 it shows that there exist a high level of ignorance for the activities of the particular category If the percentage is lower step three is the next 3 Determining the satisfaction level for each category The definition of satisfaction can vary depending on the case of use and it is not fixed by the methodology In sub
94. iarized with data models and prefer a big table with all the columns but in business intelligence area data models application is well extended Additionally a same table can be joined more than one time with different relations FFD3 Field relations Because of the joins are done manually and user can determine which is the foreign key circular references does not exist FFF1 FFF2 FFFI Analysis User can create new measures to enrich the dataset at any time either directly from a measure or dimension or using the formula language to create a calculated measure FFA1 FFA2 When using Connection to SAP HANA data source it is not possible to create a measure from a numeric or string dimension Measures need to be created in the SAP HANA View before being acquired automatically in the application By default all numeric dimensions are also created automatically as aggregated measures The Aggregated measures are the typical ones SUM MIN MAX Count Distinct Count All Average If user wants to transform a dimension to a mesure without aggregation then he must select NONE in the menu of Aggregations 95 Moreover user can create calculated measures and calculated dimensions by the formula editor script There two fields can be combined to create a new one User can apply functions from a predefined set of numeric date and text functions Using also If Then Else clauses called logical functions and a calendar picke
95. ics as arguments Then the solution is to create firstly a metric based on Gender by Average Maximum Minimum Standard Deviation Sum and Variance These functions have only transform from attribute to metric but the resulting values are the same Male and Female Once time the new metric is created with exactly same values than the attribute user can use it in logical formulas to create other new metric e User can also create a new derived metric based on a MicroStrategy function The categories of functions offered are Basi functions Data Mining functions Date and Time Financial Internal Math Null Zero OLAP Rank and NTile Statistical and String e User can create a derived metric from scratch and use conditional calculations provide conditional analysis by combining data into different groups based on the value of one or more metrics in a dashboard Once time a derived metric is created 1t cannot be edited If user wants to edit it he must to remove it and created another one derived metric Summarizing there are many functions FFA3 in order to let the user to do a well descriptive analysis FFA4 Moreover user can perform statistical analysis in Analytics using R FFA Analytics supports the deployment of R analytics from the R statistical environment as derived metrics and once an R analytic is deployed to Analytics as a derived metric the statistical analysis can be added to and analyzed on visualizations The
96. igned but imposing that having a accident in the same region where the policy is registered 52 is most probable probability of 0 7 than in another region probability of 0 3 51 Moreover we also imposed that in the particular region Granada the probability of accident increases in summer for people who are not from the region or environs Jaen Cordoba Albacete Malaga Almeria and decrease for people who are from the region or environs The Guarantees field has random guarantees as components depending on the risk area contracted by the policy At the last RiskArea is a field with the same components as RiskArea from Policy table but repeating each value as many times as accidents have the particular policy Finally the resulting Sinisters table has 196 235 registers and 5 columns PolicyID RiskArea Guarantees Sinisterdate Code 1 4 theft 30 09 2003 29 1 4 windows 16 06 2006 9 1 4 windows 09 03 2008 50 1 4 driverinsurance 20 09 2008 29 1 4 driver insurance 05 09 2009 29 1 4 health assistance 03 08 2010 29 2 4 windows 22 09 2000 29 2 4 health assistance 30 12 2000 21 2 4 windows 27 01 2008 21 Fig 16 Sinisters table The 9 datasets were exported in an Excel files and from excel file they were loaded to the corresponding applications 6 Evaluation Results Once the metrics are chosen weights are assigned to each metric applications are selected and data are available it i
97. information about connecting to databases provided here is extracted from external sources user guides or corporative webs and our experience do not prove it 6 1 Results Once time every metric has been evaluated it is the time to get the results of the assessment In the case than more than one user is being implied in the evaluation of the metrics we recommend to calculate a mean score for each metric On the other hand one of the basis of the methodology of Mendoza P rez amp Grim n 2005 is that if Functionality category is not satisfied the evaluation is aborted and other categories are not evaluated Because of that the analysis starts with the satisfaction score of Functionality category In the current evaluation using the satisfaction limits mentioned in Tab 7 from sub chapter 3 3 1 the obtained satisfaction scores for Functionality are showed in Fig 17 Tool MicroStrategy Qlikview SAP Lumira Tableau FUNCTIONALITY Fig 17 Results for Functionality category In the adaption of the methodology in sub chapter 3 3 1 we sentence that a category is satisfied if the 75 of their characteristics are satisfied And applying that MicroStrategy Analytics did not satisfy the Functionality category because it only satisfies the 66 67 of the functional characteristics Then the evaluation of MicroStrategy is aborted On the other hand the other three 3 tools satisfy the Functionallity category because they sat
98. ing that clearly improves business They are also known as power users Thanks to big data explorers become data scientist A data scientist has to be able to extract information from large volumes of data according to a not randomly clear business objective and then present it in a simple way to the non expert users in organization Therefore it consists in a cross profile with skills in computer mathematics statistics data mining graphic design data visualization and usability Tourist Usually they are a group of two or more people On one side there is a person with an overview of the company that comes up with the possibility of a study on a certain topic On the other hand there would be a computer expert knowing the systems analysis of the company and he is the manager to find out if the study is feasible with the data and available tools This team will access to data without following any pattern and rarely observe same data twice Therefore their requirements cannot be known priori Tools used by tourist are browsers or search engines to search both data and metadata and the result of their work will be projects carried out by Farmers or Explorers In short a tourist is a casual user of the information This project aims to develop a method of evaluation that should be applicable taking into consideration the different profiles of tool users For example if the tool is used by Farmers and Explorers some Farmer users should evaluat
99. ion adding a dataset merging joining datasets This data cleansed is easy to do because it is done by menus 9 Data model The data model is not done automatically like other applications FFD1 in fact there not exists a data model Lumira offers an alternative option called Combine and it let user to merge datasets combining data from two datasets using JOIN The columns for the second dataset are matched based on compatibility with a key column The matched columns are proposed with the probability of the match Combine option let user to use a particular dataset in several merges with different merging fields and loading it in memory just one time By this way a dataset can be connected to two different other datasets avoiding circular references because user impose what are the pair of dataset merged and by which field primary key It is possible to build inner joins or left outer joins user can decide it To build a merge the foreign key must be a primary key of the right table Moreover some datasets don t have the foreign key a a primary key and then it is not possible to merge it therefore it is not possible to relate this table with other ones FFD2 User can always visualize the loaded datasets and therefore the merged datasets to know the relations between fields If the data model is complex append data by this way can be a laborious task On the other hand It can be a good alternative for users who are not famil
100. iption All data for the corresponding fields were extracted from the freMPL6 dataset with the exception of ClientID which was generated We created a discrete vector called ClientID with values from 1 to N 26 000 They were the client s identifications We chose N 26 000 because the dataset freMPL6 had 26 000 registers of clients becoming ClientID the primary key of the table because of its uniqueness The dataset also provides two fields related to the driving license age and the driver age But driving license age was measured in months and driver age in years Then we did the necessary calculations to get the values as dates The results are the date when the license was gotten and the birthday of driver which are called LicBeg and DrivBeg respectively Finally the resulting Client table which is showed in Fig 8 has 26 000 registers and 6 columns lientlD Gender MariStat CSP DrivBeg LicBeg 1 Male Other CSPSO 12 01 1965 25 09 1983 2 Male Other CSP50 12 01 1965 26 08 1983 3 Male Other CSP50 09 01 1979 16 10 1996 4 Female Other CSP55 14 01 1959 07 02 1981 5 Male Other CSP60 15 01 1954 05 01 1976 6 Male Other CSP60 15 01 1954 07 10 1975 7 Male Other CSP48 15 01 1953 19 02 1973 8 Male Other CSP48 15 01 1953 20 01 1973 9 Female Other 50 14 01 1958 26 09 1977 10 Male Other CSP55 15 01 1954 19 04 1979 11 Male Other CSP55 15 01 1954 20 11 1978 a sa mio renra A les lana sa lm lam Fig 8 Client table 2 The classific
101. isfy the 100 of the corresponding characteristics In order to know the reason why MicroStrategy does not satisfy the Functionality category a deeper level helped us to know what are the scores for each functional characteristic Functional characteristics are Fit to Purpose Interoperability and Security and Fig 18 shows their respective 55 satisfaction score We could see that the characteristic Fit to purpose is not satisfied because only the 66 6796 of its sub characteristics are satisfied Particularly the sub characteristics non satisfied are Field Relations and Reporting Grade Characteristic Tool Bl Microstrateay FIT TO PURPOSE INTEROPERABILITY SECURTIY 10096 Fig 18 Functinality characteristics results for MicroStrategy Analytics MicroStrategy Analytics does not satisfy the sub characteristic Fields relations because it is not capable to alert about the presence of circular references FFF1 and in fact it does not skip them FFF2 Moreover it cannot directly relate a table to more than one table FFF3 On the other hand Reporting sub characteristic are not satisfied because MicroStrategy Analytics does not have an option to build reports FFRI FFR2 FFR3 Then MicroStrategy evaluation is aborted and the evaluation followes for the other three 3 tools The other three tools satisfy additionally to the Functionality the Usability category Moreover QlikView and Tableau satisfy also the E
102. itates the use of the tool by the user UEBI Right click menus are not displayed Elements in the page have a tab where user can click on and a menu is displayed UEB2 Terminology Terminology is not difficult and it is in the standard but it is not the tool with easiest terminology UETI Help and documentation The user manual is extremely detailed and it has many concepts definitions These definitions is what differentiates the MicroStrategy Analytics user manual from others UEH The user guide is available in the net UEH2 next to other many papers with specific information UEH3 Support and training There are courses offered on line or in particular education centers and many of them can be customized with user data and delivered onsite UES About the support users can hire different types of support depending on their requirements All of them have phone and online support 24x x365 UES2 and also online supports UES3 Moreover MicroStrategy also offers consulting services about technology implementations technology management data management presentation and delivery and advisor services in particular solutions as education solutions healthcare solutions financial services UES4 In the net user can sign in the Learning Portal where he can access to many free courses and certified test about basic and advanced features UES5 There is also a community where user can discover technical documents
103. l structure where two tables are related by a common field foreign key which appears in both tables and which is showed in the figure next to the type of relationship The Fact table is called Sinisters and the Dimension tables are Client Policy Auto Region SinistersXYear RiskArea Guarantees and GuaranteesXRiskArea 42 We decided to propose a relational database in order to realize how the evaluated Self Service BI tools managed the relations between tables Some of these tools built automatically the data model the relations between tables that is that the user loads tables and the tool by itself relates tables Hence we wanted to know if this automatic modelling worked well or not Moreover we wanted to evaluate if applications were capable to understand both types of relationships In fact the most common relationship is 1 n and we were almost certain that applications support them But we doubted about the supporting of n m relationships In fact there were one of the evaluated tools SAP Lumira which could not relate two tables by a n m relationship Additionally our model has a particularity There are two circular references in Region and Guarantees fields A circular reference exists when there are at least 3 tables related between them For example Region table has information about the regions and in fact it has the name of the all regions and its population Policy table is connected to Region table by the field
104. lerts that its behaviour is particular Heat Map Castellon Cantabria Valladolid Ciudad Real Leon Rectangle Size Rectangle Color Percent of Count Sinisterid 20 40 60 80 78476 Population sinister lt 6489680 Fig 28 Example of a Heat Map build by MicroStrategy Analyitics Month of Year of Sinisterdate population Ler Ca Al sa Al 10096 The current network shows relationships between regions in November For each sinister the sinister region is joined with the corresponding policy s region Color and size of lines are based on the amount of sinisters Regions which differs from others are Valencia Madrid Barcelona and Sevilla It is usual because they have more population that other regions Network Zamora Ayila Teruel ue Pontevedra Castellon Almeria Valladolid Edge size Count Sinisterid Edge color Percent of Count Sinisterid Fig 29 Example of a Network chart built by MicroStrategy Analyitics 113 Month of Year of The current network shows relationships between regions in just For each sinister the sinister region is joined with the corresponding policy s region Color and size of lines are based on the amount of sinisters In August Granada has a different behaviour than in November It alerts that its behaviour is particular Network Cadiz fyuelva Alicante Valencia Palencia Ponte
105. leve Sicilia tit tte doa 18 6 Weights of Metrics bent e erts dese a td 24 Tab 7 Satisfaction limits er entere nette rere etae erp ean ii 25 Tabs 8 Circular reference et teet ede raster pl ath eee etes 29 Tab 9 Client table description 5 retener 45 Tab 10Auto table description er rr pe rise a e ehe ee FR E PR EXER MEE IR eae 46 Tab 11 1 1 enne enne enne nnne nene 47 Tab 12 RiskArea table description eese nennen rene n cnica 47 Tab 13 Guarantees table description esee eee emen 48 Tab 14 RiskXGuarantees table description ene 49 Tab 15Policy table description sese eere ren ren rennen nnne 50 Tab 16 SinistersX Years table description oooonoconoconocononanonanonnnannnannnnnncnnnn ran nono nono 51 Tab 17 Sinisters table description 52 Tab 18 Satisfaction Limits for a second evaluatioN cccccccncnnnononnnnnnnonononanonanonanonononananinonons 58 65 Annex 1 Scripts for 20141220 Initial test database During the simulation process some variables were called different than in the database It Because of the relationships l n or n m foreign key fields were build two times with different lengths one time for each table from where they pertained And to keep the consistency in the R code they had
106. lity or the process quality of software does not guarantee the total systemic quality of the software As seen in Error No se encuentra el origen de la referencia Error No se encuentra el origen de la referencia SOMO consists of two sub models a Product and a Process sub models The SQMO can use either the Product sub model the Process sub model or both The first sub model is designed to evaluate the already developed software while the second is designed to evaluate the development process of the software The SQMO sub models have different levels in order to assess software which they are 3 1 1 Level 0 dimensions There are two dimensions for each sub model These dimensions are Efficiency and Effectiveness for the Product and Efficiency and Effectiveness for the Process Effectiveness is 13 the capability of producing a desired result while Efficiency is the capability to produce a specific outcome effectively with a minimum amount or quantity of waste expense or unnecessary effort 3 1 2 Level 1 categories There are six elements corresponding to Product and five corresponding to Process They are concretized by Callaos N C 1993 The categories for the Product sub model are the followings e Functionality is the ability of a software product to provide functions that meet specific and implicit needs when software is used under specific conditions e Reliability is the capacity of a software product to maintain a s
107. ly linked to attributes that already exist in the dashboard MicroStrategy attempts to link attributes that share the same name User can also manually link attributes that are shared across multiple existing datasets FFD1 in order to get the desired data model FFD2 An attribute that is linked across multiple datasets is displayed with a link icon and is displayed as one attribute when added to a visualization MicroStrategy Anlytics does not offer the option to visualize the data model and because of that this icons are very important FFD3 User can unlink attributes that are already linked Unlinked attributes with the same name are treated as two separate attributes when displayed in a visualization Manually linking attributes allows user to link attributes across multiple existing datasets The attributes that user link must be the same data type User can link an attribute to attributes in one or more datasets Circular references Although the link is done manually there can be circular references and MicroStrategy does not advice about them FFF1 In fact it works but results are not the correct ones FFF2 The unique way to solve circular references are to load one time more the same table and link manually or renaming the fields It is because Analytics does not let to use the same table with different connections but by the same field FFF3 The unique way to realize that there are circular references is knowing well the data bec
108. n performance of the tool ii Compilation speed This sub characteristic measures the compilation speed how fast the software build a particular chart 1 Compilation speed EEOI It measures the compilation speed It is a very subjective measure because it depends on the machine where it is installed iii Resource utilization This sub characteristic evaluates the extra hardware and software requirements 1 CPU processor type EERI This metric evaluates if the tool can be installed as much to x86 processors suc has to x64 processors 2 Minimum RAM EER2 It measures the RAM needed in the way that a maximum punctuation means it requires low memory while the minimum punctuation means it needs many memory 3 Hard disk space required EER3 It measures hard disk space needed in the way that a maximum punctuation means it requires low space while the minimum punctuation means it needs many memory iv Software requirements This sub characteristic is composed by a unique metric which measures if adding software is required to execute the tool 1 Additional software requirements EESI this metric evaluates if adding software is required to execute the tool 34 4 Software selection for the evaluation Prior to an evaluation there must be a selection of software hence some aspects should be considered Firstly the area of application and use of the software should be pre established The selection of software depends on this aspect be
109. n tables must be unique No two different tables can have the same name in a data model Attributes columns cannot have the same name in a table You can have two different tables that have similar attribute names Among relational data model there are different types of models depending on its structure Particularly we used a snow flake schema which is a type of relational data model composed by two types of tables Fact tables and Dimension tables 41 e A Fact table contains information of particular events By this way they typically have three types of columns those that identifies the tuple primary key those that contains the facts and those that are a foreign key to dimension tables e A Dimension table contains extensive information about the values that are used in the fact tables By this way it contains information about all the possible values of a particular field used in the fact table For example a Fact table contains information about occurred accidents And it has fields ClientID Date Region and Guarantee Then a dimension table can be one with information about the client With fields ClientID Gender and Birthday And another Dimension table can be one with information of the region with fields Region and Population A snow flake schema contains a Fact table and multiple dimension tables which can be connected to other Dimension tables The last ones are known as parents A parent in a relational data model
110. n there is no problem to share projects with others Moreover with the Desktop Edition have not problems about that FIU1 Data exportation From Professional Edition data can be published in Tableau server While with the Personal Edition user can publish on Tableau Public User can select any portion of a data view to export into Microsoft Excel or Access database FID1 FID2 FID3 FID4 Security In Tableau public and Server there can be permissions specified to regulate Access to the workbook FSS1 FFS2 Learning time It is not the most easy tool The automatic aggregation of all measures by dimensions can confuses the user UELI Browsing facilities There are many icons and the same action can be done by more than one way It can be confused UEB1 Right clicking displays a menu with Tableau actions UEB2 Terminology The terminology keeps on standard UET Help and documentation The user manual is available in the web there is many information and it can be understand by non expert users UEH I UEH2 UEH3 Support training There is the option of accessing to tailor made training courses UES1 Additionally there are many online course video tutorials in the web and integrated in Tableau application It is very good in free formation UES5 There is the option to get support by call but if user has an special support account UES2 UES3 There is many information to support user in the web and user c
111. nal 12 Centrifuge Systems 13 Chartio 14 ClearStory Data 15 DataHero 16 Datameer 17 DataRPM 18 Datawatch 19 Decisyon 20 Dimensional Insight 21 Domo 22 Dundas Data Visualization 23 Eligotech 24 eQ Technologic 25 FICO 26 GoodData 27 IBM Cognos 28 iDashboards 29 Incorta 30 InetSoft 31 Infor 32 Information Builder 33 Jedox 34 Kofax Altosoft 35 L 3 36 LavaStorm Analytics 37 Logi Analytics 38 Microsoft BI 39 MicroStrategy 40 Open Text Actuate 41 Oracle 42 Palantir Technologies 43 Panorama 44 Pentaho 45 Platfora 46 Prognoz 47 Pyramid Analytics 48 Qlik 49 Salesforce 50 Salient Management Company 51 SAP 52 SAS SAS Business Analytics 53 Sisense 54 Splunk 55 Strategy Comapnio 56 SynerScope 57 Tableau 58 Targit 59 ThoughtSpot 60 Tibco Software 61 Yellowfin 62 Zoomdata 63 Zucche To build the medium list we followed also the steps of Gartner in the Magic Quadrant report where they choose the platforms to be evaluated if they satisfied 13 technique features and 3 non technique The 13 technique features were by Gartner the critical capabilities that every Business Intelligence and Analytics platform must satisfy And they were classified in three categories Enable Produce and Consume Enable e Functionality and Modelling Combination of different sources and the creation of analytic models such as user defined measures
112. nalysis and Synthesis ISAS 96 15 23 Dromey R 1996 Cornering to Chimera In EE Software Vol 13 pp 33 43 Humphrey W 1997 Introduction to the Personal Software Process Massachusetts Addison Wesley Longman Inc Inmon W Imhoff C amp Susa R 1998 Corporate Information Factory EE UU John Wiley amp Sons Inc Kitchenman B 1996 Evaluating software engineering methods and tools Part 5 Principles of Feature Analysis University of Keele Department of Computer Science England Mendoza L E P rez M A amp Grim n A C 2005 Prototipo de Model Sist mico de Calidad MOSCA del Software Computaci n y Sistemas 8 3 196 217 Ortega M P rez M amp Rojas T 2000 A Model for Software Product Quality with a Systemic Focus 4th World Multiconference on Systemics Cybernetics and Informatics SCI 2000 and The 6th International Conference on Information Systems Analysis and Synthesis ISAS 2000 pp 395 401 Orlando Florida P rez M Rojas T Mendoza L amp Grim n A 2001 Systemic Quality for System Development Process Case Study Seventh Americas Conference on Information Systems AMCIS 2001 pp 1297 1304 Boston Massachusetts Rincon G Alvarez M Perez M amp Hernandez S 2005 A discrete event simulation and continuous software evaluation on a systemic quality model An oil industry case Elsevier Information amp Management 42 1051 1066 62 S
113. nd of project or process within a primary life cycle e Organizational contains processes that establish the organization s commercial goals and develop process product and resource good values that will help the organization attain the goals set in the project 3 1 3 Level 2 characteristics SQMO states that each category has associated a set of characteristics which define the key points that must be fulfilled in order to guarantee and control de Product and or Process quality of software Product characteristics are specified in Tab 1 and Process characteristics in Error No se encuentra el origen de la referencia They are defined more accurately in Mendoza P rez amp Grim n 2005 14 Characteristics Category Product Effectiveness Product Efficiency Fit to purpose Correctness Functionality Precision E Structured Interoperability Encapsulated Security Specified Maturity Correctness Reliability Fault tolerance Structured Recovery Encapsulated Ease of understanding Complete Ease of learning Consistent Usability Graphical Interface Effective Operability Specified Conformity of standards Documented Auto descriptive Execution performance Effective Efficiency Resource utilization is redundant Direct Used Analysis Capability Attachment Ease of changing Cohesion Stability Encapsulated Testability Software maturity Maintainability Structure information Descriptive Correctness Structural Modularit
114. nstructor led offers support UES Moreover phone technical support is avaible for Server users from Monday to Friday and between 8 00a m and 5 00pm UES2 Online support is also offered by server users asking the nearest sales office UES3 Additionally OlikView also offers consulting services classified in three categories Implementation Services Application Services and Business Services UES4 Moreover QlikView proposes some online free courses and also free tutorials and papers of particular features for every type of users UES5 Finally there is also a Community where every users can ask for questions or look for questions already answered y experts UES6 Windows and mouse interface User cannot edit elements by double clicking UGW but dragging and dropping method can be used in QlikView For example user can drag fields to the dimension layout and then drop them when he is building a chart UGW2 Display The screen layout can be editted dropping or adding icons in toolbars UGD 92 Versatility Version updated must be done manually from the web of QlikView login with a client user and a password UOV1 Compilation speed By our experience with QlikView Desktop it has a fast compilation speed QlikView differs from other tools in its compilation speed additionallt to the Visual Perspective Linking EECI Resources utilization For Qlikview Personal Editon it is recommended to use Intel Core duo or
115. nsurer It includes client s information of 26 000 policies in the year 2004 On the other hand freMTPL2freq includes information about 413 169 policies observed on an unknown year like the vehicle power the vehicle age the value bonus malus the vehicle brand the region where the policy is registered and its population density We wanted to have data for 10 years and more fields different from the ones offered from freMPL6 and freMTPL2freq For this reason we invented some fields and also extrapolated data to 10 years In the following sub chapters detailed information about the creation of the tables is introduced Reader can access to the R scripts which are in Annex 1 Although it could seem that tables are just tables the creation and transformation of original data has become a laborious task Hence we included the transformations carried out yet the reader might skip them 44 5 3 1 Client table Client table corresponded to a Dimension table including clients information It was composed by the following fields ClientID Gender MariStat CSP DrivBeg and LicBeg Fields are described in Tab 9 Numeric fields Field name Description Min Max ClientID Client identification 1 26 000 Date fields Field name Description Min Max LicBeg Date of license 1941 09 05 2009 01 11 DrivBeg Birthday 1916 01 25 1991 01 06 Categorical fields CSP Social category CSP2 CSP7 33 different values Tab 9 Client table descr
116. o the vehicle brand of the selected vehicles And RiskArea is the field mentioned in the RiskArea table paragraph The field BonusMalus corresponds to a risk indicator of the policy Its data were extracted from the BonusMalus field of freMPL6 dataset for our 26 000 clients Finally the field Code refers to the code of the region where the policy is registered It is created by assigning randomly numbers from 1 to 52 each number refers to a region depending on the population of each region 50 Finally the resulting Policy table Fig 14 has 26 000 registers and 8 columns 50 21 29 29 40 29 40 PolicylD ClientiD RecordBeg RecordEnd VehBeg VehBrand BonusMalus RiskArea Code 1 1 12 11 2006 01 01 9999 04 01 1998 1 50 4 2 2 05 02 2008 01 01 9999 05 01 1996 1 50 4 3 3 27 02 2000 01 01 9999 08 01 1984 2 68 5 4 4 21 09 2000 22 04 2010 03 01 2004 13 50 3 5 5 28 01 2004 01 01 9999 03 01 2004 2 50 5 6 6 29 03 2002 01 01 9999 01 01 2011 2 50 5 7 7 25 04 2000 01 01 9999 03 01 2003 13 50 3 8 8 15 05 2002 20 09 2010 04 01 1998 3 50 3 9 9 01 12 2003 13 05 2009 03 01 2001 10 50 5 10 10 17 11 2010 01 01 9999 01 01 2010 12 50 4 11 11 27 08 2001 01 01 9999 02 01 2005 2 50 4 Fig 14 Policy table 5 3 8 SinistersXYears SinistersXYears table corresponds to a fact table which shows how many accidents are registered in each policy along the years from 2000 to 2010 This is a cross tab Error No se encuentra el origen de la referencia and bec
117. oading Data can be imported from a file xls xlsx txt csv FFI8 FFI5 of 200MB as maximum from a database or using a database query For xls and xlsx files multiple worksheets can be included in the file but only one worksheet can be uploaded at a time FFI6 And crosstabs can be also uploaded but they must be in a particular format FFI7 Moreover data can be also loaded by a connection to databases In order to load data from a database the connection is done by ODBC Some data sources can be directly connected to Analyitcs because it already has several particular ODBC integrated FFI1 Some examples are Amazon EMR Cloud Apache Hadoop FFI5 Cloudera CDH Cloudera Impala Greenplum Hortonworks HDP IBM DB2 IBM Netezza Microsoft Acces Microsoft SQL Database Microsoft SQL Server Oracle PostfreSQL Web services data sources FFI4 Other data sources can be connected but an ODBC previously installed is needed to establish the connection between SAP Lumira and the database Some examples are Mongo My sql Community and Enterprise SAP CDBMS SAP HANA Teradata Aster Google BigQuery HP Vertica An intuitive visual interface makes easy to import data by dragging and dropping tables selecting columns FF1 4 rename fields and datasets FFII7 FFI18 and specifying filter conditions 5 Moreover data is visualized before being loaded FFI By default when user selects data to import MicroStrategy aut
118. of applications to connect to R in order to get advanced analytical functions Geographic Information FFA7 This sub characteristic measures the capability of display data in maps Time hierarchy FFA8 It evaluates the capability of the application to create time intelligence It consist in from a particular date create other fields like month quarter or year This set of fields are grouped in a hierarchy Particularly a time hierarchy This metric evaluates the capability of the tool to create directly time hierarchies Creating sets of data FFA9 It evaluates the capability of a tool to create sets of data During the analysis the user can be interested in analyse in more depth a set of registers Some tools lets to save these datasets and work with them Filtering data by an expression FFA10 It evaluates the capability of a tool to filter data during the analysis by expression values Filtering data by a dimension FFA11 It evaluates the capability of a tool to filter data during the analysis by dimension values Visual Perspective Linking FFA12 It evaluates the capability to link multiple images so a selection on one image shows related and relevant data in others images 30 13 14 15 16 17 18 No null data specifications FFA13 This metric evaluates if the applications have any requisite to the null values for example that null values must be noted as NULL or just with an space or by contrary tha
119. omatically generates the SQL query that is required to select the data from the database Alternatively there is the option to load data by Freeform script With Freeform script user writes his owns database queries to retrieve data from a relational database For example user can load data from a database using SQL from third party web services using XQuery from Hadoop using HiveQL always using a previous connection to the data source User can then customize how the data is imported by changing the SQL query displayed in the Editor panel User can use joins expressions aggregations and filters to define the data that he wants to load Therefore Freeform script loading let user to do a data cleansing before load data FF1 9 Moreover user can also import a dashboard and data that an Analytics Desktop or MicroStrategy Analytics Express user has shared with him During the loading process user can define a data column type as attribute or metric but only during the importing process because in the dashboard user cannot modify it FFI13 When it is defined as attribute user can decide to assign a geo role to the attribute Moreover user can also choose the data format between number text date date time big decimal email html tag phone number symbol and url FF 2 If the column s data type is Date Time or DateTime Analytics automatically generates additional time related information based on the contents of the data column For
120. ome applications had the same value for the metric the same score has to be assigned to them To clarify the current scale measurement we present an example of the metric Compilation speed which will be presented in sub chapter 3 4 The Compilation Speed is measured with a scale from 1 to 4 We assign 1 value to the tools which requires more time to compile and 4 to the tool which requires the shorter time 20 Additionally the official SOMO method involves a balance between all the characteristics because they have the same level of importance But sometimes the user wants to give more importance to certain characteristics depending on his owns interests and for that we provided the following alternative also very used as a variant of SQMO This alternative consists on assigning weights to the metrics Therefore the importance level of the metrics varies In the current project the weights were assigned by Carlos Barahona an expert user from INDRA He remarked that weights must depend on the requirements of each project However he tried to assign weights generalizing and based on his own experience managing projects Recalling that if the methodology is implemented for another use case they can be modified The used weights scale are the following 0 Not applicable to Organization 1 Possible usage feature or wish list item 2 Desired feature 3 Required or must have feature Finally final scores for sub characteristics
121. ore than one sheet in the same file he must repeat the same process as many time as there are sheets But if user knows SQL code it is advisable to type code in the Editor Script instead of repeat the same browsing through menus process many times FFI6 Usually in excel data sources there are cross tabs and QlikView has the option to import them from excel files During the loading user indicates if a table is a cross tab and he can establish the parameters of the cross tab and change their names FFT7 On the other hand connecting to multiple data sources is possible just repeating the same process for each different connection FFI9 OlikView can combine data from many different data sources with high performance regardless of how these data sources work on their own Tables from wherever data source will be charged in the memory of QlikView as simply datasets Therefore the integration of many datasources become the integration of different datasets FFI10 With the Editor Script user can clean and prepare data for the loading For example user can create new calculated fields rename fields FF117 filter data and columns FFI 4 FFI 5 assign name to the dataset FFTI8 As the loading data is based on the written code user can also insert data manually User can do almost all these functions typing code or by menus interchangeably because the Editor Script also offers menus to do almost all functions But for example filt
122. orms in the market Included in Gartnet M Q Particular organization aims lt lt Tibco Jaspersfot Tableau SAS SAP Lumira Qlikview Pyramid Analytics Panorama rir EY NA NA NA RS NA Ni NA NA Fig 6 Schema for the selection process 4 2 The 4 evaluated software In this project we evaluate four 4 tools from the short list The evaluated tools are those that according to the vision of INDRA INDRA has the major Business Intelligence unit in Spain have more projection As a cause of the amount of clients projects implementing them or because clients show interest in these applications the final four tools are QlikView Tableau MicroStrategy Analytics and SAP Lumira OlikView was designed in 1993 to generate business insight by accessing information from standard database applications and displaying their data associatively Moreover it already runs entirely in memory as a pioneer And 7 years later QlikView was focused on the BI market Because it was the more mature tool in the market running in memory associative search engine it was a very interesting tool for clients and it was evaluated MicroStrategy as a global BI platform was the most implemented tool among INDRA s clients And because of that its BI tool MicroStrategy Analytics was evaluated Tableau was the tool among all Self Service BI tools which was mentioned by more clients This tool was created with the objective of giving more emphasis to the visual d
123. os VehBrand lt levels factor VehBrand_N THHHHHHHBHHHHHHHHHHHEHT HCluster power in few categories depending on the VehBrand For each VehBrand do the power mean Hand select the means as the new levels for VehPow From freMTPL2freq gt levels factor freMTPL2freqSVehPow 1 4 5 6 7 8 9 10 11 12 13 14 15 H gt levels factor freMTPL2freqSVehBrand H 1 1 2 3 4 5 6 10 11 12 13 14 THHHHHHHHHHHHHHHHHHHEHT I length levels factor freMTPL2freqSVehBrand VehPow lt NULL for i in 1 14 pow freMTPL2freqSVehPow which freMTPL2freqsVehBrand levels factor freMTPL2freqSVehBran d i VehPow i round mean pow O THHHHHHHHHHHHHHHHHH HEBR 67 H gt VehPow H 1 6 6 6 6 6 6 9 9 7 9 7 THHHHHHHHHHHHHHHHHH AE VehType lt c compact compact familiar compact terrain familiar sport sport HDATAFRAME df_Auto lt data frame VehBrand VehPow VehType View df Auto HEXPORTATION library foreign write table df Auto C Users jorcajo Desktop Tables Auto txt sep Xt row names F terrain sport sport HH REGION TABLE HFIELDS Code c 1 52 Region c Alava Albacete Alicante Almeria Asturias Avila Badajoz Barcelona Burgos Caceres Cadiz Cantabria Castellon Ciudad Real Cordoba La Coruna Cuenca Gerona Granada Guadalajara Guipuzcoa Huelva Huesca Isl
124. ow names F 75 Annex 2 Questionnaires This annex attaches the questionnaires filled by the unique user implied in the evaluation me There 4 questionnaires one for each evaluated tool They shows the scores according to the scale of measurement established in sub chapter 3 3 1 for each metric Moreover according to the satisfaction limits established in subchapter 6 1 for the second evaluation the satisfaction score is showed Recall that the whole excel file will be attached with this thesis The column MLS refers to the scale of measurement established for each metric The column WEIGHTS refers to the weights established for each metric in sub chapter 3 3 1 The COMPENSED VALUE refers to the product of the weight and the metric s score The NORMALIZED VALUE is the satisfaction score for the metric While for sub characteristics characteristics categories they are called simply TOTAL The columns called INDICATOR take values or 0 depending on if the Metric Sub characteristic Characteristic Category is satisfied according to the satisfaction limit established in subchapter 6 1 76 MicroStrategy Analytics FIT TO PURPOSE FFI1 Direct connection to data sources A B 2 6 75 0096 T FFI2 BigData sources A E 1 8 75 00 1 FFI3 Apache Hadoop A 3 1 3 75 00 1 FFI4 Microsoft Access A 3 2
125. owledge And because of this we have done this work together with the department of Business Intelligence from INDRA S A and under the tutelage of Dr Pan Fonseca Secondly we adapted the SQMO to particular aims and finally four 4 tools were evaluated They were Tableau MicroStrategy Analytics QlikView and SAP Lumira As it has been pointed before in the BI world there are many Self Service tools making this thesis interesting within this sector because probably not all of them fulfil the requirements for all type of projects It has to take into account that the concept best tool is difficult to apply in this ambit And for this reason it is more usual to choose an appropriate solution for a particular project At this moment many consultant companies are interested in knowing which are the tool s closer to their clients requirements Particularly INDRA was interested in determine which of the four 4 evaluated tools is are closer to its clients requirements INDRA S A was also interested to apply this evaluation method on further comparisons with other Self Service BI tools It means that from this thesis can result an applicable method to determine which tool has to be chosen in each particular project Nowadays the term Business Intelligence it is also known as Business Analysis BA This change is due to the implemented techniques added in order to extract more information from business data BA is defined as the
126. pecified level of performance when used under specific conditions e Usability is the capacity of a software product to be attractive understood learned and used by the user under certain specific conditions e Efficiency is the capacity of the software product to perform adequately under specific conditions depending on the amount of resources used under stated conditions e Maintainability is the capacity of the software to be modified Modifications can include corrections improvements or adaptations of the software to adjust to changes in the environment in terms of functional requirements and specifications e Portability is the capacity of the software product to be transferred from one environment to another And the categories for the Process sub model are the followings e Client Supplier is made up of processes that have an impact on the client support the development and transition of the software to the client and give the correct operation and use of the software product or service e Engineering consists of processes that directly specify implement or maintain the software product its relations to the system and documentation on it e Support consists of processes that contain practices of a generic nature that can be used by anyone managing any kind of project or process within a primary life cycle e Management consists of processes that contain practices of a generic nature that can be used by anyone managing any ki
127. preserve the data connection formatting calculated fields and groups essentially provides a template from which to create future workbooks FFD2 Unfortunately there are not template in order to reuse them several times FFD3 Reporting Reports are done printing on PDF the work done in worksheets Before user prints there are several options that he can set to specify how the worksheet will look when it is printed User can configure the Page Setup settings and print the file with the settings applied Some settings that user can do are specifying the margins centering print scaling the layout legends FFRI FFR2 FFR3 With Tableau Server they can be shared among coworkers Languages English French German Spanish Brazilian Portuguese Japanese Korean and Simplified Chinese FILI Portability Tableau can be installed in Microsoft Windows and also Mac systems FIP1 Tableau offers the product Tableau online which is a hosted version of Tableau server with no setup required User can publish dashboards with Tableau Desktop and share them with colleagues partners or customers It Works in web browser or mobile device dashboards are automatically optimized for mobile tablets without any programming Only authorized users will be able to interact with data and dashboards published there F1P2 With the Server Edition user can get a mobile application FIP3 110 Using the project by third parts With Server Editio
128. project done by Desktop in the cloud you should load the project to Express manually MicroStrategy also offers a custom free mobile app which is built by MicroStrategy professionals in two weeks and it is for either the iPad iPhone or Android Then there are not a application available for users they must ask it to the corporation FIP3 Using the Project by third parts Analytics allows user to create MicroStrategy files mstr to share dashboards with other Analytics Desktop Analytics Express and MicroStrategy Analytics Enterprise users FIUI Data Exchange Analytics Desktop also allows user to easily and rapidly export data from particular visualizations to Excel and CSV files FIDI FID4 FID2 FID3 105 Security With Desktop edition there are not security devices But with the server one MicroStrategy offers a innovation it is called Usher Usher is an enterprise grade mobile identity platform designed to provide security for every business process and application across an enterprise It replaces traditional forms of enterprise identity such as IDs passwords and tokens with mobile identity badges securely delivered on a smart phone For example to log in the Express SaaS user must introduce a user name and password or access by Usher app FSS1 Additionally in the server edition an administrator can assign permissions to users 52 Browsing facilities There not many icons in toolbars and it facil
129. r can also merge datasets from different data sources and work with the resulting dataset Then the integration of many data sources is good thanks to the merge option FF110 Before the data loading the application displays a preview of data and user can filter data and columns FFI11 FFII4 5 Fields can not be renamed they can be renamed after the loading FF116 althouht datasets can be renamed FFI17 User cannot know the data type dimension measure assigned to fields by Lumira until after data is loaded and then he can change them 3 And in order to prevent specified columns from being proposed as Measures when data is acquired the applications uses the enrichment suggestions versionsnumber txt file to define which columns should not be proposed as Measures User can also prevent objects from being considered as Time objects or Geographical objects And the enrichment will be processed if in the Preferences settings user choose automatic detection of enrichment SAP Lumira creates automatically aggregated measures from every numeric field FFI16 94 After the data loading user can view and prepare data editing and cleaning it converting data to another data format integer biginteger double string date Boolean FFI12 duplicate columns convert cases replace values fill string values with a prefix or suffix creating geography and time hierarchies creating a measure from a column or a dimens
130. r for date parameters SAP Lumira is not the application with more functions but they are enough to do descriptive statistics FFA3 FFA4 SAP Lumira offers also predictive calculations By choosing the down arrow on a measure displayed with a date range user is able to choose a Forecast or Linear Regression Predictive Calculation type to add to the visualization along with specifying how many periods forward user wanted to predict FFA5 Moreover there is also another snap in product for SAP Lumira called SAP Predictive Analysis which can adds predictive features into SAP Lumira SAP Predictive Analysis is far more robust with regards to predictive features than the offered in SAP Lumira and moreover SAP Predictive Analysis supports the use of predictive algorithms from open source R unlike SAP Lumira FFA6 SAP Lumira offers a a plethora of data visualization types Moreover in the user guide they are very well classified depending on the type of analysis the user want to do It is one of the keys of SAP Lumira the variety of graphs FFA15 e Comparison Column Chart Column Chart with 2 Y Axes 3D Column Chart Radar Chart Area Chart Tag Cloud Heat Map Table e Percentage Pie chart Donut chart Pie with depth chart Stacked Column Chart Tree Funnel Chart e Correlation Scatter plot Scatter Matrix Chart Bubble Chart Network Chart Numeric Point Tree e Trend Line chart Line Chart with 2 Y Axes Combined Line chart Combin
131. rand Vehicle Brand 1 2 3 4 5 6 10 11 12 13 14 BonusMalus Bonus Malus 50 51 51 75 different values RiskArea Risk Area included 122 34 5 in the policy Region identification 1 2 3 52 different values Tab 15Policy table description The field PolicyID was created with values from 1 to 26 000 In that particular case this field is equal to the ClientID because we were supposing that a client only had a policy in order to ease the analysis The field ClientID let the join between Policy table and Client table In order to create the fields RecordBeg we took 26 000 random dates between 2000 01 01 and 2010 12 31 remembering that we wished data for 10 years And in order to create RecordEnd field we have supposed that the probability to leave the policy is 0 3 It means that with a probability of 0 3 we assigned the NULL date 9999 01 01 to RecordEnd components For the filled components we assigned randomly a date between one year after the RecordBeg and 2010 12 31 VehBeg field is composed by data extracted from the vehicle age field from freMTPL2freq for the 26 000 randomly selected vehicles in Auto table The values were previously modified because in freMTPL2freq the vehicle age was measured in years and we preferred to have the date when the car was building VehBrand is the field mentioned in the Auto table paragraph with 26 000 values corresponding t
132. rding to our opinion Finally the resulting Auto table Fig 9 has 11 registers and 3 columns VehBrand VehPow VehType 1 6 compact 2 6 compact 3 6 familiar 4 6 compact 5 6 terrain 6 6 familiar 10 9 sport 11 9 sport 12 7 terrain 13 8 sport 14 7 sport Fig 9 Auto table 5 3 3 Region table Region table corresponds to a dimension table including regions information and it is composed by the following fields Code Region and Population Fields are described in Tab 11 Numeric fields Field name Description Min Max Population Population 78 476 6 489 680 Categorical fields Field name Description Values 46 Region identification 1 2 3 52 Region name Alava Albacete Alicante 52 different values Tab 11 Region table description There had been supposed that regions were the 52 Spanish provinces Each value of de Code field represented a region and Population was the population of each region extracted from wikipedia Finally the resulting Region table Fig 10 had 52 registers and 3 columns Code Region Population 1 Alava 319227 2 Albacete 402318 3 Alicante 1934127 4 Almeria 702819 5 Asturias 1081487 6 Avila 172704 7 Badajoz 693921 8 Barcelona 5529099 9 Burgos 375657 10 Caceres 415446 Fig 10 Region table 5 3 4 RiskArea table RiskArea table corresponds to a dimension table including information about the type of insurance risk area and it
133. rmation he wants and build his owns reports In traditional tools user asked to a technical team for the information he needed and he ordered how information had be displayed and the technical team prepared data and built the ordered reports Against that Self Service tools are being imposed on others because the working methodology is changing from being driven by the business model to being driven by the data model The main features of Self Service tools according to INDRA S A are e Ease of use These tools are designed to be used by non technical people It means that users do not need to spend much time in learning how the tool works before doing a basic analysis e Ability to incorporate data sources both corporative data base Oracle SAP etc as local Basically excels and also external data base Twitter etc e Intelligence to interpret correctly data models As they are auto service tools and they face to many type of data model without a previous modeling by a technical team the interpretation of the model from the tool must be the correct one If it is not the correct one it can be misleading How easy is to discover that the data model is wrong and how easy is to arrange the data model are also important points to consider e Analysis functions Besides the typical pie and bar graphs they must incorporate other tools in order to get advanced analysis integration in R statistic routines always remembering the easy
134. rouped in different professional specializations Such as the complexity of the architecture behind a database in that case a relational database Which we learned during the creation of the database with R Also we learned different ways of storing data and its characteristics Moreover we learned the interests of companies in such kind of products In fact there were more metrics related to the ease of a exploratory analysis than related to the power of predictions of the tools Once we were capable of evaluating the tools we carried out the study We learned that there can be a huge structure behind an evaluation and building it can be hard task because we had to know about the area of BI and moreover we had to know how to measure them additional to all the work done to get the database While we were deciding how to measure the metrics we realize that the cutoff of satisfaction it might be subjective That is why we did the evaluation with two different satisfaction limits The first one established that a characteristic was satisfied if the 75 of its sub characteristics were satisfied While the second one established that it was satisfied if the 80 of its sub characteristics were satisfied In both cases the rest of the satisfaction limits kept constant In the first scenario we observed that Tableau and QlikView got an advanced quality level unlike SAP Lumira which got a medium quality level and MicroStrategy which was rejected
135. rovided free as part of OlikView Server FIP3 Using he Project by third parts With the QlikView Personal Edition user can only open files that he created himself On the other hand with the licenced Server edition it does not occur In fact documents created by QlikView Server can be shared with other users and they can even open projects created by Personal Editions F U1 Moreover they can be published and accessed in mobile platforms by users with permissions Data exchange Data from graphs can be copied to clipboard and pasted in several type of files for example Comma Delimited csv txt Semicolon Delimited skv txt Tab delimited tab txt FIDI FID2 Hypertext html htm FID3 XML xml Excel xls FID4 There is a direct option to export data directly in an Excel sheet Security devices With OlikView Personal Edition there is not the possibility to assign security devices to projects or data However with the Server Edition assigning permissions and passwords to users is posible by the Editor Script FSS1 FSS2 OlikView Personal Edition does not offer the option to assign permissions because it is a personal edition and documents are not shared to other users Moreover there not exist the possibilty to require a pasword 91 Learning time In few minutes user is able to create a chart following the user guide Its intuitive interface let the user to fastly adapt to it Being able
136. s 3 3 g 75 00 1 FFA3 Variety of functions 3 3 9 75 0096 1 FFA4 Descriptive statistics 3 gl 6 75 0096 1 FFAS Preduction functions A 3 2 6 75 00 T FFA6 R connection 2 2 4 50 00 2 FFA7 Geographic information A 3 2 6 75 0096 dl FFA8 Time hierarchy A 3 3 9 75 00 1 FFA9 Creating sets of data A 0 2 0 0 00 0 FFA10 Filtering data by expression A 2 3 6 50 0096 1 FFA11 Filtering data by dimension A 1 3 3 25 0096 0 FFA12 Visual Perspective Linking A O a 0 0 00 0 A FFA13 No Null data specifications 1 0 2 0 0 00 0 FFA14 Considering nulls 4 3 12 100 00 il FFA15 Variety of graphs 4 3 12 100 00 T FFA16 Modify graphs 1 3 B 25 0096 0 81 100 00 42 86 100 00 68 89 83 33 66 67 No limitations to display large amounts of data 0 0096 50 0096 75 00 0 00 75 0096 75 0096 75 00 Free design 75 00 FIL1 Languages displayed 4 2 8 100 00 1 100 00 Operating Systems 0 2 0 00 SaaS Web 3 1 75 0096 Mobile 2 25 00 INTEROPERABILITY FIU1 Using the project by third parts 75 00 A 6 1 Exportation in txt 0 0 00 0 Exportation in CSV 6 75 00 1 Exportation in HTML 0 0 0096 0 9 1 9 1 9 1 Data refresh Dashboards Exportation Templates Reports Exportation Templates m 0o n2 mn WIN N o o o o o o eee ae jo je Jo jo A al A A A Free design A A A A A i Exportation in Excel file 75 00 FSS1 Passwor
137. s FFD3 104 Reports There is not the option to build reports with MicroStrategy Analytics Reports are considered dashboards saved them as pdf FFR Then there are not templates for specific reports FFR2 And free design is the offered in dashboards FFR3 Languages MicroStrategy Analytics Desktop can be displayed in more than two languages They are Chinese Danish Dutch English French German Italian Japanese Korean Polish Portuguese Russian Spanish and Swedish FILI Portability MicroStrategy Analytics is a solution only for Windows operating systems FIPI The compatible versions are Windows Vista Business Edition SP2 on x86 or x64 Windows Vista Enterprise Edition SP2 on x86 or x64 Windows 7 Professional Edition SP1 on x86 or x64 Windows 8 all editions on x64 Windows Server 2008 Standard Edition R2 SP1 and SP2 on x64 Windows Server 2008 Enterprise Edition R2 SP1 and SP2 on x64 and Windows Server 2012 Standard on x64 MicroStrategy platform offers MicroStrategy Analytics Express which is a cloud based self service visual analytics FIP2 It is free for a year and it let user to be able to establish an account invite colleagues to connect analyze and share their data insight and do it all at no charge With Express user can easily access and explore data in Analytics using interactive visualizations Analytics Express and Analytics Desktop are not connected then if you want to have a
138. s time to carry out the evaluation The current evaluation was done only by me as an Explorer user But as it is said in sub chapter 3 3 an evaluation should be done by several users representing all the different types of users In this project it could not be possible but in order to get concluding results it should be done In order to store the scores an excel sheet with the 82 metrics was build It 1s where the evaluator have to fill the cells with the score for each of the metrics The sheet was build considering the weights and the satisfaction score established in sub chapters 3 3 1 and 3 3 2 respectively The sheet was replicated identically assigning a sheet to each application Therefore a total of 4 excel sheets were filled by the evaluator Reader can visualize them in Annex 2 although the whole file will be attached to the thesis 53 Additionally to the sheets user has to work with the particular Self Service BI applications in order to evaluate them and they must be available to him Mostly all Self Service applications offer distinct editions They usually have a Server Edition and a Personal Edition also known as Desktop Editions A Server edition is focused on companies They offer the connection of several users to a central server which access to data and process them with much power than a local computer Moreover projects and data can be easily shared between users connected to the same server On the other han
139. score is higher than the 50 of its maximum score in the measuring scale For example as our metric measuring scale is from 0 to 4 a metric is satisfied if its score is higher than 2 But the evaluator can also sentence the limit to 3 and by this way a metric is satisfied if its score is higher than 3 Usually assessments are done to determine which tools are better than others supposing that all the evaluated tools satisfy the main part of the features When the evaluator is looking for a distinction between tools these type of limits can be useful This concept is applicable to our units of measurement which are metrics sub characteristics characteristics and categories Once metrics are evaluated with their respective scales of measurement A A 1 A 2 the methodology used to determine the satisfaction score is as follows e Metrics scores are normalized with a percentage e A metric is satisfied if its percentage score is higher or equal than the fixed limit e Sub characteristics are measured by the amount of metrics satisfied satisfaction score Then a particular sub characteristic is satisfied if the amount of satisfied metrics is higher or equal than its fixed limit As weights were added the satisfaction score become n Seay SCOTY E sub characteristici Y jT 24 Where vj While w is the weight for the corresponding metric j And n corresponds to the number of metrics in the sub characteris
140. se Amazon Redshift Aster Database Cloudera Hadoop DataStax Enterprise EXA Solutions Firebird Google Analytics Google Bigquery Hortonworks Hadoop Hive HP Vertica IBM BiglInsights IBM DB2 IBM Netezza MapR Hadoop Hive and MarkLogic FFI FFI2 FFI3 Other databases with ODBC connection are compatible with Tableau On the other hand Tableau can be also connected to Microsoft Access 2003 or later FFI4 Microsoft Excel 2007 or later FFI5 Microsoft Windows Azufre Marketplace DataMarket OData Tableau Data Extract and text files FFI6 Connecting to an Excel file all sheets from the file are loaded in Tableau s memory at same time Moreover user can also load crosstabs from Excel files Tableau needs crosstabs in a specific format and to get this Tableau offers an Excel complement to normalize cross tabs in order to load the crosstab to Tableau FFI6 FFI7 The connection to different data source at same time is possible and the integration of them is easy because the relation between datasets is done automatically if the names are equal or it can be done manually by the user Then if there is a link field that connect two datasets a chart can be build with fields from different datasets FFI9 FFIIO During the loading user can visualize some rows of the dataset 1 User can also filter columns but he cannot filter rows FFI 4 FFI 5 User can also decide the data format of the fields between number decimal numb
141. se common names and weren t automatically detected When user assign a field geographic role it is marked with a globe icon It means that Tableau has automatically geo coded the information in that field and associated each value with a latitude and longitude which user can use to geo code his data If Tableau does not recognize a geographic role user can create new geographic roles and assign them to the geographic fields in data FFA7 Time hierarchies are also created automatically FFAS A particularity of Tableau is that user can create sub datasets and analyze only them FFA9 Then user can create charts only for particular datasets Moreover user can also add typical filters to a visualization and filter displayed data by dimensions and 0 11 Tableau offers interaction in a visualization but interaction between visualizations it is not possible FFA12 There are some assumptions about NULL data in files from where data is loaded For Excel csv and text files user should leave cells of data empty to represent NULL values rather than using the text NULL When data comes from a database and there are nulls they pass automatically to Tableau as nulls FFAI3 Tableau don not consider Nulls values In fact if there are null values in a field it can not be displayed in a visualization and the tool alarms about the type of problem The solutions then is filter data to display all the non values FFA 14 Th
142. sers of each type should evaluate the tools from their particular point of view There are three different profiles of user in data systems according to Inmon Imhoff amp Susa 1998 Farmers They access to information in an absolutely predictable and repetitive way We could say that they have their parcel of information and they cultivate and extract profit from this regularly They do not access to huge amount of data because they do not leave the parcel and they usually ask for aggregated data These users usually use OLAP Online Analytical 11 Processing tools which are focus on non informatics users They are simple and their main objective is the data visualization As farmers there are employers providers and customers to whom the organization offers informational services Currently Business Intelligence which promotes the use of these systems at all levels of the organization allows business users to use data and information in business processes naturally without having to leave their applications Explorer At the contrary of farmers explorers have totally unpredictable and irregular accesses They spend much time planning and preparing their studies and when they have everything ready they start to explore a lot of such detailed information as possible They really do not know exactly what are looking until they find it and in any case the results are guaranteed However sometimes they find something really interest
143. several values in a field are selected QlikView makes a split second association showing only values in other fields associated with the current selection Simultaneously sheet objects holding one or several general expressions are calculated to show the result of the current selection For example there exist interaction between charts when user select some values in a chart automatically another chart will only show values associated with the selection This fact eases discovery relations between fields and it is key in data discovery science FFA 2 On the other hand creating new data sets is useful to analyze directly particular samples in the same workbook but OlikView has not the option to do create them Similarly OlikView analizes particular datasets using its visual perspective linking FFA9 Due to the same reason OlikView have not got filters User filters data using the interaction between sheet objects FFAIO FFA11 Moreover QlikView also offers the option to lock sheet objects in order to not being modified due the interaction OlikView offers a corporative complement GeoQlik which is a GIS component for Geo Business Intelligence within QlikView It offers normalized GIS formats as ShapeFile PostGIS Oracle Spatial Oracle Locator Esri spatial databases virtual globes Google Maps OpenStreetMap all kinds of geometries rasters and Web services and also csv files with the coordinates It gives much power to QlikView
144. share with other people to visualize and interact with the results Templates FFD2 It evaluates the capability to fix a schema dashboard or access to templates in order to use it several times with different types of data It is a useful feature to homogenize projects Free design FFD3 It measures the ability to let the user to build dashboards with total freedom Some tools have limited options to building dashboards while others lets the user to insert text format it inserting images vi Reporting This sub characteristic includes several metrics to measure reporting capabilities of a tool 1 3 Reports Exportation FFRI It evaluates the diversity of formats to export reports Some formats are excel spread sheet PDF files HTML files Flash file the own tool format Templates FFR2 It evaluates the capability to fix a schema report or access to templates in order to use it several times with different data It is a useful feature to gain consistency when the user builds the same type of report periodically Free design FFR3 It measures the ability to let the user to build reports with total freedom Some tools have limited options to building dashboards while others lets the user to insert text format it insert images 2 Interoperability characteristic This characteristic includes several sub characteristics in order to evaluate the capability of an application to work with other organizations and systems 3
145. t the user can define how are the null values represented in the data source Considering nulls FFA14 This metric measures if applications consider null values as another value Considering null as other value might be useful because the user can visualize the behaviour of null data and then detect a pattern for them This metric also evaluates if null values are skipped from a calculated expression Variety of graphs FFA15 It measures the diversity of graphs offered by the application Modifying graphs FFA16 It measures the capability to modify the default setting of graphs For example if there is the possibility to change levels of a legend change colours change the shapes of markers It is an important characteristic because sometimes it is the key to understand a data pattern Huge amount of data FFA17 It measures the capability to display huge amount of data Particularly it measures the capability to display datasets without any data problem because of its size Data refresh FFA18 It measures the capability to update data automatically For example if data are modified in the original file some applications update automatically the data while in others the user must do it manually v Dashboard This sub characteristic includes several metrics to measure the capabilities of a tool relating to dashboards 1 Dashboards exportation FFD1 It evaluates the capability of the tool to export the dashboard in order to
146. tal BI related software license revenue annually plus 15 year over year in new license growth In the case of vendors that also supply transactional applications show that its BI platform is used routinely by organizations that do not use its transactional applications Had a minimum of 35 customer survey responses from companies that use the vendor s BI platform in productions With this added non technical features they guaranty that at least 35 companies use each one of the tools Moreover they guaranty that companies which are growing year over year use these tools That s why INDRA is interested in these particular tools because of their popularity and as a consultant they want to be up to date on this area And the medium list obtained is 9 Alteryx Birst Board International Datawatch GoodData IBM Cognos Information Builder Logi Analytics Microsoft BI 10 MicroStrategy MicroStrategy Visual Insight 11 Open Text Actuate 12 Oracle 13 Panorama 14 Pentaho 15 Prognoz 16 Pyramid Analytics 17 Qlik QlikView 18 Salient Management Company 19 SAP SAP Lumira 20 SAS SAS Business Analytics 21 Tableau 22 Targit 23 Tibco Software 24 Yellowfin 38 Finally to build the short list we focus on the particular aims of the organizational unit The particular tools that we wanted to evaluate are Self Service BI tools and it means that the business user should be able to analyze the info
147. terdate lt c Sinisterdate Sinisterdate 2 nsinisters j lt sum s total number of sinisters for policy i in year j SinistersXYear i nsinisters if sum nsinisters 0 Sinisterdate lt c Sinisterdate as Date 9999 01 01 Sinisterdate lt Sinisterdate 2 length Sinisterdate Delete the first value of Sinisterdate DATAFRAME SINISTERSXYEAR df_SinistersXYear lt data frame ClientID SinistersXYear View df SinistersXYear HEXPORTATION SINISTERSXYEAR write table df SinistersXYear C Users jorcajo Desktop Tables SinistersXYear txt sep t row names F HOTHER FIELDS HCode S I lt length Sinisterdate 5 lt 0 1 Code O lt 0 1 probs rep 1 52 52 total lt rep 0 N for i in 1 N 72 total i lt sum SinistersXYearl i HCode O SINISTERS HCode O sinisters is needed to create Code S j lt 1 for i in 1 N if total i 0 for k in O total i 1 4 Code _sinisters j k lt Code_O i j lt j k 1 if total i 0 Code sinisters j Code O i j lt j 1 probs lt NULL HCode S is equal to Code with a probability of 0 7 set seed 12342 for i in 1 1 for j in 1 52 if Code sinisters i 2Code j f probs 1 j 1 0 3 51 probs j 0 7 probs j 1 52 lt 0 3 51 probs lt probs 1 52 Code S i sample Code 1 prob probs PolicylD s rep 0 l RiskArea_s lt rep 0 1 Guarantees s rep O 1 set see
148. teristics are satisfied Characteristic Tool lll oiikview FIT TO PURPOSE INTEROPERABILITY SECURTIY Fig 24 Characteristic results for a second evaluation It is because the Portability sub characteristic is not satisfied as a consequence of QlikView only works on Windows operating systems F P and it does not offer an available SaaS Software as a service edition FIP2 59 Then Tableau reached an advanced quality level And it can be considered the most appropriate tool for the established requirements Although recalling that the results cannot be interpreted as concluding because only a user has done the evaluation There is detailed information about the evaluation in Annex 3 60 7 Conclusions This project had the purpose of building an assessment of Self Service BI tools and evaluate in particular 4 tools MicroStrategy Analytics QlikView SAP Lumira and Tableau In order to build the assessment an existing quality model was taken as a reference It was the Systemic Quality Model SQMO developed by the Universidad Sim n Bol var Venezuela We adapted this model to our particular aims and then we established the metrics with the help of a BI expert To evaluate all the metrics established by the expert we had to learn many new concepts of the Bl area We came from a different area of the BI world and we had to learn the basics of all this concepts and we also learned that many of this concepts are g
149. tic e Characteristics are measured by the amount of satisfied sub characteristics satisfaction score Then a particular characteristic is satisfied if the amount of satisfied sub characteristics is higher or equal than its fixed limit e Categories are measured by the amount of satisfied characteristics satisfaction score Then a particular category is satisfied if the amount of satisfied characteristics is higher or equal than its fixed limit In the current evaluation we decided to use the following limits in order to get distinctions between tools lif the metric j is satisfied 0 if the metric j is not satisfied Limit for metric 5096 Limit for sub characteristic 50 Limit for Characteristic 7596 Limit for Category 7596 Tab 7 Satisfaction limits The evaluator can decide to modify the levels in the case that any distinction exist between tools or to be more restrictive or unrestrictive 25 3 4 Sub characteristics and metrics for Self Service Bl tools evaluation In a evaluation the most important step is to decide which characteristics must be evaluating According to the SOMO schema showed in Fig 3 these characteristics were already agreed but we had to establish the metrics related to each characteristic By the experience in BI department and after working with these type of tools we felt able to decide which particular topics must be checked from Self Service BI software For
150. unctionality Usability Efficiency Quality level Qlik View Satisfied Satisfied Satisfied Advanced SAP Lumira Satisfied Satisfied No satisfied Medium Tableau Satisfied Satisfied Satisfied Advanced Fig 22 Quality levels depending on satisfied categories The particular case Then QlikView and Tableau offeres an advanced quality level while SAP Lumira has a medium quality level In order to get differences between QlikView and SAP Lumira the fixed levels for satisfaction are increased being more restrictive Particularly we use the following levels Limit for metric 50 Limit for sub characteristic 50 Limit for Characteristic 80 Limit for Category 75 Tab 18 Satisfaction Limits for a second evaluation By this way a characteristic becomes satisfied if only the 80 of its sub characteristics are satisfied And as it can be seen in Fig 23 only Tableau satisfy the Functionality category 58 unlike QlikView which does not because only the 66 67 of its functional characteristics are satisfied Tool MicroStrategy Qlikview SAP Lumira Tableau FUNCTIONALITY USUABILITY EFFICIENCY o 9096 Fig 23 Category results for a second evaluation In fact as it its seen in Fig 24 OlikView does not satisfy the Functionality category because the nteroperability characteristic is not satisfied in fact it has a score of 7596 meaning that only the 75 of the Interoperability sub charac
151. ures based on FFA2 dimensions 3 B 9 75 0096 1 FFA3 Variety of functions A 3 3 9 75 00 1 FFA4 Descriptive statistics A 3 2 6 75 00 1 FFAS Preduction functions A 3 2 6 75 00 1 FFA6 R connection 3 2 6 75 00 1 FFA7 Geographic information 3 2 6 75 00 1 FFA8 Time hierarchy A 3 3 9 75 00 1 FFA9 Creating sets of data A 3 2 0 00 O FFA10 Filtering data by expression 3 3 o 75 00 1 FFA11 Filtering data by dimension A 3 2 6 75 00 1 FFA12 Visual Perspective Linking A O 2 0 0 00 O A FFA13 No Null data specifications 1 0 2 0 0 00 O FFA14 Considering nulls A 0 3 0 00 O FFA15 Variety of graphs A 3 3 9 75 00 1 FFA16 Modify graphs A 4 3 12 100 00 1 83 90 00 100 00 62 50 79 55 100 00 1 100 00 No limitations to display large amounts of data 75 0096 100 00 75 00 0 00 75 00 75 00 0 00 Free design 75 00 FIL1 Languages displayed 4 2 8 100 00 1 100 00 Operating Systems 4 2 100 00 SaaS Web 1 75 0096 Mobile 2 25 00 INTEROPERABILITY FIUL Using the project by third parts 75 00 Y Exportation in txt Exportation in CSV Exportation in HTML Exportation in Excel file FSS1 Password protection 75 00 SECURTIY FSS2 Permissions 75 0096 100 00 25 00 Data refresh Dashboards Exportation Templates Reports Exportation Templates o o o o jojojo o le le e A al A A A Free design A A A A
152. use e Possible integration with corporative systems and efficiency Usually the user will work with huge volume of data and therefore the analysis cannot be in a local PC Tools should have the option of a central server which access to data and process them Big companies as INDRA S A clients needs security when the server is incorporated to the corporative environment And then the role of an administrator to manage the user s access is key for big companies e Support In the case that an open source tool was included in the larger list it will not be considered in the medium list because open source cannot offer an instantly customer support In open source there are communities of users who can help others in their problems and for INDRA as a big company and as a company who offers they workers as a service the customer support is very important and must be fast Therefore these five 5 features characterize the particular aims of the organization for the tools to be evaluated with the adapted SQMO And then from the medium list the short list includes only tools which by our point of view satisfy the mentioned features and they are 1 MicroStrategy Visual Insight from MicroStrategy platform 2 Panorama 3 Pyramid Analytics 4 QlikView tool from Qlik platform 5 SAP Lumira from SAP platform 6 SAS Business Analytics from SAS platform 7 Tableau 39 8 Tibco Jaspersfot from Tibco Software Platform N All BI platf
153. vailability of consulting services UES4 It measures if the company offers consulting services 5 Free formation UES5 It evaluates if the platform offers free formation for users 6 Community UES It evaluates if there exist a community to ask for doubts or to share knowledge with other users 2 Graphical interface characteristic This characteristic evaluates the graphical interface of the tool 1 Windows and mouse interface This sub characteristic evaluates the windows interface and the mouse functions 1 Editing elements by double clicking UGWI It measures if the tool offers editing elements by double clicking 2 Dragging and dropping elements UGW2 It measures the capability of the tool in dragging and dropping elements 33 ii Display This sub characteristic refers to a unique metric about the capability of editing the screen layout 1 Editing the screen layout UGD1 It measures the capability of a tool to edit the screen layout 3 Operability characteristic This characteristic evaluates the ability of the tool to keep the system and the tool in reliable functioning conditions i Versatility This sub characteristic evaluates the versatility of the tool 1 Automatic update UOVI It measures if the tool is automatically updated when new versions appears 3 4 3 Efficiency category 1 Execution performance characteristic This characteristic is composed by sub characteristics which evaluates the executio
154. vedra Salamanca Edge size Navarra Toledo Count Sinisterid Guadalajara Valladolid Burgos Cantabria Badajoz Cuenca Alava Avila Edge color Zaragoza Albacete Percent of Count Sinisterid Zamora Barcelona Tenerife Caceres Tarragona Ceuta Sevilla Cordoba Melilla Granada Lugo Huesca La Rioja Fig 30 Example of a Network chart built by MicroStrategy Analyitics Page by Month of Year of Si The current map marker shows the amount of sinisters in each region in November Madrid and Barcelona differs from other regions It is usual because they are the regions with more population E Skikda Algiers Pala s v aga tome f Fig 31 Example of a Map chart built by MicroStrategy Analyitics 114 Page by Month of Year of S The current map marker shows the amount of sinisters in each region in Agus Madrid and da differs from other regions It is usual that Madrid differs from others because it has more population The behaviour of behaviour is particular PalmaDe Mallorca Q Valencia Fig 32 Example of a Map chart built by MicroStrategy Analyitics DrivYears 12 Row Count_Sinisters Fig 33 Example of a k means classification plot done by the previous connection of MicroStrategy Analytics to R 115 Fig 34 Example of a Heat Map built by QlikView Fig 35 Example of a Radar Map built by OlikView 116 Fig 36 Example of a Radar Map
155. ver usually SAP Lumira shows the advice There are too many data points to visualize maximum is 10 000 Filter the data to reduce the number of data points It is a big con when user is displaying many data Although it is a default parameter and user can increase it also considering increasing the virtual memory allocated to SAP Lumira However there exists a option to solve the problem but in my opinion the default maximum number of data points is much low for this type of tool It totally pales in comparison to Tableau that can render 60 million points FFA17 97 Dashboards Dashboards can be exported and share with other users FFD1 There are not templates integrated in the application FFD2 Moreover user can design himself dashboards with totally freedom FFD3 Reporting Reports can be exported to SAP Lumira Cloud SAP Lumira Server and SAP Business Objects BI platform Moreover they cam be exported to pdf file FFR1 There are templates that fix only the distribution of the charts but user has totally free to customize them FFR2 There are different options to report the results Boards pages where user can add interactive charts filter and input controls Infographic where user can add chart visual properties pictograms shapes images and text and Report pages where user can add interactive charts sections and input controls Input controls let user filter data in a comfortable way When user is building the report
156. way of storing big data nowadays Microsoft Access FFI4 It evaluates the capability to connect to Microsoft Access database Excel files FFI5 It evaluates the capability to load data from Excel files From an Excel file load data from all sheets at the same time FFI6 It evaluates the capability to load data from all sheets at the same time In some applications user must do the same data loading process for each one of the sheets while other tools lets user to choose which sheets he wishes to load and import them at the same time Cross tabs FFI7 It measures the capability of loading data from cross tabs in Excel files Usually applications need cross tabs in a specific format and some of them have an excel complement to normalize the cross tabs before importing it Plain text FFIS It evaluates the capability of loading data from plain text files txt inf 80 dat tmp prv hlp htm etc Connecting to different data source at the same time FFI9 It evaluates the capability to connect the application to several data sources at the same time in order to do cross analysis between data from them Easy integration of many data sources FFII0 It evaluates how easy is for the user integrate many data source in the data analysis Showing data before the data loading 1 It evaluates the capability to show data before the data loading Showing data can be useful for the user to understand how data are before loa
157. xperienced people in the area Moreover the evaluator has to provide every item required to do the evaluation questionnaires data applications Finally the evaluator collects the questionnaires and proceeds to evaluate the results according to the chosen methodology In this particular thesis the used methodology consists in the adaption of a Systemic Quality Model SQMO which is a model to evaluate software The Systemic Quality Model and its adaption is explained in detail in chapter 3 On the other hand in chapter 4 there is the method used to select the applications which can be evaluated Finally in order to evaluate particular tools data and a questionnaire which should be provided to users were built A database called 20141220 Initial test was simulated and it is explained in chapters 5 Moreover the R scripts built to simulated it are in Annex 1 The questionnaire resulting on the adaption of the SOMO is in Annex 2 Finally the answered questionnaires were analyzed and the results are explained in chapter 6 Finally in Annex 7 there some graphs built by Self Service BI tools in order to introduce them to the reader Recalling that the first step in a assessment is to know the area of use and in this particular case it is the Business Intelligence area For this reason the terminology used in the thesis can be specific from the BI area And reading the following sub chapter 3 2 is recommended to understands the t
158. y Adaptability Consistent Installation capability Parameterized Co existence Encapsulated Replacement capability Cohesive Portability Specified Documented Auto descriptive No redundant Auditing Quality management Data Quality both dimensions Tab 1 Characteristics for Product sub model 15 Characteristics Category Process Effectiveness Process Efficiency Acquisition System or Software Supply Customer product supplit Requirements determination Operation Engineering Development Maintenance of software and systems Quality assurance Documentation Joint review Configuration management Support Auditing Verification Solving problems Validation Joint review Auditing Solving problems Management Management Quality management Project management Management Risk management Quality management Risk management Organizational Alignment Establishment of the process Management of change Process evaluation Organizational Process improvement Process improvement Measurement HHRR management Reuse Infrastructure Tab 2 Characteristics for Process sub model 3 1 4 Level 3 Metrics Each characteristic has a group of metrics to be evaluated They are the evaluable attributes of the product and the process and they are not agreed because they vary depending on each study case Metrics are defined for our particular case in sub chapter 3 4 3 2 Algorithm The algorithm to measure the syst
159. ystemic total quality 1996 It also considers the balance between the Process and Product sub models proposed by Humphrey 1997 These sub models are based on the Product and Process Quality models from Ortega P rez amp Rojas 2000 and P rez Rojas Mendoza amp Grim n 2001 respectively Moreover the product quality categories are based on the work of Dromey 1996 and the international standard ISO IEC 9126 JTC 1 SC 7 1991 And the process categories are extracted from the international standard ISO IEC 15504 ISO IEC TR 15504 2 1998 18 Some authors as Kitcheman Kitchenman 1996 have pointed out that when characteristics are complex can be divided into a set of some simpler and a new level sub characteristics can be created In that particular case sub characteristics have been considered in order to gain clarity In order to adapt the SQMO to each particular case there must be decided which sub model is considered Product Process or both which dimension Efficiency or and Effectiveness which sub characteristics for each characteristic and which respective metrics should be evaluated In the current evaluation only the Product sub model of SQMO was considered The Process sub model is excluded because our intent is to evaluate the fully already developed tools as a future tool used for the BI workforce For this reason Process sub model is not considered Moreover only Effectiveness dimension was considered

Download Pdf Manuals

image

Related Search

Related Contents

KM125A Service Manual  Worldwide Lighting W83093C39 Installation Guide  Operational Safety Guideline Air Compressor  LB30 - user manual  guía del usuario  Eine Bedienungsanleitung finden Sie hier!  Multimedia system for vehicle with portable dash pad  照明ユニット  

Copyright © All rights reserved.
Failed to retrieve file