NetKarmaGUSH Adaptor Tool User Manual V2.5, August 26, 2011
Contents
1. <netkarma:notificationType>WORKFLOW_INVOKED</netkarma:notificationType>
    <netkarma:notificationPartType>DATA_BLOCK</netkarma:notificationPartType>
    <netkarma:dataBlocks>
      <netkarma:dataId>config param</netkarma:dataId>
      <netkarma:dataType>BLOCK</netkarma:dataType>
      <netkarma:dataValue>
        <netkarma:uriInfo>
          <netkarma:identifier>port</netkarma:identifier>
          <netkarma:type>URN</netkarma:type>
        </netkarma:uriInfo>
        <netkarma:selectMethod>
          <netkarma:selectionType>SUBSTRING</netkarma:selectionType>
          <netkarma:substring>
            <netkarma:argumentNumber>3</netkarma:argumentNumber>
            <netkarma:beginIndex>24</netkarma:beginIndex>
          </netkarma:substring>
        </netkarma:selectMethod>
        <netkarma:argumentNumber>3</netkarma:argumentNumber>
      </netkarma:dataValue>
    </netkarma:dataBlocks>
    <netkarma:notificationTime>
      <netkarma:timestampLocator>DERIVED</netkarma:timestampLocator>
      <netkarma:timestamp>
        <netkarma:selectionType>COMPLETE_STRING</netkarma:selectionType>
        <netkarma:completeString>
          <netkarma:argumentNumber>2</netkarma:argumentNumber>
        </netkarma:completeString>
      </netkarma:timestamp>
    </netkarma:notificationTime>
  </netkarma:notification>
</netkarma:ruleset>

7 dep
2. <netkarma:notification>
    <netkarma:notificatonId>1</netkarma:notificatonId>
    <netkarma:notificationType>WORKFLOW_INVOKED</netkarma:notificationType>
    <netkarma:notificationPartType>DATA_BLOCK</netkarma:notificationPartType>
    <netkarma:dataBlocks>
      <netkarma:dataId>config param</netkarma:dataId>
      <netkarma:dataType>BLOCK</netkarma:dataType>
      <netkarma:dataValue>
        <netkarma:uriInfo>
          <netkarma:identifier>port</netkarma:identifier>
          <netkarma:type>URN</netkarma:type>
        </netkarma:uriInfo>
        <netkarma:selectMethod>
          <netkarma:selectionType>SUBSTRING</netkarma:selectionType>
          <netkarma:substring>
            <netkarma:argumentNumber>3</netkarma:argumentNumber>
            <netkarma:beginIndex>24</netkarma:beginIndex>
          </netkarma:substring>
        </netkarma:selectMethod>
        <netkarma:argumentNumber>3</netkarma:argumentNumber>
      </netkarma:dataValue>
    </netkarma:dataBlocks>
    <netkarma:notificationTime>
      <netkarma:timestampLocator>DERIVED</netkarma:timestampLocator>
      <netkarma:timestamp>
        <netkarma:selectionType>COMPLETE_STRING</netkarma:selectionType>
        <netkarma:completeString>
          <netkarma:argumentNumber>2</netkarma:argumentNumber>
        </netkarma:completeString>
      </netkarma:timestamp>
    </netkarma:notificationTime>
  </netkarma:notification>
</netkarma:ruleset>
3. <netkarma:readToken>
          <netkarma:argumentNumber>3</netkarma:argumentNumber>
          <netkarma:delimiter> </netkarma:delimiter>
          <netkarma:maxTokens>2</netkarma:maxTokens>
          <netkarma:tokenNumbers>2</netkarma:tokenNumbers>
        </netkarma:readToken>
      </netkarma:simpleSelection>
    </netkarma:complexSelection>
  </netkarma:value>
</netkarma:dependencyData>
<!-- Dependency process -> experiment output -->
<netkarma:dependency>
  <netkarma:sourceNotificationId>7</netkarma:sourceNotificationId>
  <netkarma:targetNotificationId>3</netkarma:targetNotificationId>
  <netkarma:sourceActorType>INVOKEE</netkarma:sourceActorType>
  <netkarma:targetActorType>PRODUCER</netkarma:targetActorType>
  <netkarma:matchRule>
    <netkarma:matchLineType>PREV_NTH_LINE</netkarma:matchLineType>
    <netkarma:matchLineNum>1</netkarma:matchLineNum>
    <netkarma:matchDataName>PID</netkarma:matchDataName>
    <netkarma:matchDataValue>
      <netkarma:selectionMechanism>SIMPLE</netkarma:selectionMechanism>
      <netkarma:simpleSelection>
        <netkarma:selectionType>READ_TOKEN</netkarma:selectionType>
        <netkarma:readToken>
          <netkarma:argumentNumber>3</netkarma:argumentNumber>
          <netkarma:delimiter> </netkarma:delimiter>
          <netkarma:maxTokens>2</netkarma:maxTokens>
          <netkarma:tokenNumbers>2</netkarma:tokenNumbers>
        </netkarma:readToken>
      </netkarma:simpleSelection>
    </netkarma:matchDataValue>
  </netkarma:matchRule>
</netkarma:dependency>
4. <!-- Dependency data HOST -->
<netkarma:dependencyData>
  <netkarma:name>HOST</netkarma:name>
  <netkarma:filter>
    <netkarma:argumentNumber>0</netkarma:argumentNumber>
    <netkarma:argumentValue>process block cc</netkarma:argumentValue>
    <netkarma:comparator>EQUALS</netkarma:comparator>
  </netkarma:filter>
  <netkarma:filter>
    <netkarma:argumentNumber>3</netkarma:argumentNumber>
    <netkarma:argumentValue>Client has notified</netkarma:argumentValue>
    <netkarma:filterPredicate>AND</netkarma:filterPredicate>
    <netkarma:comparator>CONTAINS</netkarma:comparator>
  </netkarma:filter>
  <netkarma:value>
    <netkarma:selectionMechanism>COMPLEX</netkarma:selectionMechanism>
    <netkarma:complexSelection>
      <netkarma:simpleSelection>
        <netkarma:selectionType>READ_TOKEN</netkarma:selectionType>
5. />
    <element name="entitySubtype" type="netkarma:actorEntityEnumSubtype" minOccurs="0"/>
    <element name="annotations" type="netkarma:annotationType" minOccurs="0" maxOccurs="unbounded"/>
    <element name="parent" type="netkarma:parentType" minOccurs="0" maxOccurs="unbounded"/>
    <element name="timestep" type="xsd:int" minOccurs="0"/>
  </sequence>
</complexType>
<complexType name="dataType">
  <sequence>
    <element name="dataId" type="xsd:string"/>
    <element name="appendid" type="netkarma:appenderType" minOccurs="0" maxOccurs="unbounded"/>
    <element name="dataType" type="netkarma:dataEnumType"/>
    <element name="dataValue" type="netkarma:provenanceDataType"/>
    <element name="annotations" type="netkarma:annotationType" minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
</complexType>
<complexType name="annotationType">
  <sequence>
    <element name="property" type="xsd:string"/>
    <element name="value" type="netkarma:provenanceDataType"/>
  </sequence>
</complexType>
<complexType name="filterType">
  <sequence>
    <element name="argumentNumber" type="int"/>
    <element name="argumentValue" type="string"/>
    <element name="filterPredicate" type="netkarma:filterPredicateEnumType" minOccurs="0"/>
    <element name="comparator" type="netkarma:comparatorEnumType"/>
  </sequence>
</complexType>
6. <!-- Dependency link pid -> process -->
<netkarma:dependencyLink>
  <netkarma:linkType>SEQUENTIAL</netkarma:linkType>
  <netkarma:source>PID</netkarma:source>
  <netkarma:target>PROCESS</netkarma:target>
</netkarma:dependencyLink>
<!-- Global Dependencies -->
<netkarma:globalDependency>
  <netkarma:sourceNotificationId>4</netkarma:sourceNotificationId>
  <netkarma:targetNotificationId>7</netkarma:targetNotificationId>
  <netkarma:sourceActorType>INVOKEE</netkarma:sourceActorType>
  <netkarma:targetActorType>INVOKER</netkarma:targetActorType>
</netkarma:globalDependency>
</netkarma:karmaNotifications>

Appendix D Sample Log File

serviceInvoked sea ice processing 20101117181942 level1 http d2i org amsreprovenance iu filename cron l http d2i org amsreprovenance iu filename daily 20101117181942 invoked cronl is the main orchestrator of the workflow for Sea ice processing none 20091002 FINALfnonefnonefSea Ic
dataProduced sea ice processing 20101117181942 level1 http d2i org amsreprovenance iu filename daily 1 http d2i org amsreprovenance iu filename clientServiceID 1203084381 dataProduced4 type empty file produced for santa processing var tmp daily 17618 std 14 none none
serviceInvoked sea ice processing 20101117181942 level1 http d2i org amsreprovenance iu filename daily 1 http d2i org amsreprovenance iu filename santa 20101117181944 invoked Da
7. Appendix C Sample Rule File
Appendix D Sample Log File

1 Introduction

Karma Adaptor is one of the collection tools in the Karma provenance collection toolkit; it harvests provenance from log files. It uses a rule file specific to an application to map raw data into Karma-specific provenance events. The provenance is stored in a relational database and can be visualized through various plugins. Researchers can use the provenance data to analyze their data, to suspend and resume an experiment, and to find references to the details and data collected in an experiment.

This document describes (1) how to write a new rule file based on the rule set semantics for generating Karma-specific provenance events, and (2) an application that uses the adaptor methodology to parse application log files and ingest provenance events into a Karma provenance repository.

2 Software Dependencies

Karma Adaptor v2.5 has been tested with the following software packages, on which it depends. These packages need to be installed separately.

2.1 Installation Dependencies

1. Apache ANT v1.6 or higher, for building the tool from source: http://ant.apache.org
2. Java Development Kit (JDK) v5 or v6: http://java.sun.com

2.2 Service Dependencies

Figure 1 shows how Karma Adaptor fits in the Karma Proven
8. <complexType name="provenanceDataType">
  <sequence>
    <element name="uriInfo" type="netkarma:uriType"/>
    <element name="selectMethod" type="netkarma:selectMethodType"/>
    <element name="argumentNumber" type="int"/>
  </sequence>
</complexType>
<complexType name="parentType">
  <sequence>
    <element name="parentId" type="netkarma:provenanceDataType"/>
    <element name="parentType" type="netkarma:parentEnumType"/>
  </sequence>
</complexType>
<complexType name="uriType">
  <sequence>
    <element name="identifier" type="xsd:string"/>
    <element name="type" type="netkarma:uriEnumType"/>
  </sequence>
</complexType>
<!-- karma notification element -->
<complexType name="notificationType">
  <sequence>
    <element name="notificatonId" type="xsd:int"/>
    <element name="notificationType" type="netkarma:notificationEnumType"/>
    <element name="notificationPartType" type="netkarma:notificationPartEnumType"/>
    <element name="actors" type="netkarma:actorsType" minOccurs="0" maxOccurs="unbounded"/>
    <element name="dataBlocks" type="netkarma:dataType" minOccurs="0" maxOccurs="unbounded"/>
    <element name="annotations" type="netkarma:annotationType" minOccurs="0" maxOccurs="unbounded"/>
    <element name="notificationTime" type="netkarma:timestampType" minOccurs="0"/>
  </sequence>
</complexType>
<complexTy
9. netkarma:argumentNumber>[filter argnum]</netkarma:argumentNumber>
    <netkarma:argumentValue>[value to compare]</netkarma:argumentValue>
    <netkarma:comparator>[EQUALS|CONTAINS]</netkarma:comparator>
  </netkarma:filter>
  <netkarma:notification>
    <netkarma:notificatonId>[unique notification id]</netkarma:notificatonId>
    <netkarma:notificationType>[notification type]</netkarma:notificationType>
    <netkarma:notificationPartType>[subtype]</netkarma:notificationPartType>
    <netkarma:subtype>[subtype type]</netkarma:subtype>
    <netkarma:notificationTime>
      <netkarma:timestampLocator>[derivation type]</netkarma:timestampLocator>
      <netkarma:timestamp>[timestamp type]</netkarma:timestamp>
    </netkarma:notificationTime>
  </netkarma:notification>
</netkarma:ruleset>

Appendix B Ruleset Schema

<?xml version="1.0" encoding="UTF-8"?>
<schema elementFormDefault="qualified"
    targetNamespace="http://www.dataandsearch.org/netkarma"
    xmlns:netkarma="http://www.dataandsearch.org/netkarma"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.w3.org/2001/XMLSchema">
<!-- TYPE DEFINITIONS -->
<simpleType name="notificationEnumType">
  <restriction base="xsd:string">
    <enumeration value="INVOKING_SERVICE"/>
    <enumeration value="SERVICE_
10. sequence>
</complexType>
<complexType name="readPropertyOfType">
  <sequence>
    <element name="argumentNumber" type="int"/>
    <element name="key" type="string"/>
    <element name="delimiter" type="string" minOccurs="0"/>
  </sequence>
</complexType>
<complexType name="readArgumentOfType">
  <sequence>
    <element name="argumentNumber" type="int"/>
    <element name="methodName" type="string"/>
    <element name="paramDelimiter" type="string" minOccurs="0"/>
    <element name="paramList" type="int" minOccurs="1" maxOccurs="unbounded"/>
  </sequence>
</complexType>
<complexType name="readTokenType">
  <sequence>
    <element name="argumentNumber" type="int"/>
    <element name="delimiter" type="string"/>
    <element name="maxTokens" type="int"/>
    <element name="tokenNumbers" type="int" minOccurs="1" maxOccurs="unbounded"/>
  </sequence>
</complexType>
<!-- MAJOR DEFINITIONS -->
<complexType name="actorsType">
  <sequence>
    <element name="actorId" type="netkarma:provenanceDataType"/>
    <element name="appendid" type="netkarma:appenderType" minOccurs="0" maxOccurs="unbounded"/>
    <element name="actorType" type="netkarma:actorEnumType"/>
    <element name="entityType" type="netkarma:actorEntityEnumType"/>
    <element name="onBehalfOf" type="netkarma:provenanceDataType" minOccurs="0"
11. Building and Executing the Adaptor

This section describes how to build and use the Adaptor code when the Karma Server is hosted as a standalone server using the RabbitMQ messaging system.

5.1 Build

Both the JAVA_HOME and ANT_HOME environment variables should be set before building the Adaptor code using ANT:

ant karma adaptor

5.2 Execution

The environment parameters for executing the provenance capture script should be set in the configuration file adaptor stdenvs cfg, located in the adaptor home directory:

vi adaptor stdenvs cfg

The ADAPTOR_HOME and JAVA_HOME paths have to be set in the configuration file:

ADAPTOR_HOME: absolute path to the adaptor home directory (the directory the adaptor code is extracted to)
JAVA_HOME: your Java home

The script that parses log files and ingests provenance events into the Karma repository is executed as follows from within the adaptor home directory:

./provenance collector sh -l <path to logfile>

where <logfile> is the name of the log file to be used for provenance collection. To override the existing rule files with a custom rule file, execute the script as:

./provenance collector sh -l <path to logfile> -r <path to rulefile>

where <rulefile> is the name of the custom rule file to be used.

Sample execution provenance I logfile karma virtual m
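The build prerequisite from section 5.1 can be checked before invoking ANT. The following is a minimal Python sketch of such a check; the helper name `missing_build_env` is hypothetical and is not part of the Karma Adaptor distribution.

```python
import os


def missing_build_env(env=None):
    """Return the build-related variables that are unset or empty.

    Hypothetical helper: it only looks for the two variables named in
    section 5.1 (JAVA_HOME and ANT_HOME).
    """
    env = os.environ if env is None else env
    return [name for name in ("JAVA_HOME", "ANT_HOME") if not env.get(name)]


missing = missing_build_env()
if missing:
    print("Set before building:", ", ".join(missing))
else:
    print("JAVA_HOME and ANT_HOME are set")
```

Passing a dictionary instead of relying on `os.environ` makes the check easy to exercise without touching the real environment.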
12. INVOKED"/>
    <enumeration value="INVOKING_WORKFLOW"/>
    <enumeration value="WORKFLOW_INVOKED"/>
    <enumeration value="DATA_PRODUCED"/>
    <enumeration value="DATA_CONSUMED"/>
    <enumeration value="DATA_SEND_STARTED"/>
    <enumeration value="DATA_RECEIVE_STARTED"/>
    <enumeration value="DATA_SEND_FINISHED"/>
    <enumeration value="DATA_RECEIVE_FINISHED"/>
  </restriction>
</simpleType>
<simpleType name="actorEnumType">
  <restriction base="string">
    <enumeration value="INVOKER"/>
    <enumeration value="INVOKEE"/>
    <enumeration value="PRODUCER"/>
    <enumeration value="CONSUMER"/>
    <enumeration value="SENDER"/>
    <enumeration value="RECEIVER"/>
  </restriction>
</simpleType>
<simpleType name="dataEnumType">
  <restriction base="string">
    <enumeration value="FILE"/>
    <enumeration value="BLOCK"/>
  </restriction>
</simpleType>
<simpleType name="timestampLocatorType">
  <restriction base="string">
    <enumeration value="DERIVED"/>
    <enumeration value="FILE"/>
    <enumeration value="HEADER"/>
    <enumeration value="FOOTER"/>
  </restriction>
</simpleType>
<simpleType name="instanceLocatorType">
  <restriction base="string">
    <enumeration value="DERIVED"/>
    <enumeration value="FILE"/>
    <enumeration value="HEADER"/>
  </restriction>
</si
13. NetKarmaGUSH Adaptor Tool User Manual V2.5, August 26, 2011

Data to Insight Center, Indiana University
Pervasive Technology Institute

Copyright © 2011 The Trustees of Indiana University

This document contains instructions for building and installing the Karma Adaptor Tool v2.5, which provides the core capability to derive, capture, and store provenance events from experiment logs into the Karma provenance repository and return results.

Karma Adaptor Tool is licensed under the Apache License, Version 2.0 (the "License"): http://www.apache.org/licenses/LICENSE-2.0

The code is copyrighted, and the copyright is owned by The Trustees of Indiana University. The Adaptor tool is a product of the Data to Insight Center at Indiana University. See http pti iu edu d2i provenance for more information.

Contents

1 Introduction
2 Software Dependencies
2.1 Installation Dependencies
2.2 Service Dependencies
3 Configuring Adaptor Properties
4 Writing a new rule file
5 Building and Executing the Adaptor
5.1 Build
5.2 Execution
6 Visualization
Appendix A Rulefile Skeleton
Appendix B Ruleset Schema
14. The default value here is KarmaKey.

The following 2 properties are used to select the default rule files for gush. These 2 properties should NOT be changed unless required.

karma adaptor experiment rulefile: ruleset file defining the rules for experiment log files
karma adaptor cmdline rulefile: ruleset file defining the rules for log files produced using gush shell commands

4 Writing a new rule file

The rule file is a mapping document that specifies a set of rules for converting raw textual information in a log file into a set of provenance events as managed by Karma. The rule file is an XML file defined by a rule set schema (snippet below) that identifies and maps the raw data. The full rule set schema can be found in Appendix B.

<element name="karmaNotifications">
  <complexType>
    <sequence>
      <element name="project" type="string"/>
      <element name="argDelimiter" type="string"/>
      <element name="maxArgs" type="int" minOccurs="0"/>
      <element name="instanceID" type="Adaptor:instanceType" minOccurs="0"/>
      <element name="startTime" type="Adaptor:timestampType" minOccurs="0"/>
      <element name="ruleset" type="Adaptor:rule" minOccurs="1" maxOccurs="unbounded"/>
      <element name="dependencyData" type="Adaptor:dependencyDataType" minOccurs="0" maxOccurs="unbounded"/>
      <element name="dependency" type="Adaptor:dependencyType" minOccurs="0" maxOccurs="unbounded"/>
      <element name="dependencyLink" type="Adaptor:dependencyLin
15. a:complexSelection>
  </netkarma:value>
</netkarma:dependencyData>

8. dependency: Matching rules used to redefine events with modified data. This element defines a complex rule that replaces the name of a process, obtained by parsing a log line when mapping data into a notification, with a value derived from another notification for the same process. A sample dependency rule is shown below.

<!-- Dependency process -> experiment output -->
<netkarma:dependency>
  <netkarma:sourceNotificationId>7</netkarma:sourceNotificationId>
  <netkarma:targetNotificationId>3</netkarma:targetNotificationId>
  <netkarma:sourceActorType>INVOKEE</netkarma:sourceActorType>
  <netkarma:targetActorType>PRODUCER</netkarma:targetActorType>
  <netkarma:matchRule>
    <netkarma:matchLineType>PREV_LINE</netkarma:matchLineType>
    <netkarma:matchLineNum>1</netkarma:matchLineNum>
    <netkarma:matchDataName>PID</netkarma:matchDataName>
    <netkarma:matchDataValue>
      <netkarma:selectionMechanism>SIMPLE</netkarma:selectionMechanism>
      <netkarma:simpleSelecti
16. achine 15556 1288050231 txt
Connecting to Server
Using rulefile home workspace Karma Adaptor deps app desc ruleset xml
Creating Notifications
Applying dependency rules
Ingesting Notifications
Number of Notifications 56
Workflow Instance ID urn tool gush karma virtual machine 15556 1288050231e6d0c01b-d0f4-4cc6-91ab-9d6a1b6d572e
Time to queue notifications for ingestion 19.814 secs

The Workflow Instance ID printed to standard output is the unique identifier used to extract complete provenance graphs from the Karma repository with the query clients and visualization tools.

6 Visualization

Provenance retrieval and visualization plugins for Cytoscape, used to visualize provenance graphs, can be downloaded from the following website: http pti iu edu d2i provenance karma

Appendix A Rulefile Skeleton

Described below is a basic skeleton of a rule file. The skeleton shows the mandatory elements required for creating a rule file as per the descriptions above. The values in square brackets have to be substituted either with application-specific constants or with element type definitions defined in the XML schema described above.

<netkarma:project>[project name]</netkarma:project>
<netkarma:argDelimiter>[argument delimiter]</netkarma:argDelimiter>
<netkarma:ruleset>
  <netkarma:hasDuplicates>[true|false]</netkarma:hasDuplicates>
  <netkarma:filter>
    <
17. ance Toolkit. Karma Adaptor requires two servers, which are used for ingesting provenance data into a provenance repository:

1. Karma Service: Karma is a standalone tool that can be added to existing cyberinfrastructure to collect and represent provenance data. The derived provenance events are sent to the Karma service through either an enterprise messaging bus (RabbitMQ) or a web service interface. The Karma Adaptor collection tool currently uses the RabbitMQ interface to ingest provenance events into the database, for a highly reliable and scalable system. The URL for downloading Karma is http pti iu edu d2i provenance karma. If you already have the Karma service hosted on a server, no additional Karma dependency or download is required to use the Karma Adaptor.

2. RabbitMQ: A messaging system that the Adaptor tool uses to send messages to Karma. The RabbitMQ client is included in the Karma Adaptor, so as long as RabbitMQ is loaded on the server hosting the Karma service, there is no additional dependency for the adaptor itself.

[Figure 1: Karma Adaptor in Karma Provenance Toolkit. Diagram labels: Rule file, Notifications Queued, RabbitMQ Message Bus, Notifications Ingested, Karma Service, Provenance Events, Karma Repository.]

3 Configuring Adaptor Properties

Unzip the karma adaptor 2 5 tar gz which contains the Karma adaptor and prove
18. :argumentNumber>
          <netkarma:beginIndex>8</netkarma:beginIndex>
        </netkarma:substring>
      </netkarma:simpleSelection>
    </netkarma:fileInstanceLocator>
  </netkarma:instanceID>
  <netkarma:startTime>
    <netkarma:timestampLocator>FILE</netkarma:timestampLocator>
    <netkarma:fileTimestampLocator>
      <netkarma:selectionMechanism>SIMPLE</netkarma:selectionMechanism>
      <netkarma:simpleSelection>
        <netkarma:selectionType>LAST_NCHAR</netkarma:selectionType>
        <netkarma:lastNChar>
          <netkarma:argumentNumber>1</netkarma:argumentNumber>
          <netkarma:numChars>10</netkarma:numChars>
        </netkarma:lastNChar>
      </netkarma:simpleSelection>
    </netkarma:fileTimestampLocator>
  </netkarma:startTime>
  <!-- data block part of a Workflow Invoked notification in Karma -->
  <netkarma:ruleset>
    <netkarma:hasDuplicates>true</netkarma:hasDuplicates>
    <netkarma:filter>
      <netkarma:argumentNumber>0</netkarma:argumentNumber>
      <netkarma:argumentValue>gush cc</netkarma:argumentValue>
      <netkarma:comparator>EQUALS</netkarma:comparator>
    </netkarma:filter>
    <netkarma:filter>
      <netkarma:argumentNumber>3</netkarma:argumentNumber>
      <netkarma:argumentValue>Gush constructor port</netkarma:argumentValue>
      <netkarma:filterPredicate>AND</netkarma:filterPredicate>
      <netkarma:comparator>CONTAINS</netkarma:comparator>
    </netkarma:filter>
19. elimiter> </netkarma:argDelimiter>

3. maxArgs: Number of arguments each line of the log file should be parsed into. See sample below.

<netkarma:maxArgs>4</netkarma:maxArgs>

4. instanceID: Contains the rule that assigns a unique instance ID to each workflow execution. See sample below.

<netkarma:instanceID>
  <netkarma:instanceLocator>FILE</netkarma:instanceLocator>
  <netkarma:fileInstanceLocator>
    <netkarma:selectionMechanism>SIMPLE</netkarma:selectionMechanism>
    <netkarma:simpleSelection>
      <netkarma:selectionType>SUBSTRING</netkarma:selectionType>
      <netkarma:substring>
        <netkarma:argumentNumber>1</netkarma:argumentNumber>
        <netkarma:beginIndex>8</netkarma:beginIndex>
      </netkarma:substring>
    </netkarma:simpleSelection>
  </netkarma:fileInstanceLocator>
</netkarma:instanceID>

5. startTime: Defines the rule used to obtain the start time, if available. See sample below.

<netkarma:startTime>
  <netkarma:timestampLocator>FILE</netkarma:timestampLocator>
  <netkarma:fileTimestampLocator>
    <netkarma:selectionMechanism>SIMPLE</netkarma:selectionMechanism>
    <netkarma:simpleSelection>
      <netkarma:selectionType>LAST_NCHAR</netkarma:selectionType>
      <netkarma:lastNChar>
        <netkarma:argumentNumber
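The argument splitting and the SUBSTRING and LAST_NCHAR selections described above can be sketched in Python. This is an illustrative reading of the rule semantics, not the adaptor's implementation. The function names, the sample log line, and the `|` delimiter are hypothetical. Argument numbers and beginIndex are assumed to be zero-based (the sample filters use argumentNumber 0 through 3 with maxArgs 4), and maxArgs is assumed to cap the number of pieces, with any leftover text staying in the last argument.

```python
def split_arguments(line, arg_delimiter, max_args=None):
    """Split a log line into arguments (assumed argDelimiter/maxArgs semantics)."""
    if max_args is None:
        return line.split(arg_delimiter)
    # str.split(sep, n) performs at most n splits, giving at most n + 1 pieces.
    return line.split(arg_delimiter, max_args - 1)


def select_substring(args, argument_number, begin_index):
    """SUBSTRING: text of one argument from begin_index to its end (zero-based, assumed)."""
    return args[argument_number][begin_index:]


def select_last_n_chars(args, argument_number, num_chars):
    """LAST_NCHAR: the final num_chars characters of one argument."""
    text = args[argument_number]
    return text[max(len(text) - num_chars, 0):]


# Hypothetical log line and delimiter, for illustration only.
line = "gush.cc|INFO|1288050231|Gush constructor, port: 15556"
args = split_arguments(line, "|", max_args=4)
print(args)
print(select_substring(args, 3, 24))
print(select_last_n_chars(args, 2, 10))
```

Keeping each selection type as a small function over the parsed argument list mirrors how the rule file names an argumentNumber first and then applies one selection to it.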
20. endencyData: Optional rules that identify data which may depend on other information in the log file, used to redefine entities and relationships of initially generated events.

<netkarma:dependencyData>
  <netkarma:name>HOST</netkarma:name>
  <netkarma:filter>
    <netkarma:argumentNumber>0</netkarma:argumentNumber>
    <netkarma:argumentValue>process block cc</netkarma:argumentValue>
    <netkarma:comparator>EQUALS</netkarma:comparator>
  </netkarma:filter>
  <netkarma:filter>
    <netkarma:argumentNumber>3</netkarma:argumentNumber>
    <netkarma:argumentValue>Client has notified</netkarma:argumentValue>
    <netkarma:filterPredicate>AND</netkarma:filterPredicate>
    <netkarma:comparator>CONTAINS</netkarma:comparator>
  </netkarma:filter>
  <netkarma:value>
    <netkarma:selectionMechanism>COMPLEX</netkarma:selectionMechanism>
    <netkarma:complexSelection>
      <netkarma:simpleSelection>
        <netkarma:selectionType>READ_TOKEN</netkarma:selectionType>
        <netkarma:readToken>
          <netkarma:argumentNumber>3</netkarma:argumentNumber>
          <netkarma:delimiter> </netkarma:delimiter>
          <netkarma:maxTokens>2</netkarma:maxTokens>
          <netkarma:tokenNumbers>2</netkarma:tokenNumbers>
        </netkarma:readToken>
      </netkarma:simpleSelection>
    </netkarm
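The filter and READ_TOKEN behaviour used in the dependencyData sample above can be sketched as follows. This is a hedged Python illustration under stated assumptions, not the adaptor's code: the class and function names are hypothetical, the filters are treated as joined by AND (as in the sample), argument numbers are assumed zero-based, token numbers are assumed one-based, and the log values and the `:` delimiter are invented because the delimiter is not legible in the sample. Stripping whitespace from tokens is also an added convenience.

```python
from dataclasses import dataclass


@dataclass
class Filter:
    argument_number: int  # index into the parsed arguments (zero-based, assumed)
    argument_value: str   # text to compare against
    comparator: str       # "EQUALS" or "CONTAINS", as in comparatorEnumType


def filter_matches(flt, args):
    """Apply one filter to the parsed arguments of a log line."""
    actual = args[flt.argument_number]
    if flt.comparator == "EQUALS":
        return actual == flt.argument_value
    if flt.comparator == "CONTAINS":
        return flt.argument_value in actual
    raise ValueError(f"unknown comparator: {flt.comparator}")


def all_filters_match(filters, args):
    """Treat the filters as joined by AND, as in the sample above."""
    return all(filter_matches(f, args) for f in filters)


def read_token(args, argument_number, delimiter, max_tokens, token_numbers):
    """READ_TOKEN: split one argument into at most max_tokens tokens and
    return the requested ones (token numbers assumed to be one-based)."""
    tokens = args[argument_number].split(delimiter, max_tokens - 1)
    return [tokens[n - 1].strip() for n in token_numbers]


# Hypothetical parsed log line; the real gush log format may differ.
args = ["process_block.cc", "INFO", "1288050231",
        "Client has notified: node7.example.org"]
filters = [
    Filter(0, "process_block.cc", "EQUALS"),
    Filter(3, "Client has notified", "CONTAINS"),
]
if all_filters_match(filters, args):
    host = read_token(args, 3, ":", max_tokens=2, token_numbers=[2])
    print(host)
```

A line is only mapped when every filter accepts it, so the selection step can assume the argument it reads has the expected shape.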
21. />
    <element name="selectMethod" type="netkarma:selectMethodType"/>
  </choice>
</complexType>
<complexType name="timestampType">
  <sequence>
    <element name="timestampLocator" type="netkarma:timestampLocatorType"/>
    <choice>
      <element name="timestamp" type="netkarma:selectMethodType"/>
      <element name="fileTimestampLocator" type="netkarma:selectMethodType"/>
      <element name="headerTimestampLocator" type="netkarma:selectMethodType"/>
      <element name="footerTimestampLocator" type="netkarma:selectMethodType"/>
    </choice>
  </sequence>
</complexType>
<complexType name="instanceType">
  <sequence>
    <element name="instanceLocator" type="netkarma:instanceLocatorType"/>
    <choice>
      <element name="derivedInstanceLocator" type="netkarma:provenanceDataType"/>
      <element name="fileInstanceLocator" type="netkarma:selectMethodType"/>
      <element name="headerInstanceLocator" type="netkarma:selectMethodType"/>
    </choice>
  </sequence>
</complexType>
<complexType name="dependencyType">
  <sequence>
    <element name="sourceNotificationId" type="int"/>
    <element name="targetNotificationId" type="int"/>
    <element name="sourceActorType" type="netkarma:actorEnumType"/>
    <element name="targetActorType" type="netkarma:actorEnumType"/>
    <element name="matchRule" type="netkarma:matchRuleType" minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
22. ilyl1l Invocation to santa to find all the input brightness tempraturefiles none none non
dataConsumed sea ice processing 20101117181942 level1 http d2i org amsreprovenance iu filename santa l http d2i org amsreprovenance iu filename clientserviceID 20101117181944 dataConsumed file6 makes a list of the the brightness temperature files to be used http d2i org amsreprovenance iu envnam env pm 10 none none
serviceInvoked sea ice processing 20101117181942 level1 http d2i org amsreprovenance iu filename daily 1 http d2i org amsreprovenance iu filename L3 20101117181944 invoked Daily5 1L33 var tmp daily 17618 std 14 none none
dataConsumed sea ice processing 20101117181942 level1 http d2i org amsreprovenance iu filename L3 1 http d2i org amsreprovenance iu filename clientServiceID 20101117181944 dataConsumed file5 maskfile consumed by L3 ftp ops science data level3 seaice12 AMSR E L3 SeaIce12km V12 20090930 hdf 12 none none
dataProduced sea ice processing 20101117181942 level1 http d2i org amsreprovenance iu filename L3 1 http d2i org amsreprovenance iu filename clientServiceID 20101117182916 dataProduced1 final output produced by L3 the 12km products AMSR E L3 SeaIce12km V12 20091002 hdf 15 none none
dataProduced sea ice processing 20101117181942 level1 http d2i org amsreprovenance iu filename L3 1 http d2i org amsreprovenance iu filename clientServiceID 20101117182916 dataProduced2 final ou
23. kType" minOccurs="0" maxOccurs="unbounded"/>
      <element name="globalDependency" type="Adaptor:dependencyType" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>

The main elements for mapping raw data into provenance events are defined under the karmaNotifications element, as shown above. Each ruleset element inside the karmaNotifications element parses log lines to derive provenance events for an instance of a workflow and, in terms of OPM (the Open Provenance Model), defines process(es), artifact(s), agent(s), annotation(s), or a combination of any or all of these.

The rule file consists of three major parts:

i. Rules for parsing the log files and associating each instance of a workflow execution with a log file. The XML elements in the rule file are project, argDelimiter, maxArgs, instanceID, and startTime.
ii. Rules for mapping textual data into Karma provenance events. The XML element is ruleset.
iii. Rules for manipulating provenance events to enrich information representation. These elements are dependencyData, dependencyLink, dependency, and globalDependency.

Each element is described in detail below.

1. project: Name of the project, for example the name of the workflow engine. See sample below.

<netkarma:project>gush</netkarma:project>

2. argDelimiter: Delimiter that separates arguments for parsing. An example argument delimiter is shown below.

<netkarma:argD
24. lns="http://www.dataandsearch.org/netkarma"
    xmlns:netkarma="http://www.dataandsearch.org/netkarma"
    xsi:schemaLocation="http://www.dataandsearch.org/netkarma deps notification_ruleset xsd">
  <netkarma:project>gush</netkarma:project>
  <netkarma:argDelimiter> </netkarma:argDelimiter>
  <netkarma:maxArgs>4</netkarma:maxArgs>
  <!-- instanceID or WorkflowID -->
  <netkarma:instanceID>
    <netkarma:instanceLocator>FILE</netkarma:instanceLocator>
    <netkarma:fileInstanceLocator>
      <netkarma:selectionMechanism>SIMPLE</netkarma:selectionMechanism>
      <netkarma:simpleSelection>
        <netkarma:selectionType>SUBSTRING</netkarma:selectionType>
        <netkarma:substring>
          <netkarma:argumentNumber>1</netkarma
25. mpleType>
<simpleType name="actorEntityEnumType">
  <restriction base="string">
    <enumeration value="USER"/>
    <enumeration value="WORKFLOW"/>
    <enumeration value="SERVICE"/>
    <enumeration value="METHOD"/>
  </restriction>
</simpleType>
<simpleType name="parentEnumType">
  <restriction base="string">
    <enumeration value="USER"/>
    <enumeration value="WORKFLOW"/>
    <enumeration value="SERVICE"/>
  </restriction>
</simpleType>
<simpleType name="actorEntityEnumSubtype">
  <restriction base="string">
    <enumeration value="CONTROLLER"/>
    <enumeration value="HUMAN_PROXY"/>
  </restriction>
</simpleType>
<simpleType name="comparatorEnumType">
  <restriction base="string">
    <enumeration value="EQUALS"/>
    <enumeration value="CONTAINS"/>
  </restriction>
</simpleType>
<simpleType name="notificationPartEnumType">
  <restriction base="string">
    <enumeration value="ACTOR"/>
    <enumeration value="DATA_BLOCK"/>
    <enumeration value="ANNOTATION"/>
    <enumeration value="ALL"/>
  </restriction>
</simpleType>
<simpleType name="uriEnumType">
  <restriction base="string">
    <enumeration value="URL"/>
    <enumeration value="URN"/>
  </restriction>
</simpleType>
<simpleType name="filterPredicateEnumType">
  <restriction base="string">
26. nance collection client to parse experiment logs and ingest provenance into the Karma repository, as follows:

tar xvzf karma adaptor 2 5 tar gz

This will create a directory named Karma Adaptor, which we refer to in the remainder of this manual as adaptor home.

The adaptor depends on a number of properties for correct execution. The derivation rules for converting raw data into Karma-specific provenance events are described in a rule file with definite semantics. A default template rule file (ruleset xml) is present in the deps directory and may need to be modified for the target application. For gush logs there are two generic rule files, chosen according to how an application is launched using gush: cmdline ruleset xml is used for gush logs generated by shell commands, whereas app desc ruleset xml is used for experiments that use an application description XML. In most cases gush users do not have to specify their own rule files; the adaptor automatically selects the most suitable one. Table 1 summarizes the classification. In general, each application should have a specific rule file for parsing information from all log files generated by that application.

Table 1: Rule file classification

Rule file               Types of log files
app desc ruleset xml    Gush logs generated using application description XML
cmdline ruleset xml     Gush logs generated using shell commands

The distribution package contains a prope
27. on>
<netkarma:selectionType>READ_TOKEN</netkarma:selectionType>
<netkarma:readToken>
  <netkarma:argumentNumber>3</netkarma:argumentNumber>
  <netkarma:delimiter></netkarma:delimiter>
  <netkarma:maxTokens>2</netkarma:maxTokens>
  <netkarma:tokenNumbers>2</netkarma:tokenNumbers>
</netkarma:readToken>
</netkarma:simpleSelection>
</netkarma:matchDataValue>
</netkarma:matchRule>
</netkarma:dependency>

9. dependencyLink: Creates links between defined dependency data elements. See sample below.

<netkarma:dependencyLink>
  <netkarma:linkType>SEQUENTIAL</netkarma:linkType>
  <netkarma:source>PID</netkarma:source>
  <netkarma:target>PROCESS</netkarma:target>
</netkarma:dependencyLink>

10. globalDependency: Optional rules to redefine the overall structure of the provenance graph. See sample below.

<netkarma:globalDependency>
  <netkarma:sourceNotificationId>4</netkarma:sourceNotificationId>
  <netkarma:targetNotificationId>7</netkarma:targetNotificationId>
  netkarma sourceActor xnetkarma targetActor1 b rype INVOK ype gt INVOK P I E netkarma sourceActorType R netkarma targetActorType
</netkarma:globalDependency>

A simplified rule file is described in Appendix C. A detailed sample rule file (ruleset.xml) and the XML schema (ruleset.xsd) defining the rule semantics are present in the deps directory.
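
As an illustration of how a dependencyLink rule might be consumed, the following is a minimal Python sketch that reads dependencyLink elements from a rule file and checks the link type against the two values listed in the schema (SEQUENTIAL and DIRECT). The namespace URI, the DependencyLink class, and the read_dependency_links function are hypothetical names introduced for this example; the adaptor's own parser may work differently.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical namespace URI for illustration; use the URI declared in your rule file.
NS = {"netkarma": "urn:example:netkarma"}

# Link types listed in dependencyLinkEnumType in the rule schema.
ALLOWED_LINK_TYPES = {"SEQUENTIAL", "DIRECT"}

SAMPLE_RULES = """
<netkarma:karmaNotifications xmlns:netkarma="urn:example:netkarma">
  <netkarma:dependencyLink>
    <netkarma:linkType>SEQUENTIAL</netkarma:linkType>
    <netkarma:source>PID</netkarma:source>
    <netkarma:target>PROCESS</netkarma:target>
  </netkarma:dependencyLink>
</netkarma:karmaNotifications>
"""


@dataclass(frozen=True)
class DependencyLink:
    link_type: str
    source: str
    target: str


def read_dependency_links(xml_text):
    """Return every dependencyLink directly under the root element."""
    root = ET.fromstring(xml_text)
    links = []
    for element in root.findall("netkarma:dependencyLink", NS):
        link_type = element.findtext("netkarma:linkType", default="", namespaces=NS).strip()
        source = element.findtext("netkarma:source", default="", namespaces=NS).strip()
        target = element.findtext("netkarma:target", default="", namespaces=NS).strip()
        if link_type not in ALLOWED_LINK_TYPES:
            raise ValueError(f"unsupported linkType: {link_type!r}")
        links.append(DependencyLink(link_type, source, target))
    return links


if __name__ == "__main__":
    for link in read_dependency_links(SAMPLE_RULES):
        print(link)
```

Rejecting unknown link types early keeps a typo in the rule file from silently producing an incomplete provenance graph.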
28. pe name="rule">
  <sequence>
    <element name="hasDuplicates" type="boolean"/>
    <element name="isDistinct" type="boolean" minOccurs="0"/>
    <element name="filter" type="netkarma:filterType" minOccurs maxOccurs="unbounded"/>
    <element name="notification" type="netkarma:notificationType"/>
  </sequence>
</complexType>
<element name="karmaNotifications">
  <complexType>
    <sequence>
      <element name="project" type="string"/>
      <element name="argDelimiter" type="string"/>
      <element name="maxArgs" type="int" minOccurs="0"/>
      <element name="instanceID" type="netkarma:instanceType" minOccurs="0"/>
      <element name="startTime" type="netkarma:timestampType" minOccurs="0"/>
      <element name="ruleset" type="netkarma:rule" minOccurs="1" maxOccurs="unbounded"/>
      <element name="dependencyData" type="netkarma:dependencyDataType" minOccurs="0" maxOccurs="unbounded"/>
      <element name="dependency" type="netkarma:dependencyType" minOccurs="0" maxOccurs="unbounded"/>
      <element name="dependencyLink" type="netkarma:dependencyLinkType" minOccurs="0" maxOccurs="unbounded"/>
      <element name="globalDependency" type="netkarma:dependencyType" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
  </complexType>
</element>
</schema>

Appendix C Sample Rule File

<netkarma:karmaNotifications xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xm
29. rties file, karma adaptor.properties, which can be found in the adaptor home config directory. Use this sample file to configure Karma for your deployment environment. The properties defined in this file are explained below.

The following 5 properties are used to connect to a RabbitMQ server. If a RabbitMQ instance already exists, contact the system administrator for details.

messaging.username: RabbitMQ username.
messaging.password: RabbitMQ password.
messaging.hostname: Name of the host that runs the RabbitMQ server.
messaging.hostport: Port number of the RabbitMQ server.
messaging.virtualhost: The virtual host is created for administrative purposes. Each connection, and all channels inside it, must be associated with a single virtual host. Each virtual host comprises its own namespace, a set of exchanges, message queues, and all associated objects. The default value is /.

The following 3 properties configure how notifications are sent to the Karma server.

messaging.exchangename: A message routing agent. It can be durable, temporary, or auto-deleted; our system uses durable. Each message is delivered to each qualifying queue. The default value is KarmaExchange.
messaging.queuename: Named weak FIFO buffer. The default value is KarmaQueue.
messaging.routingkey: Our implementation uses the direct exchange type. The same routing key is used on both the publisher and subscriber sides.
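
The property list above can be checked before the adaptor is started. The following is a minimal Python sketch, assuming the properties file uses Java-style key = value lines with dotted keys such as messaging.hostname; the key spelling, the sample values, and the load_properties and messaging_settings helpers are assumptions made for this example, not part of the adaptor. Only the KarmaExchange and KarmaQueue defaults are taken from the descriptions above.

```python
# Placeholder values for illustration only; replace with your deployment's settings.
SAMPLE_PROPERTIES = """\
# RabbitMQ connection (5 properties)
messaging.username = example_user
messaging.password = example_password
messaging.hostname = localhost
messaging.hostport = 5672
messaging.virtualhost = /
# Notification routing (exchange and queue names fall back to the documented defaults)
messaging.routingkey = example-key
"""

REQUIRED_KEYS = (
    "messaging.username",
    "messaging.password",
    "messaging.hostname",
    "messaging.hostport",
    "messaging.virtualhost",
    "messaging.routingkey",
)


def load_properties(text):
    """Parse simple key = value lines, skipping blanks and # or ! comments."""
    props = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, separator, value = line.partition("=")
        if not separator:
            continue  # ignore lines without a key/value separator
        props[key.strip()] = value.strip()
    return props


def messaging_settings(props):
    """Validate required keys and apply the documented defaults."""
    missing = [key for key in REQUIRED_KEYS if key not in props]
    if missing:
        raise KeyError("missing properties: " + ", ".join(missing))
    return {
        "username": props["messaging.username"],
        "password": props["messaging.password"],
        "hostname": props["messaging.hostname"],
        "hostport": int(props["messaging.hostport"]),
        "virtualhost": props["messaging.virtualhost"],
        "exchangename": props.get("messaging.exchangename", "KarmaExchange"),
        "queuename": props.get("messaging.queuename", "KarmaQueue"),
        "routingkey": props["messaging.routingkey"],
    }


if __name__ == "__main__":
    settings = messaging_settings(load_properties(SAMPLE_PROPERTIES))
    for name, value in settings.items():
        print(name, "=", value)
```

Because the same routing key must be used on the publisher and subscriber sides, the sketch treats messaging.routingkey as required rather than guessing a default.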
30. >1</netkarma:argumentNumber>
<netkarma:numChars>10</netkarma:numChars>
</netkarma:lastNChar>
</netkarma:simpleSelection>
</netkarma:fileTimestampLocator>
</netkarma:startTime>

6. ruleset: Set of rules to map raw data into provenance events compatible with Karma. Each of these rulesets corresponds to deriving an entity (process, artifact, annotation, agent) in OPM and a part or full notification in Karma. See sample ruleset below.

<!-- data block part of a Workflow Invoked notification in Karma -->
<netkarma:ruleset>
  <netkarma:hasDuplicates>true</netkarma:hasDuplicates>
  <netkarma:filter>
    <netkarma:argumentNumber>0</netkarma:argumentNumber>
    <netkarma:argumentValue>gush cc</netkarma:argumentValue>
    <netkarma:comparator>EQUALS</netkarma:comparator>
  </netkarma:filter>
  <netkarma:filter>
    <netkarma:argumentNumber>3</netkarma:argumentNumber>
    <netkarma:filterPredicate>AND</netkarma:filterPredicate>
    <netkarma:comparator>CONTAINS</netkarma:comparator>
    <netkarma:argumentValue>Gush constructor port</netkarma:argumentValue>
  </netkarma:filter>
  <netkarma:notification>
    <netkarma:notificatonId>1</netkarma:notificatonId>
31. </complexType>
<complexType name="matchRuleType">
  <sequence>
    <element name="matchLineType" type="netkarma:matchLineEnumType"/>
    <element name="matchLineNum" type="int"/>
    <element name="matchDataName" type="string"/>
    <element name="matchDataValue" type="netkarma:selectMethodType"/>
  </sequence>
</complexType>
<complexType name="dependencyDataType">
  <sequence>
    <element name="name" type="string"/>
    <element name="filter" type="netkarma:filterType" minOccurs="1" maxOccurs="unbounded"/>
    <element name="value" type="netkarma:selectMethodType"/>
  </sequence>
</complexType>
<complexType name="dependencyLinkType">
  <sequence>
    <element name="linkType" type="netkarma:dependencyLinkEnumType"/>
    <element name="source" type="string"/>
    <element name="target" type="string"/>
  </sequence>
</complexType>
<complexType name="simpleSelectionType">
  <sequence>
    <element name="selectionType" type="netkarma:selectMethodEnumType"/>
    <choice>
      <element name="completeString" type="netkarma:completeType"/>
      <element name="substring" type="netkarma:substringType"/>
      <element name="readPropertyOf" type="netkarma:readPropertyOfType"/>
      <element name="readArgumentOf" type="netkarma:readArgumentOfType"/>
      <element name="readToken" type="netkarma:readTokenType"/>
      <element name="cons
32. <enumeration value="AND"/>
    <enumeration value="OR"/>
  </restriction>
</simpleType>
<simpleType name="selectMethodEnumType">
  <restriction base="string">
    <enumeration value="COMPLETE_STRING"/>
    <enumeration value="SUBSTRING"/>
    <enumeration value="READ_PROPERTY_OF"/>
    <enumeration value="READ_ARGUMENT_OF"/>
    <enumeration value="READ_TOKEN"/>
    <enumeration value="CONSTANT"/>
    <enumeration value="LAST_NCHAR"/>
  </restriction>
</simpleType>
<simpleType name="selectMechanismEnumType">
  <restriction base="string">
    <enumeration value="SIMPLE"/>
    <enumeration value="COMPLEX"/>
  </restriction>
</simpleType>
<simpleType name="autoIncrementEnumType">
  <restriction base="string">
    <enumeration value="AUTO_INCREMENT"/>
  </restriction>
</simpleType>
<simpleType name="matchLineEnumType">
  <restriction base="string">
    <enumeration value="PREV_NTH_LINE"/>
    <enumeration value="NEXT_NTH_LINE"/>
  </restriction>
</simpleType>
<simpleType name="dependencyLinkEnumType">
  <restriction base="string">
    <enumeration value="SEQUENTIAL"/>
    <enumeration value="DIRECT"/>
  </restriction>
</simpleType>
<complexType name="appenderType">
  <choice>
    <element name="autoIncrement" type="netkarma:autoIncrementEnumType"
33. tant" type="netkarma:constantType"/>
      <element name="lastNChar" type="netkarma:lastNCharType"/>
    </choice>
  </sequence>
</complexType>
<complexType name="complexSelectionType">
  <sequence>
    <element name="simpleSelection" type="netkarma:simpleSelectionType" minOccurs="2" maxOccurs="unbounded"/>
  </sequence>
</complexType>
<complexType name="selectMethodType">
  <sequence>
    <element name="selectionMechanism" type="netkarma:selectMechanismEnumType"/>
    <choice>
      <element name="simpleSelection" type="netkarma:simpleSelectionType"/>
      <element name="complexSelection" type="netkarma:complexSelectionType"/>
    </choice>
  </sequence>
</complexType>
<complexType name="lastNCharType">
  <sequence>
    <element name="argumentNumber" type="int"/>
    <element name="numChars" type="int"/>
  </sequence>
</complexType>
<complexType name="completeType">
  <sequence>
    <element name="argumentNumber" type="int"/>
  </sequence>
</complexType>
<complexType name="constantType">
  <sequence>
    <element name="constantValue" type="string"/>
  </sequence>
</complexType>
<complexType name="substringType">
  <sequence>
    <element name="argumentNumber" type="int"/>
    <element name="beginIndex" type="int"/>
    <element name="endIndex" type="int" minOccurs="0"/>
34. tput produced by L3 the 6km products AMSR E L3 Sealce6km V12 20091002 hdf 15 none none addAnnotations sea ice processing 20101117181942 1evell file 1 AMSR E L3 Sealce6km V12 20091002 hdf 1203084381 add annotation Algorithm name NT2 none none non 20101117181942 1evell file 1 AMSR E L3 Sealce6 km V12 20091002 hdf 1203084381 add annotation Algorithm name SnowDepth none none non addAnnotations sea ice processing 20101117181942 1evell file 1 AMSR E L3 Sealce6km V12 20091002 hdf 1203084381 add annotation Algorithm name BBA none none non servicelnvoked sea ice processing 20101117181942 1evell http d2i org amsreprovenance iu filename 4 K 4 L3 1 http d2i org amsreprovenance iu filename L3seaice 20101117181944 invoked L32 executes the pge for producing the final sea ice products none none non 24