The Soar User's Manual Version 9.3.2
Contents
1. Options: -0, -n, --none: Print just the preferences themselves. -1, -N, --names: Print the preferences and the names of the productions that generated them. -2, -t, --timetags: Print the information for the --names option above plus the timetags of the WMEs matched by the LHS of the indicated productions. -3, -w, --wmes: Print the information for the --timetags option above plus the entire WME matched on the LHS. -o, --object: Print the support for all the WMEs that comprise the object (the specified identifier). identifier: Must be an existing Soar object identifier. attribute: Must be an existing attribute of the specified identifier. Description: The preferences command prints all the preferences for the given object identifier and attribute. If identifier and attribute are not specified, they default to the current state and the current operator. The Soar syntax attribute caret (^) is optional when specifying the attribute. The optional arguments indicate the level of detail to print about each preference. This command is useful for examining which candidate operators have been proposed and what relationships, if any, exist among them. If a preference has O-support, the string :O will also be printed. When only the identifier is specified on the command line, if the identifier is a state, Soar uses the default attribute ^operator. If the identifier is not a state, Soar prints the su
2. 2.4 Preference memory: Selection Knowledge. The selection of the current operator is determined by the preferences in preference memory. Preferences are suggestions or imperatives about the current operator, or information about how suggested operators compare to other operators. Preferences refer to operators by using the identifier of a working memory element that stands for the operator. After preferences have been created for a state, the decision procedure evaluates them to select the current operator for that state. For an operator to be selected, there will be at least one preference for it, specifically, a preference to say that the value is a candidate for the operator attribute of a state (this is done with either an "acceptable" or "require" preference). There may also be others, for example to say that the value is "best". The different preferences available and the semantics of preferences are explained in Section 2.4.1. Preferences remain in preference memory until removed for one of the reasons previously discussed in Section 2.3.3. 2.4.1 Preference semantics. This section describes the semantics of each type of preference. More details on the preference resolution process are provided in Appendix D. Only a single value can be selected as the current operator, that is, all values are mutually exclusive. In addition, there is no implicit transitivity in the semantics of preferences. If A is indifferent to B and
3. Options: If given, an option can take one of two forms, an integer or a production name. n: List the top n productions. If n is 0, only the productions which haven't fired are listed. production_name: Print how many times the production has fired. Description: The firing-counts command prints the number of times each production has fired; production names are given from most frequently fired to least frequently fired. With no arguments, it lists all productions. If an integer argument n is given, only the top n productions are listed. If n is zero (0), only the productions that haven't fired at all are listed. If a production name is given as an argument, the firing count for that production is printed. Note that firing counts are reset by a call to init-soar. Examples: This example prints the 10 productions which have fired the most times, along with their firing counts: firing-counts 10. This example prints the firing counts of production my*first*production: firing-counts my*first*production. Warnings: Firing counts are reset to zero after an init-soar. NB: This command is slow, because the sorting takes time O(n log n). See Also: init-soar. pwatch: Trace firings and retractions of specific productions. Synopsis: pwatch [-d|e] [production name]. Default Aliases: pw (pwatch). Options: d
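The listing rules described for firing-counts above can be modelled in a few lines. This is only a sketch of the documented behaviour, not Soar's implementation: the function name `firing_counts`, the dictionary of counts, and the alphabetical tie-break are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple


def firing_counts(counts: Dict[str, int], n: Optional[int] = None) -> List[Tuple[str, int]]:
    """Model the listing behaviour described for the firing-counts command."""
    if n == 0:
        # Special case: list only the productions that have never fired.
        return sorted((name, c) for name, c in counts.items() if c == 0)
    # Most frequently fired first; ties broken by name (an assumption of this sketch).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    # No argument lists everything; an integer n keeps only the top n.
    return ranked if n is None else ranked[:n]


# Hypothetical firing counts for four productions.
counts = {"propose*move": 12, "apply*move": 12, "elaborate*clear": 30, "never*used": 0}
top_two = firing_counts(counts, 2)
unfired = firing_counts(counts, 0)
everything = firing_counts(counts)
```

The sort is the O(n log n) step that the warning in the excerpt refers to.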
4. Conventions for indenting productions: Productions in this manual are formatted using conventions designed to improve their readability. These conventions are not part of the required syntax. First, the name of the production immediately follows the first curly bracket after the sp. All conditions are aligned with the first letter after the first curly brace, and attributes of an object are all aligned. The arrow is indented to align with the conditions and actions, and the closing curly brace follows the last action. 3.3.1 Production Names. The name of the production is an almost arbitrary constant. (See Section 3.1.1 for a description of constants.) By convention, the name describes the role of the production, but functionally, the name is just a label primarily for the use of the programmer. A production name should never be a single letter followed by numbers, which is the format of identifiers. The convention for naming productions is to separate important elements with asterisks; the important elements that tend to appear in the name are: 1. The name of the task or goal (e.g., blocks-world). 2. The name of the architectural function (e.g., propose). 3. The name of the operator (or other object) at issue (e.g., move-block). 4. Any other relevant details. This name convention enables one to have a good idea of the function of a production just by examining its name. This can help for
5. etc. These WMEs are related because they are all contributing to the description of something that is internally known to Soar as "B1". B1 is called an identifier; the group of WMEs that share this identifier is referred to as an object in working memory. Each WME describes a different attribute of the object, for example, its name or type or location; each attribute has a value associated with it, for example, the name is A, the type is block, and the position is on the table. Therefore, each WME is an identifier-attribute-value triple, and all WMEs with the same identifier are part of the same object. Objects in working memory are linked to other objects: the value of one WME may be an identifier of another object. For example, a WME might say that B1 is ontop of T1, and another collection of WMEs might describe the object T1: T1 is a table, T1 is brown, and T1 is ontop of F1. And still another collection of WMEs might describe the object F1: F1 is a floor, etc. All objects in working memory must be linked to a state, either directly or indirectly (through other objects). Objects that are not linked to a state will be automatically removed from working memory by the Soar architecture. WMEs are also often called augmentations because they augment the object, providing more detail about it. While these two terms are somewhat redundant, WME is a term th
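The rule that objects must be linked to a state, directly or through other objects, amounts to a reachability check over identifier-attribute-value triples. The sketch below illustrates that idea only; the function name `reachable_wmes`, the state identifier S1, the `^block` link and the unlinked object X9 are hypothetical and do not come from the manual.

```python
from typing import Set, Tuple

WME = Tuple[str, str, str]  # (identifier, attribute, value)


def reachable_wmes(wmes: Set[WME], states: Set[str]) -> Set[WME]:
    """Keep only WMEs whose identifier is linked, directly or indirectly, to a state."""
    identifiers = {wme[0] for wme in wmes}
    linked = set(states)
    frontier = list(states)
    while frontier:
        current = frontier.pop()
        for ident, _attr, value in wmes:
            # A value that is itself an identifier links this object to another one.
            if ident == current and value in identifiers and value not in linked:
                linked.add(value)
                frontier.append(value)
    return {wme for wme in wmes if wme[0] in linked}


memory = {
    ("S1", "block", "B1"),    # hypothetical link from the state to block B1
    ("B1", "name", "A"),
    ("B1", "ontop", "T1"),    # B1 is linked to the table object T1
    ("T1", "type", "table"),
    ("X9", "color", "red"),   # X9 is not linked to any state
}
kept = reachable_wmes(memory, {"S1"})
```

Here the WME on X9 is dropped because no chain of values leads from S1 to X9, while T1 survives through B1.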
6. Apply a MOVE-BLOCK operator. There are two productions that are part of applying the operator. Both will fire in parallel. Apply a MOVE-BLOCK operator (the block is no longer ontop of the thing it used to be ontop of): This production is part of the application of a move-block operator. The conditions establish that: 1. An operator has been selected for the current state: a. the operator is named move-block; b. the operator has a moving-block and a destination. 2. The state has an ontop relation: a. the ontop relation has a top-block that is the same as the moving-block of the operator; b. the ontop relation has a bottom-block that is different from the destination of the operator. The actions: 1. create a reject preference for the ontop relation. sp {blocks-world*apply*move-block*remove-old-ontop (state <s> ^operator <o> ^ontop <ontop>) (<o> ^name move-block ^moving-block <block1> ^destination <block2>) (<ontop> ^top-block <block1> ^bottom-block { <> <block2> <block3> }) --> (<s> ^ontop <ontop> -)} Apply a MOVE-BLOCK operator (the block is now ontop
7. Grammars for production syntax This appendix contains the BNF grammars for the conditions and actions of productions BNF stands for Backus Naur form or Backus normal form consult a computer science book on theory programming languages or compilers for more information However if you don t already know what a BNF grammar is it s unlikely that you have any need for this appendix This information is provided for advanced Soar users for example those who need to write their own parsers B 1 Grammar of Soar productions A grammar for Soar productions is lt soar production gt sp lt production name gt lt documentation gt lt flags gt lt condition side gt gt lt action side gt lt documentation gt SS lt string gt lt flags gt o support i support chunk default B 1 1 Grammar for Condition Side Below is a grammar for the condition sides of productions lt condition side gt lt state imp cond gt lt cond gt lt state imp cond gt state impasse lt id_test gt lt attr_value_tests gt lt cond gt lt positive_cond gt lt positive_cond gt lt positive_cond gt lt conds_for_one_id gt lt cond gt lt conds_for_one_id gt state impasse lt id_test gt lt attr_value_tests gt lt id_test gt lt test gt 213 214 lt attr_value_tests gt lt attr_test gt lt value_test gt lt
8. Remove the specific production with this name. Description: This command removes productions from Soar's memory. The command must be called with either a specific production name or with a flag that indicates a particular group of productions to be removed. Using the flag -a or --all also causes an init-soar. Examples: This command removes the production my*first*production and all chunks: excise my*first*production --chunks. This removes all productions and does an init-soar: excise --all. See Also: init-soar. gp: Generate productions according to a specified pattern. Synopsis: gp { production_body }. Description: The gp command defines a pattern used to generate and source a set of Soar productions. production_body is a single argument that looks almost identical to a standard Soar rule that would be used with the sp command. Indeed, any syntax that is allowed in sp is also allowed in gp. Patterns in gp are specified with sets of whitespace-separated values in square brackets. Every combination of values across all square-bracketed value lists will be generated. Values with whitespace can be used if wrapped in pipes. Characters can also be escaped with a backslash (so string literals with embedded pipes, and spaces outside of string literals, are both possible). gp is primarily intended as an alternative to template rules for reinforcement learning template
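The "every combination of values across all square-bracketed value lists" rule is a Cartesian product. The sketch below shows that idea on a plain string; it is not how Soar parses gp patterns, and it deliberately ignores pipes and backslash escapes. The function name `expand_pattern` and the sample condition are assumptions.

```python
import itertools
import re
from typing import List

# Matches one square-bracketed value list and captures its contents.
VALUE_LIST = re.compile(r"\[([^\]]*)\]")


def expand_pattern(pattern: str) -> List[str]:
    """Return one string per combination of bracketed values (no pipe/escape handling)."""
    # With one capturing group, split() alternates literal text and captured value lists.
    parts = VALUE_LIST.split(pattern)
    literals = parts[0::2]
    choices = [values.split() for values in parts[1::2]]
    results = []
    for combo in itertools.product(*choices):
        pieces = [literals[0]]
        for value, literal in zip(combo, literals[1:]):
            pieces.append(value)
            pieces.append(literal)
        results.append("".join(pieces))
    return results


# Hypothetical condition with two value lists: 2 x 3 = 6 combinations.
rules = expand_pattern("(<j> ^volume [3 5] ^contents [0 1 2])")
```

Because the product multiplies the sizes of the value lists, a handful of long lists is enough to generate a very large number of rules.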
9. remainder mod lt x gt lt y gt abs atan2 sqrt sin cos Thesesymbols provide prefix notation unary math ematical functions they each take one argument These symbols work similarly to C functions They will take either integer or real number arguments The first function abs returns an integer when its argument is an integer and otherwise returns a real number and the last four functions always return a real number atan2 returns as a float in radians the arctangent of first_arg second_arg sin and cos take as arguments the angle in radians sp gt lt s gt abs value abs lt x gt sqrt sqrt lt x gt int Converts a single symbol to an integer constant This function expects either an integer constant symbolic constant or floating point constant The symbolic constant must be a string which can be interpreted as a single integer The floating point 62 CHAPTER 3 THE SYNTAX OF SOAR PROGRAMS constant is truncated to only the integer portion This function essentially operates as a type casting function For example the expression 2 sqrt 6 could be printed as an integer using the following sp gt write 2 int sqrt 6 float Converts a single symbol to a floating point constant This function expects either an integer constant symbolic constant or floating point constant The symbolic constant must be a string which can be interpreted as a single floating point number
10. Although you could force your productions to provide O support or I support by using these commands regardless of the structure of the conditions and actions of the production this is not proper coding style The o support and no support flags are included to help with debugging but should not be used in a standard Soar program Examples sp blocks create problem space This creates the top level space state lt s1 gt superstate nil gt lt si gt name solve blocks world problem space lt p1 gt lt p1 gt name blocks world See Also excise learn watch stop soar Pause Soar 120 CHAPTER 8 THE SOAR USER INTERFACE Synopsis stop soar s reason string Default Aliases interrupt stop soar ss stop soar stop stop soar Options s self Stop only the soar agent where the command is issued All other agents continue running as previously specified reason_string An optional string which will be printed when Soar is stopped to indi cate why it was stopped If left blank no message will be printed when Soar is stopped Description The stop soar command stops any running Soar agents It sets a flag in the Soar kernel so that Soar will stop running at a safe point and return control to the user This command is usually not issued at the command line prompt a more common use of this command would be for instance as a side effect of
11. S5 jug N4 N4 volume 5 N4 contents 3 018 jug N4 soar gt explain backtraces c 21 chunk 65 d13 tiex 2 Explanation of why condition Production chunk 64 d13 opnochange 1 matched N4 contents 3 was included in chunk 65 d13 tiex 2 8 4 CONFIGURING SOAR S RUNTIME PARAMETERS 157 N4 contents 3 which caused production selection select failure evaluation becomes reject preference to match E3 symbolic value failure which caused A result to be generated See Also save backtraces indifferent selection Controls indifferent preference arbitration Synopsis indifferent selection indifferent selection s indifferent selection bgfx1 indifferent selection et value indifferent selection p parameter reduction_policy indifferent selection r parameter reduction_policy reduction_rate indifferent selection a setting Default Aliases inds indifferent selection 158 CHAPTER 8 THE SOAR USER INTERFACE Options s stats Summary of settings policy Set exploration policy parameter value Get Set exploration policy parame ters if value not given returns the current value parameter reduction policy Get Set exploration policy param eter reduction policy if policy not given returns the current parameter reduction_policy reduction_rate Get Set exploration policy param eter reduction rate for a policy if rate n
12. Statistics: Soar tracks some RL statistics over the lifetime of the agent. These can be accessed using rl --stats <statistic>. Running rl --stats without a statistic will list the values of all statistics. update-error: Difference between target and current values in last RL update. total-reward: Total accumulated reward in the last update. global-reward: Total accumulated reward since agent initialization. See Also: excise, print, watch. save-backtraces: Save trace information to explain chunks and justifications. Synopsis: save-backtraces [-ed]. Options: -e, --enable, --on: Turn explain sysparam on. -d, --disable, --off: Turn explain sysparam off. Description: The save-backtraces variable is a toggle that controls whether or not backtracing information (from chunks and justifications) is saved. When save-backtraces is set to off, backtracing information is not saved and explanations of the chunks and justifications that are formed cannot be retrieved. When save-backtraces is set to on, backtracing information can be retrieved by using the explain-backtraces command. Saving backtracing information may slow down the execution of your Soar program, but it can be a very useful tool in understanding how chunks are formed. See Also: explain-backtraces. select: Force the next selected operator. Synopsis: select id Opti
13. This function essentially operates as a type casting function For example if you wanted to print out an integer expression as a floating point num ber you could do the following sp gt write float 2 3 ifeq Conditionally return a symbol This function takes four arguments It returns the third argument if the first two are equal and the fourth argument otherwise Note that symbols of different types will always be considered unequal For example 1 0 and 1 will be unequal because the first is a float and the second is an integer sp example rule state lt s gt a lt a gt b lt b gt gt write ifeq lt a gt lt b gt equal not equal 3 3 6 10 Generating and manipulating symbols A new symbol an identifier is generated on the right hand side of a production whenever a previously unbound variable is used This section describes other ways of generating and manipulating symbols on the right hand side timestamp This function returns a symbol whose print name is a representation of the current date and time For example 3 3 PRODUCTION MEMORY 63 sp aes write timestamp When this production fires it will print out a representation of the current date and time such as soar gt run le 8 1 96 15 22 49 make constant symbol This function returns a new constant symbol guaranteed to be different from all symbols currently present in the system With no arguments it
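The ifeq rule described above, including the point that symbols of different types never compare equal, can be sketched as follows. This models the documented semantics with Python values standing in for Soar symbols; the `Symbol` alias and the sample arguments are assumptions.

```python
from typing import Union

Symbol = Union[int, float, str]


def ifeq(first: Symbol, second: Symbol, if_equal: Symbol, otherwise: Symbol) -> Symbol:
    """Return the third argument if the first two are equal, else the fourth."""
    # Symbols of different types are never equal, so 1.0 and 1 differ here
    # even though Python's own == would treat them as the same number.
    same = type(first) is type(second) and first == second
    return if_equal if same else otherwise


float_vs_int = ifeq(1.0, 1, "equal", "not-equal")
int_vs_int = ifeq(3, 3, "equal", "not-equal")
case_differs = ifeq("red", "Red", "equal", "not-equal")
```

The explicit type check is the part that differs from an ordinary equality test.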
14. ^color red) (<s> ^thing <t1>) (<s> ^thing <t2>) This will add four elements to working memory with the variables replaced with whatever values they were bound to on the condition side. Since Soar is case sensitive, different combinations of upper- and lowercase letters represent different constants. For example, "red", "Red", and "RED" are all distinct symbols in Soar. In many cases, it is prudent to choose one of uppercase or lowercase and write all constants in that case to avoid confusion (and bugs). The constants that are used for attributes and values have a few restrictions on them: 1. There are a number of architecturally created augmentations for state and impasse objects; see Section 3.4 for a listing of these special augmentations. User-defined productions cannot create or remove augmentations of states that use these attribute names. 2. Attribute names should not begin with a number if these attributes will be used in attribute-path notation. 3.3.6.3 Removing Working Memory Elements. A working memory element is explicitly removed from working memory by following the value with a dash (-), also called a reject: --> (<s> ^block <b> -) If the removal of a working memory element removes the only link between the state and working memory elements that had the value of the removed element as an identifier, those working memory elements will be removed. This
15. state lt s1 gt name water jug operator lt op gt jug lt j1 gt lt j2 gt lt op gt name fill fill jug volume 3 5 lt j1 gt volume 3 contents 0 1 2 3 lt j2 gt volume 5 contents 0 1 2 3 4 5 gt lt s1 gt operator lt op gt 0 Esoteric example generates 24 rules gp strange example state lt s1 gt lt lt atti att2 att3 att4 gt gt val lanother vall strange val gt lt si gt foo bar lt bar gt testgp soar contains many more examples See Also Sp gp max Set the upper limit to the number of productions generated by the gp command Synopsis gp max value 114 CHAPTER 8 THE SOAR USER INTERFACE Options value Maximum number of productions to produce Description gp max is used to limit the number of productions produced by a gp production It is easy to write a gp production that has a combinatorial explosion and hangs for a long time while those productions are added to memory The gp max command bounds this Use without an argument to query the current value Examples gp max 1000 gp productions that produce more than 1000 productions will stop producing them when 1000 are made and return an error See Also Sp 8P help Provide formatted usage information about Soar commands Synopsis help command_name Default Aliases help h help man help Des
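The bound that gp-max places on generation can be pictured as a counter on the combinations produced so far. This is a sketch of the behaviour described above (stop once the limit is reached and report an error), not Soar's code; `GpMaxExceeded`, `bounded_combinations` and the sample value lists are hypothetical.

```python
import itertools
from typing import List, Sequence, Tuple


class GpMaxExceeded(Exception):
    """Hypothetical error type used only in this sketch."""


def bounded_combinations(value_lists: Sequence[Sequence[str]], gp_max: int) -> List[Tuple[str, ...]]:
    """Enumerate value combinations, failing once more than gp_max would be made."""
    made: List[Tuple[str, ...]] = []
    for combo in itertools.product(*value_lists):
        if len(made) >= gp_max:
            # The limit has been reached and another combination exists: stop with an error.
            raise GpMaxExceeded(f"pattern produces more than {gp_max} productions")
        made.append(combo)
    return made


# Hypothetical value lists of sizes 2, 4 and 6, which multiply to 48 combinations.
lists = [["3", "5"], ["0", "1", "2", "3"], ["0", "1", "2", "3", "4", "5"]]
within_limit = bounded_combinations(lists, 1000)
try:
    bounded_combinations(lists, 10)
    hit_limit = False
except GpMaxExceeded:
    hit_limit = True
```

Enumerating lazily means the check fires as soon as the limit is passed, rather than after the whole product has been built.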
16. top block lt thing1 gt pottom block lt gt lt thing2 gt lt thing1 gt type block clear yes lt thing2 gt clear yes gt lt s gt operator lt o gt lt o gt name move block moving block lt thing1 gt destination lt thing2 gt Structured values may be nested to any depth Thus it is possible to write our example production using a single condition with multiple structured values sp blocks world propose move block state lt s gt problem space blocks thing lt thing1 gt lt gt lt thingi gt lt thing2 gt clear yes ontop top block lt thing1 gt type block clear yes pottom block lt gt lt thing2 gt gt lt s gt operator lt o gt lt o gt name move block moving block lt thing1 gt destination lt thing2 gt Notes on structured value notation e Attribute path notation and structured value notation are orthogonal and can be com bined in any way A structured value can contain an attribute path or a structure can be given as the value for an attribute path e Structured value notation may also be combined with negations and with multi attributes 56 CHAPTER 3 THE SYNTAX OF SOAR PROGRAMS e Structured value notation may not be used in the actions of productions 3 3 6 The action side of productions or RHS The action side of a production also called the right hand side or RHS of the production cons
17. Finally, upon initialization, semantic memory maintains a continuous exclusive lock to the database (locking_mode pragma); thus, other applications/agents cannot make simultaneous read/write calls to the database (thereby reducing the need for potentially expensive system calls to secure/release file locks). Finally, maintaining accurate operation timers can be relatively expensive in Soar. Thus, these should be enabled with caution and understanding of their limitations. First, they will affect performance, depending on the level (set via the timers parameter). A level of three, for instance, times every modification to long-term identifier recency statistics. Furthermore, because these iterations are relatively cheap (typically a single step in the linked list of a b-tree), timer values are typically unreliable (depending upon the system, resolution is 1 microsecond or more). Chapter 7: Episodic Memory. Episodic memory is a record of an agent's stream of experience. The episodic storage mechanism will automatically record episodes as a Soar agent executes. The agent can later deliberately retrieve episodic knowledge to extract information and regularities that may not have been noticed during the original experience and combine them with current knowledge such as to improve performance on future tasks. This chapter is organized as follows: episodic memory structures in working memory (7.1); episodic storage (7.2); retrieving episodes
18. S RUNTIME PARAMETERS Exploration Policy P arameters Parameter Name Acceptable Values Default Value e epsilon o 1 0 1 t temperature O inf 25 Exploration Policy Parameter Auto Reduction Policies Policy Name Valid Rates Default Rate exponential default 0 1 1 linear 0 inf 0 See Also numeric indifferent mode rl learn Set the parameters for chunking Synopsis learn 1 learn dlElo learn eabnN Default Aliases 1 learn 160 CHAPTER 8 THE SOAR USER INTERFACE Options e enable on Turn chunking on Can be modified by a or b d disable off Turn all chunking off default E except Learning is on except as specified by RHS dont learn actions O only Chunking is on only as specified by RHS force learn actions 1 list Prints listings of dont learn and force learn states a all levels Build chunks whenever a subgoal returns a result Learning must be enabled b bottom up Build chunks only for subgoals that have not yet had any subgoals with chunks built Learning must be enabled n enable through local negations Build chunks when local negation encoun tered in backtrace default N disable through local negations Do not build chunks when local negation encountered in backtrace Description
19. Support for 11 S1 jug J1 S1 jug J1 0 From water jug apply initialize water jug look ahead This example shows the support for all WMEs that make up the object S1 soar gt pref o s1 Support for S1 problem space S1 problem space P1 Support for S1 name S1 name water jug 0 Support for S1 jug Si jug 14 0 S1 jug J1 0 Support for S1 desired S1 desired D1 0 Support for S1 superstate set S1 superstate set nil Preferences for S1 operator acceptables 02 fill 03 fill Arch created wmes for S1 2 S1 superstate nil 1 S1 type state Input I0 wmes for S1 3 S1 io I1 print Print items in working memory or production memory 130 CHAPTER 8 THE SOAR USER INTERFACE Synopsis print options production_name print options identifier timetag pattern Default Aliases p print pe print chunks wmes print internal Options Printing items in production memory What productions to print a all print the names of all productions currently loaded chunks print the names of all chunks currently loaded D defaults print the names of all default productions currently loaded j justifications print the names of all justifications currently loaded r rl Print Soar RL rules T template Print Soar RL templates u user print the names of all user productions currently loade
20. The learn command controls the parameters for chunking. With no arguments, this command prints out the current learning environment status. If arguments are provided, they will alter the learning environment as described in the options and arguments table. The watch command can be used to provide various levels of detail when productions are learned. Learning is disabled by default. With the --on flag, chunking is on all the time. With the --except flag, chunking is on, but Soar will not create chunks for states that have had RHS dont-learn actions executed in them. With the --only flag, chunking is off, but Soar will create chunks for only those states that have had RHS force-learn actions executed in them. With the --off flag, chunking is off all the time. The --only flag and its companion force-learn RHS action allow Soar developers to turn learning on in a particular problem space, so that they can focus on debugging the learning problems in that particular problem space without having to address the problems elsewhere in their programs at the same time. Similarly, the --except flag and its companion dont-learn RHS action allow developers to temporarily turn learning off for debugging purposes. These facilities are provided as debugging tools, and do not correspond to any theory of learning in Soar. The --all-levels and --bottom-up flags are orthogonal to the --on, --except, --only, and --off flags, and so may be used in combination with the
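The four learning modes described above reduce to a small decision rule per state. The sketch below models only that rule; it ignores the all-levels and bottom-up settings and every other condition that can block chunk creation. The function name `builds_chunk` and its boolean parameters are assumptions.

```python
def builds_chunk(mode: str, dont_learn: bool = False, force_learn: bool = False) -> bool:
    """Decide whether a state's results may be chunked under a given learn mode.

    dont_learn / force_learn say whether the matching RHS action was executed in the state.
    """
    if mode == "on":
        return True                # chunking is on all the time
    if mode == "except":
        return not dont_learn      # on, except in states marked dont-learn
    if mode == "only":
        return force_learn         # off, except in states marked force-learn
    if mode == "off":
        return False               # chunking is off all the time
    raise ValueError(f"unknown learn mode: {mode}")


except_marked = builds_chunk("except", dont_learn=True)
only_marked = builds_chunk("only", force_learn=True)
only_unmarked = builds_chunk("only")
```

Seen this way, except and only are mirror images: each flips the default for the states marked by its companion RHS action.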
21. Thus, the WME depends upon this production instantiation (and, more specifically, the features the instantiation tests). When one of the conditions in the production instantiation no longer matches, the instantiation is retracted, resulting in the loss of the acceptable preference for the WME. I-support is illustrated in Figure E.1. A copy of A in the subgoal, Ac, is retracted automatically when A changes to A'. The substate WME persists only as long as it remains justified by A. This justification is called instantiation support ("I-support") in Soar and should not be confused with result justifications. In the broadest sense, we can say that some feature <b> is dependent upon another element <a> if <a> was used in the creation of <b>, i.e., if <a> was tested in the production instantiation that created <b>. Further, a dependent change with respect to feature <b> is a change to any of its instantiating features. In Figure E.1, the change from A to A' is a [Footnote: Importantly, in a technical sense, the WME is only retracted when it loses instantiation support, not when the creating production is retracted. For example, a WME could receive I-support from several different instantiations, and the retraction of one would not lead to the retraction of the WME. However, the following generally discusses direct dependency, unmediated by preferences, ignoring this complication for clarity.]
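The point that a WME is retracted only when it loses all instantiation support, not when any single supporting instantiation retracts, can be sketched with a table of supporters. This is an illustration of the idea, not Soar's data structure; the class `ISupportTable`, the WME shown, and the instantiation names are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Set, Tuple

WME = Tuple[str, str, str]  # (identifier, attribute, value)


class ISupportTable:
    """Tracks which production instantiations currently support each WME."""

    def __init__(self) -> None:
        self._supporters: DefaultDict[WME, Set[str]] = defaultdict(set)

    def add_support(self, wme: WME, instantiation: str) -> None:
        self._supporters[wme].add(instantiation)

    def retract(self, instantiation: str) -> List[WME]:
        """Retract one instantiation and return the WMEs that lost all support."""
        removed = []
        for wme in list(self._supporters):
            self._supporters[wme].discard(instantiation)
            if not self._supporters[wme]:
                # No instantiation supports this WME any more, so it goes away.
                del self._supporters[wme]
                removed.append(wme)
        return removed

    def __contains__(self, wme: WME) -> bool:
        return wme in self._supporters


table = ISupportTable()
copy_of_a = ("S2", "copy", "A")  # hypothetical substate WME
table.add_support(copy_of_a, "instantiation-1")
table.add_support(copy_of_a, "instantiation-2")
first = table.retract("instantiation-1")   # still supported by instantiation-2
second = table.retract("instantiation-2")  # last supporter gone
```

With a single supporter, the common case the text discusses, retraction of the instantiation and removal of the WME coincide.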
22. activate on query Determines if the results of queries should be acti vated on off on base decay Sets the decay parameter for base level activa tion computation gt 0 0 5 base update policy Sets the policy for re computing base level activa tion stable naive incremental stable base incremental threshes Sets time deltas after which base level activation is re computed for old memories L2 3 10 cache size Number of mem ory pages used in the SQLite cache E 10000 database Database storage method file memory memory lazy commit Delay writing semantic store changes to file until agent exits on off on learning Semantic ory enabled mem on off off merge Controls how re trievals interact with long term identifiers in working memory none add add mirroring Controls auto matic encoding of working memory changes on off off optimization Policy for com mitting data to disk safety performance performance page size each page the Size of memory used in SQLite cache Ik 2k 4k 8k 16k 32k 64k 8k path Location of empty some path empty 176 CHAPTER 8 THE SOAR USER INTERFACE If activation mode is base level three parameters control bias values The base decay parameter sets the free decay parameter in the base level
23. --disable, --off: Turn production watching off for the specified production. If no production is specified, turn production watching off for all productions. -e, --enable, --on: Turn production watching on for the specified production. The use of this flag is optional, so this is pwatch's default behavior. If no production is specified, all productions currently being watched are listed. production name: The name of the production to watch. Description: The pwatch command enables and disables the tracing of the firings and retractions of individual productions. This is a companion command to watch, which cannot specify individual productions by name. With no arguments, pwatch lists the productions currently being traced. With one production-name argument, pwatch enables tracing the production; --enable can be explicitly stated, but it is the default action. If --disable is specified followed by a production name, tracing is turned off for the production. When no production name is specified, --enable lists all productions currently being traced, and --disable disables tracing of all productions. Note that pwatch now only takes one production per command. Use it multiple times to watch multiple productions. See Also: watch. stats: Print information on Soar's runtime statistics. Synopsis: stats [options]. Default Aliases: st
24. is the Q-value of the state-operator pair (s, a); w_i is the numeric-indifferent preference value of RL rule i; φ_i(s, a) = 0 if RL rule i does not match (s, a), and φ_i(s, a) = 1 if it does. This interpretation allows RL rules to simulate a number of popular function approximation schemes used in RL, such as tile coding and sparse coding. (Footnote 2: This is assuming the value of numeric-indifferent-mode is set to sum. In general, the RL mechanism only works correctly when this is the case, and we assume this case in the rest of the chapter. See page 167 for more information about this parameter.) 5.2 Reward Representation. RL updates are driven by reward signals. In Soar, these reward signals are given to the RL mechanism through a working memory link called the reward-link. Each state in Soar's state stack is automatically populated with a reward-link structure upon creation. Soar will check this structure for a numeric reward signal for the last operator executed in the associated state at the beginning of every decision phase. Reward is also collected when the agent is halted or a state is retracted. In order to be recognized, the reward signal must follow this pattern: (<r1> ^reward <r2>) (<r2> ^value [val]) where <r1> is the reward-link identifier, <r2> is some intermediate identifier, and [val] is any constant numeric value. Any structure that does not match this pattern is ignored
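Read with a 0/1 match indicator per RL rule, the definitions above make the Q-value of a state-operator pair the sum of the numeric-indifferent values of the rules that match it. The sketch below assumes that summed form (numeric-indifferent-mode set to sum, as the footnote states); the `RLRule` representation, the matcher functions and the weights are hypothetical.

```python
from typing import Callable, List, Tuple

# Each RL rule is modelled as (matches(state, operator), w_i).
RLRule = Tuple[Callable[[str, str], bool], float]


def q_value(rules: List[RLRule], state: str, operator: str) -> float:
    """Sum w_i over the rules whose indicator is 1 for this (state, operator) pair."""
    return sum(weight for matches, weight in rules if matches(state, operator))


rules: List[RLRule] = [
    (lambda s, a: a == "move-left", 0.5),                      # matches on the operator only
    (lambda s, a: s == "near-wall" and a == "move-left", 0.25),
    (lambda s, a: a == "move-right", 2.0),
]
q_left_near_wall = q_value(rules, "near-wall", "move-left")
q_left_open = q_value(rules, "open-field", "move-left")
```

Rules that test overlapping features contribute together, which is what lets a set of RL rules behave like tile coding.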
25. <s> must be bound to a state identifier. In condition 2, the variable <s> must be bound to the lowest state identifier. That is to say, each (positive) condition on the LHS takes the form (id ^attr value); some of these ids match state identifiers, and the system looks for the deepest matched state identifier. The tested current operator must be on this state.

[Footnote: Sometimes o-support mode 3 does not notice that this condition is true. This is a bug which is unlikely to be fixed, since users are encouraged to use mode 4.]

For example, in the production

    sp {elaborate*state*operator*name
       (state <s> ^superstate <s1>)
       (<s1> ^operator <o>)
       (<o> ^name <name>)
       -->
       (<s> ^name something)}

the RHS action gets i-support. Of course, the state bound to <s> is destroyed when (<s1> ^operator <o>) retracts, so o-support would make little difference. On the other hand, the production

    sp {operator*superstate*application
       (state <s> ^superstate <s1>)
       (<s> ^operator <o>)
       (<o> ^name <name>)
       -->
       (<s1> ^sub-operator-name <name>)}

gives o-support to its RHS action, which remains after the substate bound to <s> is destroyed.
There is a third condition that determines support, and it is in this condition that modes 3 & 4 differ. An extension of co
26. ^move-block <b1> ^x-destination <x2> ^y-destination <y2>)}

This production would create substructure on the output link that the output function could interpret as being a command to move the block to a new location.

Chapter 4
Chunking

Chunking is Soar's mechanism to learn new procedural knowledge. Chunking creates productions, called chunks, that summarize the processing required to produce the results of subgoals. When a chunk is built, it is added to production memory, where it will be matched in similar situations, avoiding the need for the subgoal. Chunks are created only when results are formed in subgoals; since most Soar programs are continuously subgoaling and returning results to higher-level states, chunks are typically created continuously as Soar runs.
This chapter begins with a discussion of when chunks are built (Section 4.1 below), followed by a detailed discussion of how Soar determines a chunk's conditions and actions (Section 4.2). Sections 4.3 through 4.4 examine the construction of chunks in further detail. Section 4.5 explains how and why chunks are prevented from matching with the WMEs that led to their creation. Section 4.6 reviews the problem of overgeneral chunks.

4.1 Chunk Creation
Several factors govern when chunks are built. Soar chunks the results of every subgoal, unless one of the following conditions is true:
1. Learning is off. (See Section 8.4 on page 159 for details of
27. nexts       Nexts                     Number of times the next command has been processed
prevs       Prevs                     Number of times the previous command has been processed
ncb-wmes    Last Retrieval WMEs       Number of WMEs added to working memory in last reconstruction
qry-pos     Last Query Positive       Number of leaf WMEs in the query cue of last cue-based retrieval
qry-neg     Last Query Negative       Number of leaf WMEs in the neg-query cue of the last cue-based retrieval
qry-ret     Last Query Retrieved      Episode ID of last retrieval
qry-card    Last Query Cardinality    Match cardinality of last cue-based retrieval
qry-lits    Last Query Literals       Number of literals in the DNF graph of last cue-based retrieval

Timers
Episodic memory also has a set of internal timers that record the durations of certain operations. Because fine-grained timing can incur runtime costs, episodic memory timers are off by default. Timers of different levels of detail can be turned on by issuing epmem --set timers <level>, where the levels can be off, one, two, or three, three being most detailed and resulting in all timers being turned on. Note that none of the episodic memory statistics nor timing information is reported by the stats command.
All timer values are reported in seconds.

Level one
    _total        Total epmem operations
Level two
    epmem_api     Agent command validation
    epmem_hash    Hashing symbols
    epmem_init    Episodi
28. returns a symbol whose name starts with "constant". With one or more arguments, it takes those argument symbols, concatenates them, and uses that as the prefix for the new symbol. (It may also append a number to the resulting symbol, if a symbol with that prefix as its name already exists.)

    sp { ... -->
       (<s> ^new-symbol (make-constant-symbol)) }

When this production fires, it will create an augmentation in working memory such as:

    (S1 ^new-symbol constant5)

The production:

    sp { ... -->
       (<s> ^new-symbol (make-constant-symbol <s> )) }

will create an augmentation in working memory such as:

    (S1 ^new-symbol |S14|)

when it fires. The vertical bars denote that the symbol is a constant, rather than an identifier; in this example, the number 4 has been appended to the symbol S1.
This can be particularly useful when used in conjunction with the timestamp function; by using timestamp as an argument to make-constant-symbol, you can get a new symbol that is guaranteed to be unique. For example:

    sp { ... -->
       (<s> ^new-symbol (make-constant-symbol (timestamp))) }

When this production fires, it will create an augmentation in working memory such as:

    (S1 ^new-symbol 8/1/96-15:22:49)

capitalize-symbol: Given a symbol, this function returns a new symbol with the first character capitalized. This function is provided primarily for text output, for example, to
30. A cue is composed of WMEs that describe the augmentations of a long-term identifier. A cue WME with a constant value denotes an exact match of both attribute and value. A cue WME with a long-term identifier as its value denotes an exact match as well. A cue WME with a short-term identifier as its value denotes an exact match of attribute, but with any value (constant or identifier).
A cue-based retrieval command has a query attribute and an identifier value, the cue:

    <s> ^smem.command.query <cue>

For instance, consider the following rule that creates a cue-based retrieval command:

    sp {smem*sample*query
       (state <s> ^smem.command <sc>
                  ^lti <lti>
                  ^input-link.foo <bar>)
    -->
       (<sc> ^query <q>)
       (<q> ^name <any-name>
            ^foo <bar>
            ^associate <lti>
            ^age 25)
    }

In this example, assume that the <lti> variable will match a long-term identifier and the <bar> variable will match a constant. Thus, the query requests retrieval of a long-term identifier from semantic memory with augmentations that satisfy ALL of the following requirements:
• Attribute name and ANY value
• Attribute foo and value equal to the value of variable <bar> at the time this rule fires
• Attribute associate and value equal to the long-term identifier <lti> at the time this rule fires
• Attribute age and integer value 25
If no l
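The matching rules for cue WMEs listed above can be sketched in a few lines. This is an illustrative model only, not Soar's semantic memory implementation: the store is a hypothetical dictionary from long-term identifier names (written here as "@L1", a made-up notation) to sets of (attribute, value) pairs, `ANY` stands in for a short-term identifier value in the cue, and `satisfies_cue` is an invented name. The sketch does not model how Soar chooses among several matching identifiers.

```python
ANY = object()  # stands in for a short-term identifier used as a cue value

def satisfies_cue(augmentations, cue):
    """True if every cue WME is matched by some (attribute, value) pair."""
    for attr, wanted in cue:
        if wanted is ANY:
            # Short-term identifier: the attribute must exist, with any value.
            ok = any(a == attr for a, _ in augmentations)
        else:
            # Constant or long-term identifier: exact attribute and value.
            ok = (attr, wanted) in augmentations
        if not ok:
            return False
    return True

# Cue mirroring the four requirements of the sample query
# (<bar> is assumed bound to "baz", <lti> to "@L7").
cue = [("name", ANY), ("foo", "baz"), ("associate", "@L7"), ("age", 25)]

store = {
    "@L1": {("name", "alice"), ("foo", "baz"), ("associate", "@L7"), ("age", 25)},
    "@L2": {("name", "bob"), ("foo", "baz"), ("associate", "@L7"), ("age", 30)},
    "@L3": {("foo", "baz"), ("associate", "@L7"), ("age", 25)},
}
matches = sorted(lti for lti, augs in store.items() if satisfies_cue(augs, cue))
```

In this toy store, "@L2" fails the age requirement and "@L3" has no name attribute, so only "@L1" satisfies all four requirements.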
31. a Soar program, to invoke on-line help information, and to create and delete Soar productions. The specific commands described in this section are:

Summary
excise       Delete Soar productions from production memory.
gp           Define a pattern used to generate and source a set of Soar productions.
gp-max       Set the upper limit to the number of productions generated by the gp command.
help         Provide formatted, on-line information about Soar commands.
init-soar    Reinitialize Soar so a program can be rerun from scratch.
run          Begin Soar's execution cycle.
sp           Create a production and add it to production memory.
stop-soar    Interrupt a running Soar program.

These commands are all frequently used anytime Soar is run.

excise
Delete Soar productions from production memory.
Synopsis
excise production_name
excise options
Default Aliases
ex    excise
Options
-a, --all          Remove all productions from memory and perform an init-soar command
-c, --chunks       Remove all chunks (learned productions) and justifications from memory
-d, --default      Remove all default productions (:default) from memory
-r, --rl           Excise Soar-RL rules
-t, --task         Remove chunks, justifications, and user productions from memory
-T, --templates    Excise Soar-RL templates
-u, --user         Remove all user productions (but not chunks or default rules) from memory
production name
32. acceptable preference for a value of an operator, and there are no other competing values, that operator will be selected. If there are multiple acceptable preferences for the same state but with different values, the preferences must be evaluated to determine which candidate is selected.
If the preferences can be evaluated without conflict, the appropriate operator augmentation of the state will be added to working memory. This can happen when they all suggest the same operator or when one operator is preferable to the others that have been suggested. When the preferences conflict, Soar reaches an impasse, as described in Section 2.6.
Preferences can be confusing; for example, there can be two suggested values that are both "best" (which again will lead to an impasse unless additional preferences resolve this conflict); or there may be one preference to say that value A is better than value B and a second preference to say that value B is better than value A.

2.5 Soar's Execution Cycle: Without Substates
The execution of a Soar program proceeds through a number of cycles. Each cycle has five phases:
1. Input: New sensory data comes into working memory.
2. Proposal: Productions fire (and retract) to interpret new data (state elaboration), propose operators for the current situation (operator proposal), and compare proposed operators (operator comparison). All of the actions of these productions are I-supported. All matched productions f
33. agent to play forward episodes using relative non-cue-based retrievals.
Episodic memory stores the time of the last successful retrieval (non-cue-based or cue-based). Agents can indirectly make use of this information by issuing next or previous commands. Episodic memory executes these commands by attempting to retrieve the episode immediately proceeding/preceding the last successful retrieval (respectively). To issue one of these commands, the agent must create a new WME on the command link with the appropriate attribute (next or previous) and value of an arbitrary identifier:

    <s> ^epmem.command.next <n>
    <s> ^epmem.command.previous <p>

If no such episode exists, then an error is returned.
Currently, if the time of the last successfully retrieved episode is known to the agent (as could be the case by accessing result meta-data), these commands are identical to performing an absolute non-cue-based retrieval after adding/subtracting 1 to the last time (respectively). However, if an episodic store dynamic like forgetting is implemented, these relative commands are guaranteed to return the next/previous valid episode (assuming one exists).

7.3.4 Retrieval Meta-Data
The following list details the WMEs that episodic memory creates in the result link of the epmem structure wherein a command was issued:
• retrieved <retrieval-root>: If episodic memory retrieves an episode, t
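The difference between the relative next/previous commands described above and an absolute retrieval at "last time plus or minus 1" can be sketched as follows. This is a hypothetical model, not Soar's episodic store: episodes are represented only by a sorted list of valid episode times, the function names are invented, and returning None stands in for the error case when no such episode exists.

```python
from bisect import bisect_left, bisect_right

def next_episode(valid_times, last):
    """Smallest valid episode time strictly greater than `last`, else None."""
    i = bisect_right(valid_times, last)
    return valid_times[i] if i < len(valid_times) else None

def previous_episode(valid_times, last):
    """Largest valid episode time strictly smaller than `last`, else None."""
    i = bisect_left(valid_times, last)
    return valid_times[i - 1] if i > 0 else None

# Episode 4 is assumed to have been forgotten, so the times have a gap.
valid_times = [1, 2, 3, 5, 6]
after_three = next_episode(valid_times, 3)
before_five = previous_episode(valid_times, 5)
```

With the gap at time 4, an absolute retrieval at 3 + 1 would find nothing, while the relative lookup skips to the next valid episode.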
34. all of Soar's memories must be removed. Consequently, smem --init will reinitialize episodic, semantic, procedural, and working memories. It is equivalent to wiping the semantic store and executing these commands:

    epmem --close
    excise --all
    init-soar

See Also
watch

timers
Toggle on or off the internal timers used to profile Soar.
Synopsis
timers options
Options
-d, --disable, --off    Disable all timers.
-e, --enable, --on      Enable timers as compiled.
Description
This command is used to control the timers that collect internal profiling information while Soar is running. With no arguments, this command prints out the current timer status. Timers are ENABLED by default. The default compilation flags for Soar enable the basic timers and disable the detailed timers. The timers command can only enable or disable timers that have already been enabled with compiler directives. See the stats command for more info on the Soar timing system.
See Also
stats

waitsnc
Generate a wait state rather than a state-no-change impasse.
Synopsis
waitsnc -[e|d]
Options
-e, --enable, --on      Turns a state-no-change into a wait state.
-d, --disable, --off    Default. A state-no-change generates an impasse.
Description
In some systems, especially those that model expert (fully chunked) knowledge, a state-no-change may represent a wait state rath
35. allow the first word in a sentence to be capitalized.

    (capitalize-symbol foo)

concat: Given an arbitrary number of symbols, this function concatenates them together into a single constant symbol. For example,

    sp {example
       (state <s> ^type state)
       -->
       (<s> ^name (concat foo bar (+ 2 4)))}

After this rule fires, the WME (S1 ^name foobar6) will be added.

3.3.6.11 User-defined functions and interface commands as RHS actions
Any function which has a certain function signature may be registered with the Kernel and called as a RHS function. The function must have the following signature:

    std::string MyFunction(smlRhsEventId id, void* pUserData, Agent* pAgent,
                           char const* pFunctionName, char const* pArgument);

The Tcl and Java interfaces have similar function signatures. Any arguments passed to the function on the RHS of a production are concatenated and passed to the function in the pArgument argument.
Such a function can be registered with the kernel via the client interface by calling:

    Kernel::AddRhsFunction(char const* pRhsFunctionName, RhsEventHandler handler, void* pUserData);

The exec and cmd functions are used to call user-defined functions and interface commands on the RHS of a production.
exec: Used to call user-defined registered functions. Any arguments are concatenated without spaces. For example, if <o> is bound to x, then

    sp { ... -->
       (exec MakeANote <
36. are also two optional components: a documentation string and a type.
Syntactically, each production consists of the symbol sp, followed by: an opening curly brace, {; the production's name; the documentation string (optional); the production type (optional); comments (optional); the production's conditions; the symbol --> (literally: dash-dash-greaterthan); the production's actions; and a closing curly brace, }. Each element of a production is separated by white space. Indentation and linefeeds are used by convention, but are not necessary.

    sp {production-name
        "Documentation string"
        :type
        CONDITIONS
        -->
        ACTIONS
        }

    sp {blocks-world*propose*move-block
       (state <s> ^problem-space blocks
                  ^thing <thing1> {<> <thing1> <thing2>}
                  ^ontop <ontop>)
       (<thing1> ^type block ^clear yes)
       (<thing2> ^clear yes)
       (<ontop> ^top-block <thing1>
                ^bottom-block <> <thing2>)
       -->
       (<s> ^operator <o> +)
       (<o> ^name move-block
            ^moving-block <thing1>
            ^destination <thing2>)}

Figure 3.2: An example production from the example blocks-world task.

An example production, named "blocks-world*propose*move-block", is shown in Figure 3.2. This production proposes operators named move-block that move blocks from one location to another. The details of this production will be described in the following sections.
37. be confused with the current operator. Figure 2.4 illustrates working memory after the first operator has been selected. There are six operators proposed, and only one of these is actually selected.
Goals are either represented explicitly as substructure of the state with general rules that recognize when the goal is achieved, or are implicitly represented in the Soar program by goal-specific rules that test the state for specific features and recognize when the goal is achieved. The point is that sometimes a description of the goal will be available in the state for focusing the problem solving, whereas other times it may not. Although representing a goal explicitly has many advantages, some goals are difficult to explicitly represent on the state.
The goal in our blocks-world task is represented implicitly in the Soar program. A single production rule monitors the state for completion of the goal and halts Soar when the goal is achieved.

[Figure 2.5: The six operators proposed for the initial state of the blocks world each move one block to a new location.]

2.1.4 Proposing candidate operators
As a first step in selecting an operator, one or more candidate operators are proposed. Operators are proposed by rules that test features of the current state. When the blocks-world task is run, the Soar program will propose
38. ... blocks
              ^ontop <AB> {<> <AB> <BC>} {<> <AB> <> <BC> <CT>})
       (<AB> ^top-block <A> ^bottom-block <B>)
       (<BC> ^top-block <B> ^bottom-block <C>)
       (<CT> ^top-block <C> ^bottom-block <T>)
       (<A> ^type block ^name A)
       (<B> ^type block ^name B)
       (<C> ^type block ^name C)
       (<T> ^type table ^name TABLE)
       -->
       (write (crlf) |Achieved A, B, C|)
       (halt)}

    ###
    ### Monitor the state: Print a message every time a block is moved
    ###
    ### The conditions establish that:
    ###   1. An operator has been selected for the current state
    ###      a. the operator is named move-block
    ###      b. the operator has a "moving-block" and a "destination"
    ###   2. each block has a name
    ### The actions:
    ###   1. print a message for the user that the block has been moved to the destination

    sp {blocks-world*monitor*move-block
       (state <s> ^operator <o>)
       (<o> ^name move-block
            ^moving-block <block1>
            ^destination <block2>)
       (<block1> ^name <block1-name>)
       (<block2> ^name <block2-name>)
       -->
       (write (crlf) |Moving Block: | <block1-name> | to: | <block2-name>)}

Appendix B
39. by modifying the state. These functions are driven by the knowledge encoded in a Soar program. Soar represents that knowledge as production rules. Production rules are similar to "if-then" statements in conventional programming languages. (For example, a production might say something like "if there are two blocks on the table, then suggest an operator to move one block ontop of the other block".) The "if" part of the production is called its conditions and the "then" part of the production is called its actions. When the conditions are met in the current situation as defined by working memory, the production is matched and it will fire, which means that its actions are executed, making changes to working memory.
The other function, selecting the current operator, involves making a decision once sufficient knowledge has been retrieved. This is performed by Soar's decision procedure, which is a fixed procedure that interprets preferences that have been created by the retrieval functions. The knowledge-retrieval and decision-making functions combine to form Soar's decision cycle.
When the knowledge to perform the problem-solving functions is not directly available in productions, Soar is unable to make progress and reaches an impasse. There are three types of possible impasses in Soar:
1. An operator cannot be selected because none are proposed.
2. An operator cannot be selected because multiple ope
40. can be structurally unified with the candidate episode, paying special regard to the structural constraints imposed by shared identifiers. Cue-based matching will return the most recent structural match, or the most recent candidate episode with the greatest match score.
A special note should be made with respect to how short- vs. long-term identifiers (see Section 6.2 on page 92) are interpreted in a cue. Short-term identifiers are processed much as they are in working memory: transient structures. Cue matching will try to find any identifier in an episode (with respect to WME path from state) that can apply. Long-term identifiers, however, are treated as constants. Thus, when analyzing the cue, episodic memory will not consider long-term identifier augmentations, and will only match with the same long-term identifier (in the same context) in an episode.
The cue-based retrieval process can be further controlled using optional modifiers:
• The before command requires that the retrieved episode come relatively before a supplied time:

    <s> ^epmem.command.before time

• The after command requires that the retrieved episode come relatively after a supplied time:

    <s> ^epmem.command.after time

• The prohibit command requires that the time of the retrieved episode is not equal to a supplied time:

    <s> ^epmem.command.prohibit time

Multiple prohibit command WMEs may be issued as mo
41. currently possible to create production actions wherein the identifier of a new WME is a long-term identifier that exists neither in the production conditions nor as the attribute or value of a prior action. Such rules will wreak havoc within Soar and are not supported. They will be detected and disallowed in future versions of semantic memory.

6.2.1.3 Episodic Memory
Episodic memory (see Section 7 on page 99) faithfully captures short- vs. long-term identifiers, including the episode of transition. Cues are handled in much the same way as cue-based retrievals with respect to the differences in semantics of a short- vs. long-term identifier.

6.3 Storing Semantic Knowledge
An agent stores a long-term identifier to semantic memory by creating a store command: this is a WME whose identifier is the command link of a state's smem structure, the attribute is store, and the value is an identifier (short or long):

    <s> ^smem.command.store <identifier>

Semantic memory will encode and store all WMEs whose identifier is the value of the store command. Storing deeper levels of working memory is achieved through multiple store commands.
Multiple store commands can be issued in parallel. Storage commands are processed on every state at the end of every phase of every decision cycle. Storage is guaranteed to succeed, and a status WME will be created, where the identifier is the result link of the smem structure of that state, the attribut
42. disjunction of constants. With a disjunction, there will be a match if any one of the constants is found in a working memory element (and the other parts of the working memory element match). Variables and predicates may not be used within disjunctive tests.
Syntactically, a disjunctive test is specified with double angle brackets (i.e., << and >>). There must be spaces separating the brackets from the constants.
The following table provides examples of legal and illegal disjunctions:

    Legal disjunctions                     Illegal disjunctions
    << ABC 45 117 >>                       << <A> A >>
    << 5 10 >>                             << < 5 > 10 >>
    << good-morning good-evening >>        <<A B C>>

Example Production
For example, the third condition of the following production contains a disjunction that restricts the color of the table to red or blue:

    sp {blocks*example-production-conditions
       (state ^operator <o> ^table <t>)
       (<o> ^name move-block)
       (<t> ^type table ^color << red blue >> )
       -->
       ... }

Note
Disjunctions of complete conditions are not allowed in Soar. Multiple (similar) productions fulfill this role.

3.3.5.5 Conjunctions of values
A test for an identifier, attribute, or value in a condition may include a conjunction of tests, all of which must hold for there to be a match.
Syntactically, conjuncts are contained within curly brace
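The disjunctive test described above amounts to a membership check on one field of a working memory element while the other fields are tested as usual. The following is a minimal sketch under stated assumptions, not Soar's matcher: a WME is modeled as a hypothetical (identifier, attribute, value) tuple, and `matches_condition` is an invented helper that handles only a disjunction in the value position.

```python
def matches_condition(wme, ident, attr, constants):
    """Model of the condition (ident ^attr << c1 c2 ... >>).

    The identifier and attribute must match exactly; the value must be
    any one of the listed constants.
    """
    w_ident, w_attr, w_value = wme
    return w_ident == ident and w_attr == attr and w_value in constants

allowed_colors = {"red", "blue"}  # models << red blue >>

red_table = ("T1", "color", "red")
green_table = ("T1", "color", "green")
red_block = ("B1", "color", "red")

results = [
    matches_condition(red_table, "T1", "color", allowed_colors),
    matches_condition(green_table, "T1", "color", allowed_colors),
    matches_condition(red_block, "T1", "color", allowed_colors),
]
```

The third case shows why the rest of the element still matters: the value is in the disjunction, but the identifier is not the one the condition tests.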
44. [Grammar fragment (action side), garbled in extraction. The surviving rule elements are: <variable_or_sym_constant>, <value_make>, <variable>, <sym_constant>, <rhs_value>, <preference_specifier>, <unary-preference>, and <unary-or-binary-preference>.]

Appendix C
The Calculation of O-Support

This appendix provides a description of when a preference is given O-support by an instantiation (a preference that is not given O-support will have I-support). Soar has four possible procedures for deciding support, which can be selected with the o-support-mode command (see page 167). However, only o-support modes 3 & 4 can be considered current to Soar 8, and o-support mode 4 should be considered an improved version of mode 3. The default o-support mode is mode 4.
In O-support modes 3 & 4, support is given production by production; that is, all preferences generated by the RHS of a single instantiated production will have the same support. In both modes, a production must meet the following two requirements to create o-supported preferences:
1. The RHS has no operator proposals, i.e. nothing of the form

    (<s> ^operator <o> +)

2. The LHS has a condition that tests the current operator, i.e. something of the form

    (<s> ^operator <o>)

In condition 1, the variable
45. ... ^operator <o> = 5, >)}

    sp {variable*binding
       (state <s> ^operator <o> +
                  ^value <v>)
       -->
       (<s> ^operator <o> = <v>)}

The first rule proposes multiple preferences for the proposed operator and thus does not comply with the rule format. The second rule does not comply because it does not provide a constant for the numeric-indifferent preference value.
In the typical RL use case, the user intends for the agent to learn the best operator in each possible state of the environment. The most straightforward way to achieve this is to give the agent a set of RL rules, each matching exactly one possible state-operator pair. This approach is equivalent to a table-based RL algorithm, where the Q-value of each state-operator pair corresponds to the numeric-indifferent preference created by exactly one RL rule.
In the more general case, multiple RL rules can match a single state-operator pair, and a single RL rule can match multiple state-operator pairs. All numeric-indifferent preferences for an operator are summed when calculating the operator's Q-value. In this context, RL rules can be interpreted more generally as binary features in a linear approximator of each state-operator pair's Q-value, and their numeric-indifferent preference values their weights. In other words,

    $Q(s, a) = w_1 \phi_1(s, a) + w_2 \phi_2(s, a) + \dots + w_n \phi_n(s, a)$

where all RL rules in production memory are numbered $1 \dots n$, $Q(s, a)$
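The linear-approximator reading of RL rules above can be made concrete with a short sketch. This is an illustration only, not Soar-RL's implementation: each rule is modeled as a hypothetical pair of a match predicate (playing the role of the binary feature) and a weight (playing the role of the numeric-indifferent preference value), and the names `q_value` and `rules` are invented for the example.

```python
def q_value(rules, state, operator):
    """Sum the weights of every rule whose predicate matches (state, operator).

    A matching rule contributes weight * 1, a non-matching rule weight * 0,
    which is the binary-feature linear sum described in the text.
    """
    return sum(weight for matches, weight in rules if matches(state, operator))

# Three hypothetical RL rules over a toy state with a single field "x".
rules = [
    (lambda s, a: s["x"] == 0 and a == "left", 0.5),   # one specific pair
    (lambda s, a: a == "left", -0.25),                 # any state, operator left
    (lambda s, a: s["x"] == 1, 2.0),                   # any operator in x == 1
]

q_left_at_0 = q_value(rules, {"x": 0}, "left")
q_right_at_1 = q_value(rules, {"x": 1}, "right")
```

Giving every state-operator pair exactly one rule that matches only that pair reduces this sum to a table lookup, which is the table-based case the text describes.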
46. in which the goal of the problem solving is to resolve the impasse. Thus, in the substate, operators will be selected and applied in an attempt either to discover which of the tied operators should be selected, or to apply the selected operator piece by piece. The substate is often called a subgoal because it exists to resolve the impasse, but is sometimes called a substate because the representation of the subgoal in Soar is as a state.
The initial state in the subgoal contains a complete description of the cause of the impasse, such as the operators that could not be decided among (or that there were no operators proposed) and the state that the impasse arose in. From the perspective of the new state, the latter is called the superstate. Thus, the superstate is part of the substructure of each state, represented by the Soar architecture using the superstate attribute. (The initial state, created in the 0th decision cycle, contains a superstate attribute with the value of nil; the top-level state has no superstate.)
The knowledge to resolve the impasse may be retrieved by any type of problem solving, from searching to discover the implications of different decisions, to asking an outside agent for advice. There is no a priori restriction on the processing, except that it involves applying operators to states.
In the substate, operators can be selected and applied as Soar attempts to solve the subgoal. (The ope
47. inconsistencies automatically. The knowledge designer then does not have to consider potential inconsistencies between local, o-supported WMEs and the context.
The following sections describe further how the GDS works and how to use the GDS in behavior systems, as well as how the GDS is implemented in the Soar kernel.

Behavior-level view of the Goal Dependency Set
This section discusses what the GDS does, and how that impacts production knowledge design and implementation.

[Figure E.1: Simplified representation of the context dependencies (above the line), local o-supported WMEs (below the line), and the generation of a result. In Soar 7, this situation led to non-contemporaneous constraints in the chunk that generates 3. Legend: I-Supported Feature, O-Supported Feature.]

Operation of the Goal Dependency Set
Whenever a feature is created (added to working memory) in the Soar 7 architecture, that feature will persist for some time. The persistence of features may differ with respect to how long the features remain in memory and, more importantly, what circumstances cause the feature to be removed. The Soar 7 architecture utilizes three primary types of persistence: i-support, o-support, and c-support.
The weakest persistence is instantiation support. An i-supported feature exists in memory only as long as the production which led to the feature's creation remains instantiated.
48. is important to note that watch level 0 turns off ALL watch options, including backtracing, indifferent selection, and learning. However, the other watch levels do not change these settings. That is, if any of these settings is changed from its default, it will retain its new setting until it is either explicitly changed again or the watch level is set to 0.

Watching Productions
By default, the names of the productions are printed as each production fires and retracts (at watch levels 3 and higher). However, it may be more helpful to watch only a specific type of production. The tracing of firings and retractions of productions can be limited to only certain types by the use of the following flags:

    Option Flag             Argument to Option    Description
    -D, --default           remove (optional)     Control only default productions as they fire and retract
    -u, --user              remove (optional)     Control only user productions as they fire and retract
    -c, --chunks            remove (optional)     Control only chunks as they fire and retract
    -j, --justifications    remove (optional)     Control only justifications as they fire and retract
    -T, --template          remove (optional)     Soar-RL template firing trace

Note: The pwatch command is used to watch individual productions specified by name, rather than watch a type of productions, such as --user.
Additionally, when watching productions, users may set the level of detail to be displayed
49. it is on top of; • the destination of the block (the thing it will be on top of). [Figure 2.2: The initial state and goal of the blocks-world task.] [Figure 2.3: An abstract illustration of the initial state of the blocks world as working memory objects. At this stage of problem solving, no operators have been proposed or selected. S1 is a state, has a problem-space blocks, has things B1, B2, B3, and T1, has ontops O1, O2, and O3, and has no operator. B1, B2, and B3 are blocks named A, B, and C, and each is clear. T1 is a table named table and is clear. O1 has top-block B1 and bottom-block T1; O2 has top-block B2 and bottom-block T1; O3 has top-block B3 and bottom-block T1.] The goal in this task is to stack the blocks so that C is on the table, with block B on block C, and block A on top of block B. 2.1.3 Representation of States, Operators, and Goals. The initial state in our blocks-world task, before any operators have been proposed or selected, is illustrated in Figure 2.3. A state can have only one operator at a
50. learn, used to turn learning off. Learning can be set to on or off. When learn is on, chunks are built. When learn is off, chunks are not built. 2. Learning is set to bottom-up and a chunk has already been built for a subgoal of the state that generated the results. (See Section 8.4 on page 159 for details of learn, used to set learning to bottom-up.) With bottom-up learning, chunks are learned only in states in which no subgoal has yet generated a chunk. In this mode, chunks are learned only for the "bottom" of the subgoal hierarchy and not the intermediate levels. With experience, the subgoals at the bottom will be replaced by the chunks, allowing higher-level subgoals to be chunked. 3. The learning flag through-local-negations is disabled and the result is dependent on a test for the negation of a subgoal WME. Testing a local negation can result in an overgeneral chunk (see Section 4.6 on page 78). In this mode, such chunks are not created. 4. The chunk duplicates a production or chunk already in production memory. In some rare cases, a duplicate production will not be detected because the order of the conditions or actions is not the same as an existing production. 5. The augmentation (^quiescence t) of the substate that produced the result is backtraced through. This mechanism is motivated by the chunking-from-exhaustion problem, where the results of a subgoal are dependent on the exhaustion
51. <block1> ^destination <block2>) (<block1> ^name <block1-name>) (<block2> ^name <block2-name>) --> (write (crlf) |Moving Block: | <block1-name> | to: | <block2-name>)}
could be written as:
sp {blocks-world*monitor*move-block
   (state <s> ^operator <o>)
   (<o> ^name move-block
        ^moving-block.name <block1-name>
        ^destination.name <block2-name>)
-->
   (write (crlf) |Moving Block: | <block1-name> | to: | <block2-name>)}
Attribute-path notation yields shorter productions that are easier to write, less prone to errors, and easier to understand. When attribute-path notation is used, Soar internally expands the conditions into the multiple Soar objects, creating its own variables as needed. Therefore, when you print a production (using the print command), the production will not be represented using attribute-path notation. Negations and attribute-path notation. A negation may be used with attribute-path notation, in which case it amounts to a negated conjunction. For example, the production:
sp {blocks*negated-conjunction-example
   (state <s> ^name top-state)
  -{(<s> ^ontop <on>)
    (<on> ^bottom-object <bo>)
    (<bo> ^type table)}
-->
   (<s> ^nothing-ontop-table true)}
could be rewritten as: sp {blocks*negated-conjunction-example (state <s
52. memory: (<s> ^smem <smem>) (<smem> ^command <smem-c>) (<smem> ^result <smem-r>). As rules augment the command structure in order to access/change semantic knowledge (6.3, 6.4), semantic memory augments the result structure in response. Production actions should not remove augmentations of the result structure directly, as semantic memory will maintain these WMEs. [Figure 6.1: Example long-term identifier with four augmentations.] 6.2 Knowledge Representation. The representation of knowledge in semantic memory is similar to that in working memory (see Section 2.2 on page 13): both include graph structures that are composed of symbolic elements consisting of an identifier, an attribute, and a value. It is important to note, however, key differences: • Currently, semantic memory only supports attributes that are symbolic constants (string, integer, or decimal), but not attributes that are identifiers. • Whereas working memory is a single, connected, directed graph, semantic memory can be disconnected, consisting of multiple directed, connected sub-graphs. Long-term identifiers (LTIs) are defined as identifiers that exist in semantic memory. The specific letter-number combination that labels an LTI (e.g. S5 or C7) is permanently associated with that long-term identifier: any retrievals of the long-term identifier are guaranteed to return the associated letter-
53. n is given, it must be a positive integer and is used to reset the maximum number of allowed nil output cycles. max-nil-output-cycles controls the maximum number of output cycles that generate no output allowed when a run --out command is issued. After this limit has been reached, Soar stops. The default initial setting of n is 15. Examples: The command issued with no arguments returns the max empty output cycles allowed: max-nil-output-cycles. To set the maximum number of empty output cycles in one phase to 25: max-nil-output-cycles 25. See Also: run. multi-attributes: Declare a symbol to be multi-attributed. Synopsis: multi-attributes [symbol [n]]. Options: symbol: any Soar attribute. n: integer greater than 1, estimate of degree of simultaneous values for attribute. Description: This command declares the given symbol to be an attribute which can take on multiple values. The optional n is an integer (greater than 1) indicating an upper limit on the number of expected values that will appear for an attribute. If n is not specified, the value 10 is used for each declared multi-attribute. More informed values will tend to result in greater efficiency. This command is used only to provide hints to the production condition reorderer so it can produce better condition orderings. Better orderings enable the rete network to run faster. This command has no effect on the ac
54. node usage in the Rete net, the large data structure used for efficient matching in Soar. The --max argument reports per-cycle maximum statistics for decision cycle time, working memory changes, and production fires. For example, if Soar runs for three cycles and there were 23 working memory changes in the first cycle, 42 in the second, and 15 in the third, the --max argument would report the highest of these values (42) and the decision cycle in which it occurred (2nd). Statistics about the time spent executing the decision cycle and the number of productions fired are also collected and reported by --max in this manner. --reset zeros out these statistics so that new maximums can be recorded for future runs. The numbers are also zeroed out with a call to init-soar. The --track argument starts tracking the same stats as the --max argument but records all data for each cycle instead of the maximum values. This data can be printed using the --cycle or --cycle-csv arguments. When printing the data with --cycle, it may be sorted using the --sort argument and a column integer. Use negative numbers for descending sort. Issue --stop-track to reset and clear this data. A Note on Timers. The current implementation of Soar uses a number of timers to provide time-based statistics for use in the stats command calculations. These timers are: • total CPU time • total kernel time • phase kernel time (per phase) • phase call
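To make the behavior of the max argument described in this excerpt concrete, here is a minimal Python sketch (not Soar code). The function name `max_per_cycle` and the list-of-counts layout are assumptions made for illustration only; the sketch simply finds the largest per-cycle value and the decision cycle in which it occurred, using the working memory change counts from the example in the text.

```python
def max_per_cycle(values):
    """Return (maximum value, 1-based decision cycle in which it occurred).

    `values` holds one measurement per decision cycle, in order.
    This is a hypothetical helper, not part of Soar.
    """
    if not values:
        raise ValueError("no cycles recorded")
    # Index of the largest value; ties resolve to the earliest cycle.
    best_index = max(range(len(values)), key=lambda i: values[i])
    return values[best_index], best_index + 1


# Working memory changes for three cycles, as in the example above.
wm_changes = [23, 42, 15]
peak, cycle = max_per_cycle(wm_changes)
print(f"max WM changes: {peak} (decision cycle {cycle})")
```

Tracking, by contrast, would keep the whole list rather than only the peak, which is why tracked data can later be sorted by column.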
55. number pair. For clarity, when printed, a long-term identifier is prefaced with the @ symbol (e.g. @S5 or @C7). Also, when presented in a figure, long-term identifiers will be indicated by a double-circle. For instance, Figure 6.1 depicts the long-term identifier @A68, with four augmentations, representing the addition fact of 6 + 7 = 13 (or, rather, 3, carry 1, in context of multi-column arithmetic). 6.2.1 Integrating Long-Term Identifiers with Soar. Integrating long-term identifiers in Soar presents a number of theoretical and implementation challenges. This section discusses the state of integration with each of Soar's memories/learning mechanisms. 6.2.1.1 Working Memory. Long-term identifiers exist as peers with short-term identifiers in working memory. 6.2.1.2 Procedural Memory. Soar's production parser (i.e. the sp command) has been modified to allow specification of long-term identifiers (prefaced with an @ symbol) in any context where a variable is valid. If a rule contains a long-term identifier that is not currently in semantic memory, a fatal error will be raised and Soar will quit. Once added to the rete, the long-term identifier is treated as a constant for matching purposes. If specified as the value of a WME in an action, a long-term identifier will be added to working memory if it does not already exist. There is also preliminary support for chunking over long-term identifiers. It is
56. Contents (excerpt):
4.3 Variablizing Identifiers 77
4.4 Ordering Conditions 77
4.5 Inhibition of Chunks 77
4.6 Problems that May Arise with Chunking 78
4.6.1 Using search control to determine correctness 78
4.6.2 Testing for local negated conditions 78
4.6.3 Testing for the substate 79
4.6.4 Mapping multiple superstate WMEs to one local WME 79
4.6.5 Revising the substructure of a previous result 80
5 Reinforcement Learning 81
5.1 RL Rules 81
5.2 Reward Representation 83
5.3 Updating RL Rule Values 84
5.3.1 Gaps in Rule Coverage 86
5.3.2 RL and Substates 86
5.3.3 Eligibility Traces 88
5.4 Automatic Generation of RL Rules 89
5.4.1 The gp Command 89
5.4.2 Rule Templates 89
5.4.3 Chunking 90
6 Semantic Memory 91
6.1 Working Memory Structure 91
6.2 Knowledge Representation 92
6.2.1 Integrating Long-Term Identifiers with Soar 92
6
57. of alternatives (see Section 4.6 on page 78). If this substate augmentation is encountered when determining the conditions of a chunk, then no chunk will be built for the currently considered action. This is recursive, so that if an un-chunked result is relevant to a second result, no chunk will be built for the second result. This does not prevent the creation of a chunk that would include ^quiescence t as a condition. 6. Learning has been temporarily turned off via a call to the dont-learn production action (described on page 65 in Section 3.3.6.12). This capability is provided for debugging and system development, and it is not part of the theory of Soar. If a result is to be chunked, Soar builds the chunk as soon as the result is created, rather than waiting until subgoal termination. 4.2 Determining Conditions and Actions. Chunking is an experience-based learning mechanism that summarizes as productions the problem solving that occurs within a state. In order to maintain a history of the processing to be used for chunking, Soar builds a trace of the productions that fire in the subgoals. This section describes how the relevant actions are determined, how information is stored in a trace, and finally, how the trace and the actions together determine the conditions for the chunk. In order for the chunk to apply at the appropriate time, its conditions must test exactly those working memory elements that were necessary to produce the resul
58. or how to create Soar applications using multiple interacting agents. A discussion of these topics is provided in a separate document, the SML Quick Start Guide, which is available at the Soar project website (see link below). For novice Soar users, try The Soar 9 Tutorial, which guides the reader through several example tasks and exercises. See Section 1.2 for information about obtaining Soar documentation. 1.2 Contacting the Soar Group. Resources on the Internet. The primary website for Soar is http://sitemaker.umich.edu/soar. Look here for the latest downloads, documentation, and Soar-related announcements, as well as links to information about specific Soar research projects and researchers and a FAQ (list of frequently asked questions) about Soar. Soar kernel development is hosted on Google Code at http://code.google.com/p/soar. This site contains the public subversion repository and the active documentation wiki, and is also where bugs should be reported. For questions about Soar, you may write to the Soar e-mail list at soar-group@lists.sourceforge.net. If you would like to be on this list yourself, visit http://lists.sourceforge.net/lists/listinfo/soar-group. For Those Without Internet Access. If you cannot reach us on the internet, please write to us at the following address: The Soar Group, Artificial Intelligence Laboratory, University of Michigan, 2260 Hayward Street, Ann Arbor, MI 48109-21
59. pair "message-status received" to the identifier (symbol) S1: add-wme S1 ^message-status received. This example adds an attribute/value pair with an acceptable preference to the identifier (symbol) Z2. The attribute is "message" and the value is a unique identifier generated by Soar. Note that since the ^ is optional, it has been left off in this case: add-wme Z2 message * +. Warnings: Be careful how you use this command. It may have weird side effects (possibly even including system crashes). For example, the chunker can't backtrace through WMEs created via add-wme, nor will such WMEs ever be removed through Soar's garbage collection. Manually removing context/impasse WMEs may have unexpected side effects. See Also: remove-wme. capture-input: Store the input WMEs in a file for reloading later. Synopsis: capture-input --open filename [--flush]; capture-input [--query]; capture-input --close.
Options:
filename: Open filename and begin recording input.
-o, --open: Writes captured input to file, overwriting any existing data.
-f, --flush: Writes input to file as soon as it is encountered instead of storing it in RAM and writing when capturing is turned off.
-q, --query: Returns open if input capturing is active or closed if capturing is not active.
-c, --close: Stop capturing input and close the file, writing captured input unless the flush option is given.
Descript
60. perform generic maintenance functions, such as cleaning processed output-link structures. To address this issue, Soar's RL mechanism supports automatic propagation of updates over gaps. For a gap of length n, the Sarsa update is $\delta_t = \alpha \left[ \sum_{i=t}^{t+n} \gamma^{i-t} r_i + \gamma^{n+1} Q(s_{t+n+1}, a_{t+n+1}) - Q(s_t, a_t) \right]$ and the Q-Learning update is $\delta_t = \alpha \left[ \sum_{i=t}^{t+n} \gamma^{i-t} r_i + \gamma^{n+1} \max_{a \in A_{t+n+1}} Q(s_{t+n+1}, a) - Q(s_t, a_t) \right]$. Note that rewards will still be collected during the gap, but they are discounted based on the number of decisions they are removed from the initial RL operator. Gap propagation can be disabled by setting the temporal-extension parameter of the rl command to off. When gap propagation is disabled, the RL rules preceding a gap are updated using $Q(s_{t+1}, a_{t+1}) = 0$. The rl setting of the watch command (see Section 8.3 on page 142) is useful in identifying gaps. 5.3.2 RL and Substates. When an agent has multiple states in its state stack, the RL mechanism will treat each substate independently. As mentioned previously, each state has its own reward-link. When an RL operator is selected in a state S, the RL updates for that operator are only affected by the rewards collected on the reward-link for S and the Q-values of subsequent RL operators selected in S. The only exception to this independence is when a selected RL operator forces an operator-no-change impasse. When this occurs, the number of decision cycles the RL operator at the superstate remains selected is dependent upo
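The gap update for Sarsa mentioned in this excerpt can be sketched in Python. This is a hedged illustration, not Soar's implementation: it assumes the update has the usual temporal-difference shape, namely the learning rate times (discounted sum of the rewards collected across the gap, plus the discounted Q-value of the next RL operator, minus the current Q-value). The function name and parameter names are hypothetical.

```python
def gap_sarsa_delta(rewards, q_current, q_next, *, alpha, gamma):
    """Assumed form of the Sarsa update across a gap.

    rewards   -- rewards collected from the RL operator up to the next
                 RL operator, oldest first (n + 1 values for a gap of n)
    q_current -- Q-value of the RL operator selected before the gap
    q_next    -- Q-value of the first RL operator selected after the gap
    alpha     -- learning rate (assumed parameter)
    gamma     -- discount rate (assumed parameter)
    """
    # Each reward is discounted by how many decisions it is removed
    # from the initial RL operator.
    discounted_rewards = sum((gamma ** k) * r for k, r in enumerate(rewards))
    # The next Q-value is discounted by the full length of the span.
    target = discounted_rewards + (gamma ** len(rewards)) * q_next
    return alpha * (target - q_current)


# Two decisions without RL rules between the two RL operators.
delta = gap_sarsa_delta([1.0, 0.0, 2.0], q_current=0.5, q_next=1.0,
                        alpha=0.5, gamma=0.5)
print(delta)

# With gap propagation disabled, the text says the next Q-value is
# taken to be 0; under this sketch that is a one-reward update.
print(gap_sarsa_delta([1.0], q_current=0.5, q_next=0.0, alpha=0.5, gamma=0.5))
```

For Q-Learning, the same sketch would apply with `q_next` taken as the maximum Q-value over the operators available after the gap.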
61. processing can proceed. An impasse can also become irrelevant if input from the outside world changes working memory, which in turn causes productions to fire that make it possible to select an operator. In all these cases, the impasse is eliminated, but not "resolved", and Soar does not learn in this situation. Regenerating Impasses. An impasse is regenerated when the problem solving in the subgoal becomes inconsistent with the current situation. During problem solving in a subgoal, Soar monitors which aspects of the surrounding situation (the working memory elements that exist in superstates) the problem solving in the subgoal has depended upon. If those aspects of the surrounding situation change, either because of changes in input or because of results, the problem solving in the subgoal is inconsistent, and the state created in response to the original impasse is removed and a new state is created. Problem solving will now continue from this new state. The impasse is not "resolved", and Soar does not learn in this situation. The reason for regeneration is to guarantee that the working memory elements and preferences created in a substate are consistent with higher-level states. As stated above, inconsistency can arise when a higher-level state changes, either as a result of changes in what is sensed in the external environment or from results produced in the subgoal. The problem with inconsistency is that
62. rules generate new rules as patterns occur at run time. Unfortunately, this incurs a high run-time cost. If all possible values are known in advance, then the rules can be generated using gp at source time, thus allowing code to run faster. gp is not appropriate when all possible values are not known, or if the total number of possible rules is very large and the system is likely to encounter only a small subset at run time. It is also possible to combine gp and :template, e.g. if some of the values are known and not others. This should reduce the run-time cost of :template. There is nothing that actually restricts gp to being used for RL, although for non-RL rules a disjunction list (using << and >>) is better where it can be used. More esoteric uses may include multiple bracketed value lists inside a disjunction list, or even variables in bracketed value lists. Each rule generated by gp has *integer appended to its name, where integer is some incrementing number. Examples. Template version of rule:
sp {water-jug*fill
   :template
   (state <s1> ^name water-jug ^operator <op> + ^jug <j1> <j2>)
   (<op> ^name fill ^fill-jug.volume <fvol>)
   (<j1> ^volume 3 ^contents <c1>)
   (<j2> ^volume 5 ^contents <c2>)
-->
   (<s1> ^operator <op> = 0)}
gp version of rule (generates 144 rules): gp {water-jug*fill
63. see Section 8.1 on Page 117. 3.3.4 Comments (optional). Productions may contain comments, which are not stored in Soar when the production is loaded and are therefore not printed out by the print command. A comment is begun with a pound sign character (#) and ends at the end of the line. Thus, everything following the # is not considered part of the production, and comments that run across multiple lines must each begin with a #. For example:
sp {blocks-world*propose*move-block
   (state <s> ^problem-space blocks
              ^thing <thing1> {<> <thing1> <thing2>}
              ^ontop <ontop>)
   (<thing1> ^type block ^clear yes)
   (<thing2> ^clear yes)
#  (<ontop> ^top-block <thing1>
#           ^bottom-block <> <thing2>)
-->
   (<s> ^operator <o> +)
   (<o> ^name move-block          # you can also use in-line comments
        ^moving-block <thing1>
        ^destination <thing2>)}
When commenting out conditions or actions, be sure that all parentheses remain balanced outside the comment. External comments. Comments may also appear in a file with Soar productions, outside the curly braces of the sp command. Comments must either start a new line with a # or start with ;#. In both cases, the comment runs to the end of the line.
# imagine that this is part of a "Soar program" that contains
# Soar productions as well as some other code.
source block
64. set of actions. If the conditions of a production match working memory, the production fires, and the actions are performed. 2.3.1 The structure of a production. In the simplest form of a production, conditions and actions refer directly to the presence (or absence) of objects in working memory. For example, a production might say: CONDITIONS: block A is clear; block B is clear. ACTIONS: suggest an operator to move block A ontop of block B. This is not the literal syntax of productions, but a simplification. The actual syntax is presented in Chapter 3. The conditions of a production may also specify the absence of patterns in working memory. For example, the conditions could also specify that "block A is not red" or "there are no red blocks on the table". But since these are not needed for our example production, there are no examples of negated conditions for now. The order of the conditions of a production does not matter to Soar, except that the first condition must directly test the state. Internally, Soar will reorder the conditions so that the matching process can be more efficient. This is a mechanical detail that need not concern most users. However, you may print your productions to the screen or save them in a file; if they are not in the order that you expected them to be, it is likely that the conditions have been reordered by Soar. 2.3.1.1 Variables in productions and multiple
65. six distinct (but similar) operators for the initial state, as illustrated in Figure 2.5. These operators correspond to the six different actions that are possible given the initial state. 2.1.5 Comparing candidate operators: Preferences. The second step Soar takes in selecting an operator is to evaluate or compare the candidate operators. In Soar, this is done via rules that test the proposed operators and the current state, and then create preferences. Preferences assert the relative or absolute merits of the candidate operators. For example, a preference may say that operator A is a "better" choice than operator B at this particular time, or a preference may say that operator A is the "best" thing to do at this particular time. 2.1.6 Selecting a single operator. Soar attempts to select a single operator based on the preferences available for the candidate operators. There are four different situations that may arise: 1. The available preferences unambiguously prefer a single operator. 2. The available preferences suggest multiple operators, and prefer a subset that can be selected from randomly. 3. The available preferences suggest multiple operators, but neither case 1 nor case 2 above holds. 4. The available preferences do not suggest any operators. In the first case, the preferred operator is selected. In the second case, one of the subset is selected randomly. In the third and fourth cases, Soar has reac
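The four situations listed above can be sketched as a small Python function. This is a deliberately simplified illustration and not Soar's decision procedure: it assumes the preferences have already been reduced to a set of `candidates` and a `preferred` subset that is safe to choose from at random, and it reads the cut-off final sentence as meaning that cases 3 and 4 end in an impasse. All names are hypothetical.

```python
import random


def select_operator(candidates, preferred, rng=random):
    """Classify the situation and pick an operator where possible.

    candidates -- set of operators suggested by the preferences
    preferred  -- subset of candidates that the preferences single out
                  (assumed to be precomputed for this sketch)
    Returns ("selected", operator) or ("impasse", None).
    """
    if not candidates:
        return ("impasse", None)                      # case 4: nothing suggested
    if len(preferred) == 1:
        return ("selected", next(iter(preferred)))    # case 1: unambiguous
    if len(preferred) > 1:
        # case 2: any member of the preferred subset may be chosen at random
        return ("selected", rng.choice(sorted(preferred)))
    return ("impasse", None)                          # case 3: no usable subset


print(select_operator({"move-A-to-B", "move-B-to-C"}, {"move-A-to-B"}))
print(select_operator({"move-A-to-B", "move-B-to-C"}, set()))
```

Sorting the preferred subset before the random choice only makes the sketch reproducible with a seeded generator; it has no counterpart in the text.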
66. space. Soar programs are implicitly organized in terms of problem spaces because the conditions for proposing operators will restrict an operator to be considered only when it is relevant. The complete problem space for the blocks world is shown in Figure 2.6. Typically, when Soar solves a problem in this problem space, it does not explicitly generate all of the states, examine them, and then create a path. Instead, Soar is in a specific state at a given time (represented in working memory), attempting to select an operator that will move it to a new state. It uses whatever knowledge it has about selecting operators given the current situation, and if its knowledge is sufficient, it will move toward its goal. The same problem could be recast in Soar as a planning problem, where the goal is to develop a plan to solve the problem, instead of just solving the problem. In that case, a state in Soar would consist of a plan, which in turn would have representations of blocks-world states and operators from the original space. The operators would perform editing operations on the plan, such as adding new blocks-world operators, simulating those operators, etc. In both formulations of the problem, Soar is still applying operators to generate new states; it is just that the states and operators have different content. [Figure 2.6: the blocks-world problem space, with states connected by move-block operators.]
67. state is created. States are created during initialization (the first state) or because of an impasse (a substate). 3. The decision procedure creates the operator augmentation of the state based on preferences. This records the selection of the current operator. 4. External I/O systems create working memory elements on the input-link for sensory data. The elements in working memory are removed in six different ways: 1. The decision procedure automatically removes all state augmentations it creates when the impasse that led to their creation is resolved. 2. The decision procedure removes the operator augmentation of the state when that operator is no longer selected as the current operator. 3. Production actions that use reject preferences remove working memory elements that were created by other productions. 4. The architecture automatically removes i-supported WMEs when the productions that created them no longer match. 5. The I/O system removes sensory data from the input-link when it is no longer valid. 6. The architecture automatically removes WMEs that are no longer linked to a state (because some other WME has been removed). For the most part, the user is free to use any attributes and values that are appropriate for the task. However, states have special augmentations that cannot be directly created, removed, or modified by rules. These include the augmentations created when a state is created, and the state's operator augmenta
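Removal case 6 in the list above (WMEs that are no longer linked to a state) amounts to a reachability check over the working memory graph. The following Python sketch illustrates that idea only; it is not how the Soar kernel is implemented. It assumes WMEs are plain (identifier, attribute, value) triples, and the function name `prune_unlinked` is hypothetical.

```python
def prune_unlinked(wmes, state_ids):
    """Keep only the WMEs whose identifier is reachable from a state.

    wmes      -- list of (identifier, attribute, value) triples
    state_ids -- identifiers of the states, which anchor the graph
    """
    # A value counts as an identifier if some WME uses it as one.
    identifiers = {ident for ident, _attr, _value in wmes} | set(state_ids)
    reachable = set(state_ids)
    frontier = list(state_ids)
    while frontier:
        current = frontier.pop()
        for ident, _attr, value in wmes:
            if ident == current and value in identifiers and value not in reachable:
                reachable.add(value)
                frontier.append(value)
    return [wme for wme in wmes if wme[0] in reachable]


wmes = [
    ("S1", "thing", "B1"),
    ("B1", "name", "A"),
    ("B2", "name", "B"),   # B2 is not linked to S1 by any WME
]
print(prune_unlinked(wmes, ["S1"]))
```

In the example, removing a WME such as (S1 ^thing B2) at some earlier point would leave B2's augmentations unlinked, so they are dropped as well.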
68. stats. Options:
-m, --memory: report usage for Soar's memory pools
-r, --rete: report statistics about the rete structure
-s, --system: report the system (agent) statistics (default)
-M, --max: report the per-cycle maximum statistics (decision cycle time, WM changes, production fires)
-R, --reset: zero out the per-cycle maximum statistics reported by the --max command
-t, --track: begin tracking the per-cycle maximum statistics reported by --max for each cycle (instead of only the max value)
-T, --stop-track: stop and clear tracking of the per-cycle maximum statistics
-c, --cycle: print out collected per-cycle maximum statistics saved by --track in human-readable form
-C, --cycle-csv: print out collected per-cycle maximum statistics saved by --track in comma-separated form
-S, --sort N: sort the tracked cycle stats by column number N (see table below)
Tracked Cycle Stats Columns (for use with the --sort option; negative values sort descending):
0: use default sort
1, -1: sort by decision cycle (use negative for descending)
2, -2: sort by DC time (use negative for descending)
3, -3: sort by WM changes (use negative for descending)
4, -4: sort by production firings (use negative for descending)
Description: This command prints Soar internal statistics. The argument indicates the component of interest; --system is used by default. With the sy
69. symbol state. All other conditions and actions must be linked directly or indirectly to this condition. This linkage may be direct to the state, or it may be indirect, through objects specified in the conditions. If the identifiers of the actions are not linked to the state, a warning is printed when the production is parsed, and the production is not stored in production memory. In the actions of the example production shown in Figure 3.2, the operator preference is directly linked to the state and the remaining actions are linked indirectly via the operator preference. Although all of the attribute tests in the template above are followed by value tests, it is possible to test for only the existence of an attribute and not test any specific value by just including the attribute and no value. Another exception to the above template is operator preferences, which have the following structure, where a plus sign follows the value test: (state identifier-test ^operator value1-test + ...). In the remainder of this section, we describe the different tests that can be used for identifiers, attributes, and values. The simplest of these is a constant, where the constant specified in the attribute or value must match the same constant in a working memory element. 3.3.5.2 Variables in productions. Variables match against constants in working memory elements in the identifier, attribute, or value positions. Variables can be further constrained by addit
70. Grammar for the condition side (excerpt):
<attr_value_tests> ::= [-] ^ <attr_test> [.<attr_test>]* <value_test>*
<attr_test> ::= <test>
<value_test> ::= <test> [+] | <conds_for_one_id> [+]
<test> ::= <conjunctive_test> | <simple_test>
<conjunctive_test> ::= { <simple_test>+ }
<simple_test> ::= <disjunction_test> | <relational_test>
<disjunction_test> ::= << <constant>+ >>
<relational_test> ::= [<relation>] <single_test>
<relation> ::= <> | < | > | <= | >= | = | <=>
<single_test> ::= <variable> | <constant>
<variable> ::= < <sym_constant> >
<constant> ::= <sym_constant> | <int_constant> | <float_constant>
Notes on the Condition Side: • In an <id_test>, only a <variable> may be used in a <single_test>.
B.1.2 Grammar for Action Side. Below is a grammar for the action sides of productions:
<rhs> ::= <rhs_action>*
<rhs_action> ::= ( <variable> <attr_value_make>+ ) | <func_call>
<func_call> ::= ( <func_name> <rhs_value>* )
<func_name> ::= <sym_constant> | + | -
<rhs_value> ::= <constant> | <func_call> | <variable>
<attr_value_make> <variable_or_sym_constant> <value_make> <preference specifier> <unary pref> <unary or binary pref>
71. that are present and firing with those values. 4.6 Problems that May Arise with Chunking. One of the weaknesses of Soar is that chunking can create overgeneral productions that apply in inappropriate situations, or overspecific productions that will never fire. These problems arise when chunking cannot accurately summarize the processing that led to the creation of a result. Below is a description of five known problems in chunking. 4.6.1 Using search control to determine correctness. Overgeneral chunks can be created if a result of problem solving in a subgoal is dependent on search-control knowledge. Recall that desirability preferences, such as better, best, and worst, are not included in the traces of problem solving used in chunking (Section 4.2 on page 74). In theory, these preferences do not affect the validity of search. In practice, however, a Soar program can be written so that search control does affect the correctness of search. Here are two examples: 1. Some of the tests for correctness of a result are included in productions that prefer operators that will produce correct results. The system will work correctly only when those productions are loaded. 2. An operator is given a worst preference, indicating that it should be used only when all other options have been exhausted. Because of the semantics of worst, this operator will be selected after all other operators; however, if this operator then produces a
72. the default.
-a, --apply: Select the apply phase
-d, --decision: Select the decision phase
-i, --input: Select the input phase
-o, --output: Select the output phase
-p, --proposal: Select the proposal phase
Description: When running by decision cycle, it can be helpful to have agents stop at a particular point in their execution cycle. This command allows the user to control which phase Soar stops in. The precise definition is that "running for n decisions and stopping before phase ph" means to run until the decision cycle counter has increased by n and then stop when the next phase is ph. The phase sequence (as of this writing) is: input, proposal, decision, apply, output. Stopping after one phase is exactly equivalent to stopping before the next phase. On initialization, Soar defaults to stopping before the input phase (or after the output phase, however you like to think of it). Setting the stop phase applies to all agents.
Examples:
set-stop-phase -Bi // stop before input phase
set-stop-phase -Ad // stop after decision phase (before apply phase)
set-stop-phase -d // stop before decision phase
set-stop-phase --after --output // stop after output phase
set-stop-phase // reports the current stop phase
smem: Control the behavior of semantic memory. Synopsis: smem; smem -g --get <parameter>; smem -s --set <parameter> <value>; smem -S --s
73. the dependency set because they are the superstate gt The implementation is slightly different trading additional memory overhead to avoid scanning all the goal dependency sets after each WM change See the next section In addition superstate WMEs can also include context slot preferences which are represented in the architecture as working memory elements 225 Dependency Set to ti A D t2 A B C D t A B C D Figure E 2 The Dependency Set in Soar 8 features that led to 1 which in turn led to 2 and finally 4 However because item A was previously added to the dependency set at t it is unnecessary to add it again Local O Supported Features The dependencies of a local o supported feature have al ready been added to the state s GDS Thus tests of local o supported WMEs do not require additions to the dependency set In Figure E 2 the creation of element 5 does not change the dependency set because it is dependent only upon persistent items 3 and 4 whose features had been previously added to the GDS In Soar 8 any change to the current dependency set will cause the retraction of all subgoal structure Thus any time after time t either the D to D or A to A transition would cause the removal of the entire subgoal The E to E transition causes no retraction because E is not in the goal s dependency set The role of th
74. the total effect of those rule firings will be collapsed into creating a single WME in the substate, because working memory is represented as a set. If this WME is then tested to create a result on the superstate, the chunk that is subsequently created will be overgeneral. While the original subgoal processing created only one result, the chunk will create a distinct result for each superstate structure originally tested. This is because the desired behavior cannot be reduced to a single rule. Solution: If this type of behavior is needed, the single WME should go in the top state so that the chunks built can similarly map multiple structures to one. 4.6.5 Revising the substructure of a previous result. This can occur when a subgoal creates a local structure which is then linked to a superstate, becoming a result. A new WME added to this structure is also a result, as it is linked to the superstate. However, if that WME is created with a rule that matches the local state only, not the superstate, Soar cannot build a chunk for the result, as it is unable to determine how the new WME is linked to the superstate. For example, assume that an agent builds up a structure consisting of an identifier called thing attached to a substate and then adds property foo as an augmentation to thing. If the agent now matches thing on the substate and creates a WME on a superstate linked to the same identifier, that identifier a
75. time, and the operator is represented as substructure of the state. [Figure 2.4: An abstract illustration of working memory in the blocks world after the first operator has been selected.] A state may also have as substructure a number of potential operators that are in consideration; however, these suggested operators should not
76. 1 I2 block B2 I2 block B3 B1 x location 1 B1 y location 0 B1 color red B2 x location 2 B2 y location 0 B2 color blue B3 x location 3 B3 y location 0 B3 color yellow The A notation in the example is used to indicate the working memory elements that are created by the architecture and not by the input function This configuration of blocks corresponds to all blocks on the table as illustrated in the initial state in Figure 2 2 Then during the Apply Phase of the execution cycle Soar productions could respond to an operator such as move the red block ontop of the blue block by creating a structure on the output link such as 3 5 SOAR I O INPUT AND OUTPUT IN SOAR 71 move block moving block yellow eo gt Figure 3 4 An example portion of the output link for the blocks world task S1 io I1 A I1 output link I3 A I3 name move block I3 moving block B1 I3 x destination 2 I3 y destination 1 B1 x location 1 B1 y location 0 B1 color red The A notation is used to indicate the working memory elements that are created by the architecture and not by productions An output function would look for specific structure in this output link and translate this into the format required by the external program that controls the robotic arm Movement by the robotic arm would lead to changes
77. 1 1 4000019 S1 operator 01000001 1 3 S1 reward link R1 1 8 4 CONFIGURING SOAR S RUNTIME PARAMETERS 181 8 S1 smem S2 1 2 S1 superstate nil 1 14 S1 top state S1 1 1 S1 type state 1 The bracketed values are activation To get the history of an individual element wma history 18 history 60 5999999 first d1 6 d1000000 1 da999999 2 a999998 3 d999997 4 a999996 5 a999995 6 a999994 7 d999993 8 d999992 9 d999991 10 DAA ADNADAADAA AD YN COO OO0O O O considering WME for decay d1019615 This shows the last 60 references of 5999999 in total where the first occurred at decision cycle 1 For each reference it says how many references occurred in the cycle such as 6 at decision 1000000 which was one cycle ago at the time of executing this command Note that references during the current cycle will not be reflected in this command or computed activation value until the end of output phase If forgetting is on this command will also display the cycle during which the WME will be considered for decay Even if the WME is not referenced until then this is not necessarily the cycle at which the WME will be forgotten However it is guaranteed that the WME will not be forgotten before this cycle Parameters The wma command uses the get set lt parameter gt lt value gt convention rather than individual switches for each parameter Run
78. 191 Normally users wish to save only production memory Note that justifications cannot be present when saving the Rete net Issuing an init soar before saving a Rete net will remove all justifications and working memory elements If the filename contains a suffix of Z then the file is compressed automatically when it is saved and uncompressed when it is loaded Compressed files may not be portable to another platform if that platform does not support the same uncompress utility See Also excise init soar set library location Set the top level directory containing demos help etc Synopsis set library location directory Options directory The new desired library location Description Invoke with no arguments to query what the current library location is The library location should contain at least the help subdirectory and the command names file for help to work See Also help source Load and evaluate the contents of a file Synopsis source options filename 192 CHAPTER 8 THE SOAR USER INTERFACE Options filename The file of Soar productions and commands to load a all Enable a summary for each file sourced d disable Disable all summaries v verbose Print excised production names Description Load and evaluate the contents of a file The filename can be a relative path or a fully qualified path source will generate an implicit push to t
79. 207 208 APPENDIX A THE BLOCKS WORLD PROGRAM Note that this production will fire exactly once and will never retract sp blocks world elaborate initial state state lt s gt superstate nil gt lt s gt problem space blocks thing lt block A gt lt block B gt lt block C gt lt table gt ontop lt ontop A gt lt ontop B gt lt ontop C gt lt block A gt type block name A lt block B gt type block name B lt block C gt type block name C lt table gt type table name TABLE lt ontop A gt top block lt block A gt bottom block lt table gt lt ontop B gt top block lt block B gt bottom block lt table gt lt ontop C gt top block lt block C gt bottom block lt table gt write crlf Initial state has A B and C on the table HHHHHHAEHHHHHHHHHRHAAHAA ARH HHRR REA A EERE HHH HAHAHA REHEARSE State elaborations keep track of which objects are clear There are two productions one for blocks and one for the table HHFHHHHEHHHHHHHHHRAAAEAA ARE HHRR RRR H EERE HHH ARH H AAA RRR ARS HHHHHHHHHHAR REE HHHHAAAEHA AEE HHR RRR E EERE HHA H HAAR RH Assert table always clear The conditions establish that 1 The state has a problem space named blocks 2 The state has a thing of type table The action 1 creates an acceptable preference for an attribute value pair asserting the table is clear This producti
80. 21 USA 1 3 A Note on Different Platforms and Operating Sys tems Soar runs on a wide variety of platforms including Linux Unix although not heavily tested Mac OS X and Windows 7 Vista and XP and probably 2000 and NT This manual documents Soar generally although all references to files and directories use Unix format conventions rather than Windows style folders Chapter 2 The Soar Architecture This chapter describes the Soar architecture It covers all aspects of Soar except for the specific syntax of Soar s memories and descriptions of the Soar user interface commands This chapter gives an abstract description of Soar It starts by giving an overview of Soar and then goes into more detail for each of Soar s main memories working memory production memory and preference memory and processes the decision procedure learning and input and output 2 1 An Overview of Soar The design of Soar is based on the hypothesis that all deliberate goal oriented behavior can be cast as the selection and application of operators to a state A state is a representation of the current problem solving situation an operator transforms a state makes changes to the representation and a goal is a desired outcome of the problem solving activity As Soar runs it is continually trying to apply the current operator and select the next operator a state can have only one operator at a time until the goal has been achieved The se
81. 3.3.5.12 Structured value notation. Another convenience that eliminates the use of intermediate variables is structured value notation. Syntactically, the attributes and values of a condition may be written where a variable would normally be written. The attribute-value structure is delimited by parentheses. Using structured value notation, the production in Figure 3.2 (on page 39) may also be written as:
sp {blocks-world*propose*move-block
   (state <s> ^problem-space blocks
              ^thing <thing1> {<> <thing1> <thing2>}
              ^ontop (^top-block <thing1>
                      ^bottom-block <> <thing2>))
   (<thing1> ^type block ^clear yes)
   (<thing2> ^clear yes)
-->
   (<s> ^operator <o> +)
   (<o> ^name move-block
        ^moving-block <thing1>
        ^destination <thing2>)}
Thus, several conditions may be collapsed into a single condition. Using variables within structured value notation: Variables are allowed within the parentheses of structured value notation to specify an identifier to be matched elsewhere in the production. For example, the variable <ontop> could be added to the conditions (although it is not referenced again, so this is not helpful in this instance): sp {blocks-world*propose*move-block (state <s> ^problem-space blocks ^thing <thing1> {<> <thing1> <thing2>} ^ontop (<ontop>
82. 3 Storing Semantic Knowledge; 6.3.1 User-Initiated Storage; 6.3.2 Storage Location; 6.4 Retrieving Semantic Knowledge; 6.4.1 Non-Cue-Based Retrievals; 6.4.2 Cue-Based Retrievals; 6.5 Performance; 6.5.1 Performance Tweaking. 7 Episodic Memory: 7.1 Working Memory Structure; 7.2 Episodic Storage; 7.2.1 Episode Contents; 7.2.2 Storage Location; 7.3 Retrieving Episodes; 7.3.1 Cue-Based Retrievals; 7.3.2 Absolute Non-Cue-Based Retrieval; 7.3.3 Relative Non-Cue-Based Retrieval; 7.3.4 Retrieval Meta-Data; 7.4 Performance; 7.4.1 Performance Tweaking. 8 The Soar User Interface: 8.1 Basic Commands for Running Soar; 8.2 Examining Memory; 8.3 Configuring Trace Information and Debugging; 8.4 Configur
83. 7 3 and a discussion of performance 7 4 The detailed behavior of episodic memory is determined by numerous parameters that can be controlled and configured via the epmem command Please refer to the documentation for that command in Section 8 4 on page 149 7 1 Working Memory Structure Upon creation of a new state in working memory see Section 2 6 1 on page 24 Section 3 4 on page 66 the architecture creates the following augmentations to facilitate agent interaction with episodic memory lt s gt epmem lt e gt lt e gt command lt e c gt lt e gt result lt e r gt lt e gt present id As rules augment the command structure in order to retrieve episodes 7 3 episodic memory augments the result structure in response Production actions should not remove augmen tations of the result structure directly as episodic memory will maintain these WMEs The value of the present id augmentation is an integer and will update to expose to the agent the current episode number This information is identical to what is available via the time statistic see Section 8 4 on page 149 and the present id retrieval meta data 7 3 4 99 100 CHAPTER 7 EPISODIC MEMORY 7 2 Episodic Storage Episodic memory records new episodes without deliberate action consideration by the agent The timing and frequency of recording new episodes is controlled by the phase and trigger parameters The phase parameter sets the phase in the
84. B is indifferent to C, A and C will not be indifferent to one another unless there is a preference that A is indifferent to C (or C and A are both indifferent to all competing values). Acceptable (+): An acceptable preference states that a value is a candidate for selection. All values, except those with require preferences, must have an acceptable preference in order to be selected. If there is only one value with an acceptable preference (and none with a require preference), that value will be selected as long as it does not also have a reject or a prohibit preference. Reject (-): A reject preference states that the value is not a candidate for selection. Better (>), Worse (<): A better or worse preference states, for the two values involved, that one value should not be selected if the other value is a candidate. Better and worse allow for the creation of a partial ordering between candidate values. Better and worse are simple inverses of each other, so that A better than B is equivalent to B worse than A. Best (>): A best preference states that the value may be better than any competing value (unless there are other competing values that are also best). If a value is best (and not rejected, prohibited, or worse than another), it will be selected over any other value that is not also best (or required). If two such values are best, then any remaining preferences for those candidates (worst, indi
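Two of the rules above lend themselves to a small model: better and worse are inverses, and indifference is not transitive. The following Python sketch is an assumption-laden illustration, not Soar's decision procedure; the class PreferenceStore and its methods are hypothetical, and it covers only binary indifference, ignoring the case where values are indifferent to all competing values.

```python
class PreferenceStore:
    """Toy store for binary preferences between candidate operators."""

    def __init__(self):
        self.better = set()       # ordered pairs (a, b): a is better than b
        self.indifferent = set()  # unordered pairs {a, b}

    def add_better(self, a, b):
        self.better.add((a, b))

    def add_worse(self, a, b):
        # "a worse than b" is stored as its inverse, "b better than a".
        self.better.add((b, a))

    def add_indifferent(self, a, b):
        self.indifferent.add(frozenset((a, b)))

    def is_better(self, a, b):
        return (a, b) in self.better

    def is_indifferent(self, a, b):
        # Only explicitly stated pairs count: no transitive closure is taken.
        return frozenset((a, b)) in self.indifferent


prefs = PreferenceStore()
prefs.add_worse("O2", "O1")        # equivalent to O1 better than O2
prefs.add_indifferent("O3", "O4")
prefs.add_indifferent("O4", "O5")

print(prefs.is_better("O1", "O2"))       # True
print(prefs.is_indifferent("O3", "O5"))  # False: not implied by the two pairs
```

Storing worse as the inverse of better keeps one relation to query, which mirrors the statement that the two are simple inverses.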
85. [Figure 2.6: The problem space in the blocks world includes all operators that move blocks from one location to another and all possible configurations of the three blocks.] The remaining sections in this chapter describe the memories and processes of Soar: working memory, production memory, preference memory, Soar's execution cycle, the decision procedure, learning, and how input and output fit in. 2.2 Working memory: The Current Situation. Soar represents the current problem-solving situation in its working memory. Thus, working memory holds the current state and operator and is Soar's short-term knowledge, reflecting the current knowledge of the world and the status in problem solving. Working memory contains elements called working memory elements, or WMEs for short. Each WME contains a very specific piece of information; for example, a WME might say that B1 is a block. Several WMEs collectively may provide more information about the same object, for example, B1 is a block, B1 is named A, B1 is on the table
86. If there are multiple valid reward signals, their values are summed into a single reward signal. As an example, consider the following state: (S1 ^reward-link R1) (R1 ^reward R2) (R2 ^value 1.0) (R1 ^reward R3) (R3 ^value -0.2). In this state there are two reward signals, with values 1.0 and -0.2. They will be summed together for a total reward of 0.8, and this will be the value given to the RL update algorithm. There are two reasons for requiring the intermediate identifier. The first is so that multiple reward signals with the same value can exist simultaneously. Since working memory is a set, multiple WMEs with identical values in all three positions (identifier, attribute, value) cannot exist simultaneously. Without an intermediate identifier, specifying two rewards with the same value would require a WME structure such as (S1 ^reward-link R1) (R1 ^reward 1.0) (R1 ^reward 1.0), which is invalid. With the intermediate identifier, the rewards would be specified as (S1 ^reward-link R1) (R1 ^reward R2) (R2 ^value 1.0) (R1 ^reward R3) (R3 ^value 1.0), which is valid. The second reason for requiring an intermediate identifier in the reward signal is so that the rewards can be augmented with additional information, such as their source or how long they have existed. Although this information will be ignored by the RL mechanism, it can be useful to the agent or programmer. For e
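The reward structure above can be modeled with working memory as a Python set of (identifier, attribute, value) triples. This is a minimal sketch for illustration only: the function total_reward is hypothetical, the attribute names are written with hyphens as an assumption about the stripped syntax, and the second value is taken to be -0.2 because the stated total is 0.8.

```python
def total_reward(working_memory, state):
    """Sum every numeric value reachable via state -> reward-link -> reward -> value."""
    total = 0.0
    for ident, attr, link in working_memory:
        if ident != state or attr != "reward-link":
            continue
        for ident2, attr2, reward in working_memory:
            if ident2 != link or attr2 != "reward":
                continue
            for ident3, attr3, value in working_memory:
                if ident3 == reward and attr3 == "value" and isinstance(value, (int, float)):
                    total += value
    return total


# The example state from the text, as a set of triples.
wm = {
    ("S1", "reward-link", "R1"),
    ("R1", "reward", "R2"),
    ("R2", "value", 1.0),
    ("R1", "reward", "R3"),
    ("R3", "value", -0.2),
}
print(round(total_reward(wm, "S1"), 6))  # 0.8

# Without intermediate identifiers, two equal rewards collapse into one WME,
# because a set cannot hold the same triple twice.
collapsed = {("R1", "reward", 1.0), ("R1", "reward", 1.0)}
print(len(collapsed))  # 1
```

The second print shows why the intermediate identifier is needed: identical triples cannot coexist in a set, so equal rewards would silently merge.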
87. Manual storage can be arbitrarily complex and use standard dot-notation. 6.3.2 Storage Location. Semantic memory uses SQLite to facilitate efficient and standardized storage and querying of knowledge. The semantic store can be maintained in memory or on disk (per the database and path parameters). If the store is located on disk, users can use any standard SQLite programs/components to access/query its contents. However, using a disk-based semantic store is very costly (performance is discussed in greater detail in Section 6.5 on page 97), and running in memory is recommended for most runs. The lazy-commit parameter is a performance optimization. If set to on (default), disk databases will not reflect semantic memory changes until the Soar kernel shuts down. This improves performance by avoiding disk writes. The optimization parameter (see Section 6.5 on page 97) will have an effect on whether databases on disk can be opened while the Soar kernel is running. 6.4 Retrieving Semantic Knowledge. An agent retrieves knowledge from semantic memory by creating an appropriate command (we detail the types of commands below) on the command link of a state's smem structure. At the end of the output of each decision, semantic memory processes each state's smem command structure. Results, meta-data, and errors are added to the result structure of that state's smem structure. Only one type of retrieval command (which may include optional modifiers) can b
88. NDIX E A GOAL DEPENDENCY SET PRIMER dependent change for feature 1 because A was used to create 1 In Soar 7 some features are insensitive to dependent changes These features are often referred to as persistent WMEs because unlike i supported WMEs they remain in memory until explicitly removed There are two different types of this stronger persistence o support and c support Any feature created by the action of an operator receives operator support An o supported feature remains in memory until explicitly rejected or until the superstructure to which it is attached is removed Removal is architecturally independent of the WME s instantiating conditions Context support affects the persistence of an operator itself rather than its effects Once a unique operator has been chosen by the decision procedure the choice persists until explicitly re decided via a reconsider preference C support ensures that the WME for a selected operator remains available even if the production that proposed the operator is no longer instantiated Soar 8 eliminates c support so that operators now persist only as long as they receive instantiation support This change was integral to the overall solution Soar 8 provides but is distinct from the GDS The GDS provides a solution to the first problem When A changes the persistent WME 1 may be no longer consistent with its context e g A The specific solution is inspired by the chunkin
89. Preferences are created by production firings and express the relative or absolute merits for selecting an operator for a state. When preferences express an absolute rating, they are identifier-attribute-value-preference quadruples; when preferences express relative ratings, they are identifier-attribute-value-preference-value quintuples. For example, (S1 ^operator O3 +) is a preference that asserts that operator O3 is an acceptable operator for state S1, while (S1 ^operator O3 > O4) is a preference that asserts that operator O3 is a better choice for the operator of state S1 than operator O4. The semantics of preferences and how they are processed were described in Section 2.4, which also described each of the eleven different types of preferences. Multiple production instantiations may create identical preferences. Unlike working memory, preference memory is not a set: duplicate preferences are allowed in preference memory. 3.3 Production Memory. Production memory contains productions, which can be loaded in by a user (typed in while Soar is running or sourced from a file) or generated by chunking while Soar is running. Productions (both user-defined productions and chunks) may be examined using the print command, described in Section 8.2 on page 129. Each production has three required components: a name, a set of conditions (also called the left-hand side, or LHS), and a set of actions (also called the right-hand side, or RHS). There
90. R INTERFACE Print the Soar stack which includes states and operators print stack Print the named production in its RETE form print if named production Print the names of all user productions currently loaded print u Default print vs tree print print si depth 2 S1 io I1 reward link R1 superstate nil type state I1 input link I2 output link I3 print s1 depth 2 tree S1 io I1 I1 input link I2 I1 output link 13 S1 reward link R1 S1 superstate nil S1 type state See Also default wme depth wma production find Find productions by condition or action patterns Synopsis production find lrs n c pattern 8 2 EXAMINING MEMORY 133 Options c chunks Look only for chunks that match the pattern 1 lhs Match pattern only against the conditions left hand side of productions default n nochunks Disregard chunks when looking for the pattern r rhs Match pattern against the actions right hand side of produc tions s show bindings Show the bindings associated with a wildcard pattern pattern Any pattern that can appear in productions Description The production find command is used to find productions in production memory that include conditions or actions that match a given pattern The pattern given specifies one or more condition elements on the left hand side of productions or negated conditions or one o
91. RFORMANCE 105 This WME is created whenever an episode is successfully retrieved from a cue based retrieval command The WME value is an integer indicating the time of the retrieved episode e present id This WME is created whenever an episode is successfully retrieved from a cue based retrieval command The WME value is an integer indicating the current time such as to provide a sense of now in episodic memory terms By comparing this value to the memory id value the agent can gain a sense of the relative time that has passed since the retrieved episode was recorded e graph match This WME is created whenever an episode is successfully retrieved from a cue based retrieval command and the graph match parameter was on The value is an integer with value 1 if graph matching was executed successfully and 0 otherwise e mapping lt mapping root gt This WME is created whenever an episode is successfully retrieved from a cue based retrieval command the graph match parameter was on and structural match was successful on the retrieved episode This WME provides a mapping between identifiers in the cue and in the retrieved episode For each identifier in the cue there is a node WME as an augmentation to the mapping identifier The node has a cue augmentation whose value is an identifier in the cue and a retrieved augmentation whose value is an identifier in the retrieved episode In a graph match it is possible to have multiple iden
92. S previously If not elaborate_gds is called recursively to find the context dependencies for the local contributing WME c 3 When WME changes occur each goal state must be checked to determine if the WME appeared on that goal s GDS Because WME changes occur in nearly every Soar elaboration cycle we chose to extend the WME data structure to avoid this scanning Figure E 4 illustrates the relationship Each GDS structure consists of a pointer to its goal and a pointer to a linked list of WMEs The gds_next and gds_prev pointers on the WME structure define the GDS WMEs for a particular GDS and the GDS pointer provides a link back from each GDS WME to the GDS data structure When a WME is removed the GDS pointer can be checked to determine immediately if the goal should be removed No scanning is necessary Other implementation issues e Allocating memory for the GDS The GDS memory is created for each goal when the goal is created The GDS is deallocated when the goal is removed A NIL WME pointer for the GDS indicates a goal has no WMEs in its GDS e Updating a WME GDS pointer A WME should appear in only the GDS of the highest goal for which it is dependent If a WME is determined to already be in a GDS lower than the current goal its GDS pointer is updated to the higher goal it is removed from the gds_WME DLL of the lower goal and added to the higher one If there are no other WMEs on the gds_ WME DLL of the lower goal its WME
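The pointer arrangement described above (a GDS holding a list of WMEs, and each WME holding a back-pointer to its GDS plus gds_next and gds_prev links) can be sketched as a doubly linked list. This Python model is illustrative only and is not the kernel's data structure: the class and function names are hypothetical, and None stands in for the NIL pointer.

```python
class WME:
    def __init__(self, identifier, attribute, value):
        self.triple = (identifier, attribute, value)
        self.gds = None       # back-pointer to the owning GDS, if any
        self.gds_next = None  # next WME in that GDS's list
        self.gds_prev = None  # previous WME in that GDS's list


class GDS:
    def __init__(self, goal):
        self.goal = goal
        self.wmes = None  # head of the WME list; None means the GDS is empty


def gds_remove(wme):
    """Unlink a WME from whatever GDS it is currently on."""
    gds = wme.gds
    if gds is None:
        return
    if wme.gds_prev is not None:
        wme.gds_prev.gds_next = wme.gds_next
    else:
        gds.wmes = wme.gds_next
    if wme.gds_next is not None:
        wme.gds_next.gds_prev = wme.gds_prev
    wme.gds = wme.gds_next = wme.gds_prev = None


def gds_add(gds, wme):
    """Put a WME at the head of a GDS, moving it if it was on another one."""
    gds_remove(wme)
    wme.gds = gds
    wme.gds_next = gds.wmes
    if gds.wmes is not None:
        gds.wmes.gds_prev = wme
    gds.wmes = wme


def goal_affected_by_removal(wme):
    """Constant-time check: which goal, if any, depends on this WME?"""
    return wme.gds.goal if wme.gds is not None else None


lower, higher = GDS("S3"), GDS("S2")
w = WME("B1", "color", "red")
gds_add(lower, w)
print(goal_affected_by_removal(w))  # S3
gds_add(higher, w)                  # re-home the WME on the higher goal
print(goal_affected_by_removal(w), lower.wmes)  # S2 None
```

The back-pointer is what removes the need to scan every goal's dependency set when a WME changes; the cost is one extra pointer per WME, the memory trade-off the text mentions.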
93. (S1 ^operator O2 = 0.5). So $Q(s_{t+1}, \mathrm{O2}) = 0.5$. 4. O2 is selected, so $Q(s_{t+1}, a_{t+1}) = Q(s_{t+1}, \mathrm{O2}) = 0.5$. Therefore, $\delta = \alpha\,[r_{t+1} + \gamma Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t)] = 0.3 \times [1.0 + 0.9 \times 0.5 - 1.3] = 0.045$. Since rl 1 and rl 2 both contributed to the Q-value of O1, $\delta$ is evenly divided amongst them, resulting in updated values of rl 1: (<s> ^operator <o> = 2.3225) and rl 2: (<s> ^operator <o> = -0.9775). 5. rl 3 will be updated when the next RL operator is selected. 5.3.1 Gaps in Rule Coverage. Call an operator with numeric-indifferent preferences an RL operator. The previous description had assumed that RL operators were selected in both decision cycles $t$ and $t+1$. If the operator selected in $t+1$ is not an RL operator, then $Q(s_{t+1}, a_{t+1})$ would not be defined, and an update for the RL operator selected at time $t$ will be undefined. We will call a sequence of one or more decision cycles in which RL operators are not selected, between two decision cycles in which RL operators are selected, a gap. Conceptually, it is desirable to use the temporal difference information from the RL operator after the gap to update the Q-value of the RL operator before the gap. There are no intermediate storage locations for these updates. Requiring that RL rules support operators at every decision can be difficult for agent programmers, particularly for operators that do not represent steps in a task but instead
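The arithmetic in this worked example can be checked with a short Python sketch. It assumes a learning rate of 0.3, a discount of 0.9, a reward of 1.0, a next Q-value of 0.5, and a current Q-value of 1.3, as read from the numbers in the excerpt. The prior rule values 2.3 and -1.0 are inferred from the stated results (they sum to 1.3 and yield 2.3225 and -0.9775), and the names td_delta, distribute, rl_1, and rl_2 are hypothetical.

```python
def td_delta(alpha, gamma, reward, q_next, q_current):
    """Temporal-difference update: alpha * (r + gamma * Q(s', a') - Q(s, a))."""
    return alpha * (reward + gamma * q_next - q_current)


def distribute(rule_values, delta):
    """Split delta evenly across every rule that contributed to the Q-value."""
    share = delta / len(rule_values)
    return {name: value + share for name, value in rule_values.items()}


# Inferred prior values of the two contributing rules; their sum is Q(s_t, a_t).
rules = {"rl_1": 2.3, "rl_2": -1.0}
q_current = sum(rules.values())  # 1.3

delta = td_delta(alpha=0.3, gamma=0.9, reward=1.0, q_next=0.5, q_current=q_current)
updated = distribute(rules, delta)

print(round(delta, 6))                                    # 0.045
print({name: round(v, 6) for name, v in updated.items()})  # rl_1 -> 2.3225, rl_2 -> -0.9775
```

After the update the two rule values sum to 1.345, which is the old Q-value of 1.3 plus the full delta of 0.045.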
94. The Soar User s Manual Version 9 3 2 John E Laird and Clare Bates Congdon User interface sections by Karen J Coulter Soar 9 Modules by Nate Derbinsky and Joseph Xu Computer Science and Engineering Department University of Michigan Draft of April 9 2012 Errors may be reported to John E Laird laird umich edu Copyright 1998 2012 The Regents of the University of Michigan Development of earlier versions of this manual were supported under contract N00014 92 K 2015 from the Advanced Systems Technology Office of the Advanced Research Projects Agency and the Naval Research Laboratory and contract N66001 95 C 6013 from the Ad vanced Systems Technology Office of the Advanced Research Projects Agency and the Naval Command and Ocean Surveillance Center RDT amp E division Contents Contents 1 Introduction LT Usinge ths Manusl c ce ew oo ee Mao Yoe ead a ee Oe eee eS 1 2 Conta ting the Soar Group lt e ck ke Be eR ee E EEG BEA 1 3 A Note on Different Platforms and Operating Systems 2 The Soar Architecture Al An Overview ol OU ocea Ce A EERE EE Re PK Oe E i eS 2 1 1 Problem Solving Functions in Soar 2 024 2 1 2 An Example Task The Blocks World 2 1 3 Representation of States Operators and Goals 2 1 4 Proposing candidate operators o oo o e 2 1 5 Comparing candidate operators Preferences 2 1 6 Selecting a single operator 20202
95. University of Michigan 1998 3This report will use state not goal At the kernel level states are still called goals and goal is often still used to refer to states As a result a confusion in terminology results with Goal Dependency Set a specific example even though goals have not been an explicit behavior level Soar construct since 221 222 APPENDIX E A GOAL DEPENDENCY SET PRIMER makes the resulting WME persistent it will remain in memory until explicitly removed or until its local state is removed regardless of whether it continues to be justified Persistent WMEs are pervasive in Soar because operators are the main unit of problem solving Persistence is necessary for taking any non monotonic step in a problem space However persistent WMEs also are dependent on WMEs in the superstate context The problem in Soar 7 especially when trying to create large scale systems like TacAir Soar is that the knowledge developer must always think about which dependencies can be ignored and which need to result in a reconsideration of the persistent WME For example imagine an exploration robot that makes a persistent decision to travel to some distant destination based in part on its power reserves Now suppose that the agent notices that its power reserves have failed If this change is not communicated to the state where the travel decision was made the agent will continue to act as if its full
96. about how to select and apply operators to transform the states of the problem and a means of recognizing that the goal has been achieved 2 1 1 Problem Solving Functions in Soar All of Soar s long term knowledge is organized around the functions of operator selection and operator application which are organized into four distinct types of knowledge Knowledge to select an operator 1 Operator Proposal Knowledge that an operator is appropriate for the current situation 2 Operator Comparison Knowledge to compare candidate operators 3 Operator Selection Knowledge to select a single operator based on the compar isons Knowledge to apply an operator 4 Operator Application Knowledge of how a specific operator modifies the state In addition there is a fifth type of knowledge in Soar that is indirectly connected to both operator selection and operator application 5 Knowledge of monotonic inferences that can be made about the state state elab oration State elaborations indirectly affect operator selection and application by creating new de scriptions of the current situation that can cue the selection and application of operators These problem solving functions are the primitives for generating behavior in Soar Four of the functions require retrieving long term knowledge that is relevant to the current situa tion elaborating the state proposing candidate operators comparing the candidates and applying the operator
97. ail in Section 2.6 on page 23: (1) tie: when there is a collection of equally eligible operators competing for the value of a particular attribute; (2) conflict: when two or more objects are better than each other, and they are not dominated by a third operator; (3) constraint-failure: when there are conflicting necessity preferences; (4) no-change: when the proposal phase runs to quiescence without suggesting a new operator. The list below gives the seven augmentations that the architecture creates on the substate generated when an impasse is reached, and the values that each augmentation can contain. ^type state. ^impasse: Contains the impasse type: tie, conflict, constraint-failure, or no-change. ^choices: Either multiple (for tie and conflict impasses), constraint-failure (for constraint-failure impasses), or none (for no-change impasses). ^superstate: Contains the identifier of the state in which the impasse arose. ^attribute: For multi-choice and constraint-failure impasses, this contains operator. For no-change impasses, this contains the attribute of the last decision with a value (state or operator). ^item: For multi-choice and constraint-failure impasses, this contains all values involved in the tie, conflict, or constraint-failure. If the set of items that tie or conflict changes during the impasse, the architecture removes or adds the appropriate item augmentations wi
98. ain types of impasses. You may wish to make use of some or all of these productions, or merely use them as guides for writing your own set of productions to respond to impasses. Examples: The following is an example of a substate that is created for a tie among three operators: (S12 ^type state ^impasse tie ^choices multiple ^attribute operator ^superstate S3 ^item O9 O10 O11 ^quiescence t). The following is an example of a substate that is created for a no-change impasse to apply an operator: (S12 ^type state ^impasse no-change ^choices none ^attribute operator ^superstate S3 ^quiescence t) (S3 ^operator O2). 3.4.2 Testing for impasses in productions: Since states appear in working memory, they may also be tested for in the conditions of productions. For example, the following production tests for a constraint-failure impasse on the top-level state: sp {default*top-goal*halt*operator*failure "Halt if no operator can be selected for the top goal." :default (state <ss> ^impasse constraint-failure ^superstate <s>) (<s> ^superstate nil) --> (write (crlf) |No operator can be selected for top goal.|) (write (crlf) |Soar must halt.|) (halt) } 3.5 Soar I/O: Input and Output in Soar: Many Soar users will want their programs to interact with a real or simulated environment. For example, Soar programs could control a robot, receiving sen
99. aling, and a single learning mechanism (chunking). It was only as Soar was applied to diverse tasks in complex environments that we found these mechanisms to be insufficient, and we have recently added new long-term memories (semantic and episodic) and learning mechanisms (semantic, episodic, and reinforcement learning) to extend Soar agents with crucial new functionalities. All decisions are made through the combination of relevant knowledge at run-time. In Soar, every decision is based on the current interpretation of sensory data and any relevant knowledge retrieved from permanent memory. Decisions are never precompiled into uninterruptible sequences. 1.1 Using this Manual: We expect that novice Soar users will read the manual in the order it is presented. Chapter 2 and Chapter 3 describe Soar from different perspectives: Chapter 2 describes the Soar architecture, but avoids issues of syntax, while Chapter 3 describes the syntax of Soar, including the specific conditions and actions allowed in Soar productions. Chapter 4 describes chunking, Soar's mechanism to learn new procedural knowledge. Not all users will make use of chunking, but it is important to know that this capability exists. Chapter 5 describes reinforcement learning (RL), a mechanism by which Soar's procedural knowledge is tuned given task experience. Not all users will make use of RL, but it is important to know that this capability exist
100. at is used more often to refer to the contents of working memory, while "augmentation" is a term that is used more often to refer to the description of an object. Working memory is illustrated at an abstract level in Figure 2.3 on page 8. The attribute of an augmentation is usually a constant, such as name or type, because in a sense the attribute is just a label used to distinguish one link in working memory from another. The value of an augmentation may be either a constant, such as red, or an identifier, such as O6. When the value is an identifier, it refers to an object in working memory that may have additional substructure. In semantic net terms, if a value is a constant, then it is a terminal node with no links; if it is an identifier, it is a nonterminal node. One key concept of Soar is that working memory is a set, which means that there can never be two elements in working memory at the same time that have the same identifier-attribute-value triple (this is prevented by the architecture). However, it is possible to have multiple working memory elements that have the same identifier and attribute, but that each have different values. When this happens, we say the attribute is a multi-valued attribute, which is often shortened to be multi-attribute. An object is defined by its augmentations and not by its identifier. An identifier is simply a label or pointer to the object. On subsequent runs of the same Soar program, there may be an objec
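The set semantics described in result 100 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `WorkingMemory` and its methods are hypothetical and are not part of Soar or its API; the sketch simply models WMEs as (identifier, attribute, value) triples held in a set.

```python
class WorkingMemory:
    """Toy model (hypothetical, not Soar's API): a set of WME triples."""

    def __init__(self):
        self._wmes = set()

    def add(self, identifier, attribute, value):
        """Add a triple; return False if the identical triple already exists."""
        wme = (identifier, attribute, value)
        if wme in self._wmes:
            return False  # duplicates are impossible in a set
        self._wmes.add(wme)
        return True

    def values(self, identifier, attribute):
        """All values stored under one identifier/attribute pair, sorted."""
        return sorted(v for (i, a, v) in self._wmes
                      if i == identifier and a == attribute)

    def is_multi_valued(self, identifier, attribute):
        """True when the same identifier and attribute carry several values."""
        return len(self.values(identifier, attribute)) > 1


wm = WorkingMemory()
wm.add("B1", "color", "red")
wm.add("B1", "color", "red")   # rejected: same triple
wm.add("B1", "color", "blue")  # accepted: same attribute, different value
```

After these calls the `color` attribute of `B1` is a multi-valued attribute in the sense used above, while the repeated `red` triple was never stored twice.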
101. ates is either empty or has one member, preference semantics terminates and this set is returned. Otherwise, the remaining candidates are passed to the Indifferent Test. IndifferentTest: This operation traverses the remaining candidates and marks each candidate for which one of the following is true: the candidate has a unary indifferent preference; the candidate has a numeric indifferent preference; the candidate is binary indifferent to all of the remaining candidate operators. If some candidate is left unmarked, then the procedure signals a tie impasse and returns the complete set of candidates that passed into the IndifferentTest. Otherwise, the candidates are mutually indifferent, in which case an operator is chosen according to the method set by the indifferent-selection command, described on page 157. Appendix E: A Goal Dependency Set Primer. This document briefly describes the Goal Dependency Set (GDS), which was introduced with Soar 8. There are three sections: a brief discussion of the motivation for the GDS, a discussion of the consequences of the GDS from a behavior developer/modeler's point of view, and some details on the kernel implementation of the GDS for anyone working at the architecture level. This document is by no means complete, but introduces the GDS in Soar-specific terms. Why the GDS was needed: As a symbol system, Soar attempts to approximate the
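The IndifferentTest step quoted in result 101 can be sketched as follows. This is a minimal sketch, not the kernel's implementation: the function name, the argument layout, and the treatment of binary indifference as symmetric are assumptions made for illustration.

```python
def indifferent_test(candidates, unary, numeric, binary_pairs):
    """Sketch of the IndifferentTest (hypothetical signature).

    candidates   -- list of operator names still under consideration
    unary        -- set of operators with a unary indifferent preference
    numeric      -- set of operators with a numeric indifferent preference
    binary_pairs -- set of (a, b) pairs declared binary indifferent;
                    assumed symmetric for this sketch
    Returns ("tie", candidates) or ("mutually-indifferent", candidates).
    """
    def pair_indifferent(a, b):
        return (a, b) in binary_pairs or (b, a) in binary_pairs

    for c in candidates:
        others = [o for o in candidates if o != c]
        marked = (
            c in unary
            or c in numeric
            or all(pair_indifferent(c, o) for o in others)
        )
        if not marked:
            # One unmarked candidate is enough to signal a tie impasse.
            return ("tie", list(candidates))
    return ("mutually-indifferent", list(candidates))
```

When the result is `"mutually-indifferent"`, the choice among the returned candidates is left to whatever policy the indifferent-selection command has configured; the sketch deliberately does not model that choice.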
102. ationship between objects soar gt watch wmes add filter t both ontop 8 4 Configuring Soar s Runtime Parameters This section describes the commands that control Soar s Runtime Parameters Many of these commands provide options that simplify or restrict runtime behavior to enable easier and more localized debugging Others allow users to select alternative algorithms or methodolo gies Users can configure Soar s learning mechanism examine the backtracing information that supports chunks and justifications provide hints that could improve the efficiency of the Rete matcher limit runaway chunking and production firing choose an alternative algo rithm for determining whether a working memory element receives O support and configure options for selecting between mutually indifferent operators 8 4 CONFIGURING SOAR S RUNTIME PARAMETERS 149 The specific commands described in this section are Summary epmem Get Set episodic memory parameters and statistics explain backtraces Print information about chunk and justification back traces indifferent selection Controls indifferent preference arbitration learn Set the parameters for chunking Soar s learning mechanism max chunks Limit the number of chunks created during a decision cycle max dc time Set a wall clock time limit such that the agent will be interrupted when a single decision cycle exceeds this limit max elaborations Limit the maximum num
103. Subgoal level 2: This subgoal was created because Soar didn't know which of the three operators (O4, O5, or O6) to select in state S2. Figure 2.10: A simplified illustration of a subgoal stack. [...] if, after it is created and it is still in working memory or preference memory, its identifier becomes linked to a superstate through the creation of another result. For example, if the problem solving in a state constructs an operator for a superstate, it may wait until the operator structure is complete before creating an acceptable preference for the operator in the superstate. The acceptable preference is a result because it was created in the state and is linked to the superstate (and, through the superstate, is linked to the top-level state). The substructures of the operator then become results because the operator's identifier is now linked to the superstate. Justifications: Determination of support for results. Some results receive I-support, while others receive O-support. The type of support received by a result is determined by the function it plays in the superstate, and not the function it played in the state in which it was created. For example, a result might be created through operator application in the state that created it; however, it might only be a state elaboration in the superstate. The first function would lead to O-support, bu
104. backs (time per phase); input function time; output function time. Total CPU time is calculated from the time a decision cycle (or number of decision cycles) is initiated until stopped. Kernel time is the time spent in core Soar functions. In this case, kernel time is defined as all functions other than the execution of callbacks and the input and output functions. The total kernel timer is only stopped for these functions. The phase timers (for the kernel and callbacks) track the execution time for individual phases of the decision cycle (i.e., input phase, preference phase, working memory phase, output phase, and decision phase). Because there is overhead associated with turning these timers on and off, the actual kernel time will always be greater than the derived kernel time (i.e., the sum of all the phase kernel timers). Similarly, the total CPU time will always be greater than the derived total (the sum of the other timers) because the overhead of turning these timers on and off is included in the total CPU time. In general, the times reported by the single timers should always be greater than the corresponding derived time. Additionally, as execution time increases, the difference between these two values will also increase. For those concerned about the performance cost of the timers, all the run-time timing calculations can be compiled out of the code by defining NO_TIMING_STUFF (in soarkernel.h) before compilation. Examples: Trac
105. ber of elaboration cycles in a given phase max goal depth Limit the sub state stack depth max memory usage Set the number of bytes that when exceeded by an agent will trigger the memory usage exceeded event max nil output cycles Limit the maximum number of decision cycles exe cuted without producing output multi attributes Declare multi attributes so as to increase Rete matching efficiency numeric indifferent mode Select method for combining numeric preferences o support mode Choose experimental variations of o support predict Predict the next selected operator rl Get Set RL parameters and statistics save backtraces Save trace information to explain chunks and justifications select Force the next selected operator set stop phase Controls the phase where agents stop when running by deci sion smem Get Set semantic memory parameters and statistics timers Toggle on or off the internal timers used to profile Soar waitsnc Generate a wait state rather than a state no change impasse wma Get Set working memory activation parameters epmem Control the behavior of episodic memory 150 CHAPTER 8 THE SOAR USER INTERFACE Synopsis epmem epmem g get lt parameter gt epmem s set lt parameter gt lt value gt epmem S stats lt statistic gt epmem t timers lt timer gt epmem c close epmem v viz lt episode id gt epmem p print lt episode id gt epmem b ba
106. bstates: When the decision procedure is applied to evaluate preferences and determine the operator augmentation of the state, it is possible that the preferences are either incomplete or inconsistent. The preferences can be incomplete in that no acceptable operators are suggested, or that there are insufficient preferences to distinguish among acceptable operators. The preferences can be inconsistent if, for instance, operator A is preferred to operator B, and operator B is preferred to operator A. Since preferences are generated independently, from different production instantiations, there is no guarantee that they will be consistent. 2.6.1 Impasse Types: There are four types of impasses that can arise from the preference scheme. Tie impasse: A tie impasse arises if the preferences do not distinguish between two or more operators with acceptable preferences. If two operators both have best or worst preferences, they will tie unless additional preferences distinguish between them. Conflict impasse: A conflict impasse arises if at least two values have conflicting better or worse preferences (such as A is better than B and B is better than A) for an operator, and neither one is rejected, prohibited, or required. Constraint-failure impasse: A constraint-failure impasse arises if there is more than one required value for an operator, or if a value has both a require and a prohibit preference. These pref
107. c store initialization epmem_ncb_retrieval Episode reconstruction epmem next Determining next episode epmem_ prev Determining previous episode epmem_query Cue based query epmem_storage Encoding new episodes epmem_trigger Deciding whether new episodes should be encoded epmem_wm_phase Converting preference assertions to working memory changes Level three ncb_edge Collecting edges during reconstruction ncb_edge rit Collecting edges from relational interval tree ncb_node Collecting nodes during reconstruction ncb_node_rit Collecting nodes from relational interval tree query_dnf DNF graph construction query graph match Graph match query neg end_ep Interval search for negative cue end point ranges query_neg_end_now Interval search for negative cue end point now query_neg_end_point Interval search for negative cue end point points query_neg_start_ep Interval search for negative cue start point ranges query neg start now Interval search for negative cue start point now query neg start _point Interval search for negativecue start point points query_pos_end_ep Interval search for positive cue end point ranges query_pos_end_now Interval search for positive cue end point now query_pos_end_point Interval search for positive cue end point points query_pos_start_ep Interval search for positive cue start point ranges query _pos_start now Interval search for positive cu
108. candidate long-term identifiers on demand, and thus retrieval time is independent of cue selectivity. However, each activation update (such as after a retrieval) incurs an update cost linear in the number of augmentations. If the number of augmentations for a long-term identifier is large, this cost can dominate. Thus, the thresh parameter sets the upper bound of augmentations, after which activation is stored with the long-term identifier. This allows the user to establish a balance between the cost of updating augmentation activation and the number of long-term identifiers that must be pre-sorted during a cue-based retrieval. As long as the threshold is greater than the number of augmentations of most long-term identifiers, performance should be fine, as it will bound the effects of selectivity. The next two parameters deal with the SQLite cache, which is a memory store used to speed operations like queries by keeping in memory structures like levels of index B-trees. The first parameter, page-size, indicates the size, in bytes, of each cache page. The second parameter, cache-size, suggests to SQLite how many pages are available for the cache. Total cache size is the product of these two parameter settings. The cache memory is not pre-allocated, so short or small runs will not necessarily make use of this space. Generally speaking, a greater number of cache pages will benefit query time, as SQLite can keep necessary meta-data in memory. However, some docum
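The statement in result 108 that total cache size is the product of the page size and the number of cache pages can be worked through with a small helper. The function name and the sample values below are illustrative assumptions, not defaults taken from the manual.

```python
def total_cache_bytes(page_size_bytes, cache_pages):
    """Upper bound on SQLite cache memory: bytes per page times page count."""
    if page_size_bytes <= 0 or cache_pages <= 0:
        raise ValueError("both settings must be positive")
    return page_size_bytes * cache_pages


# Illustrative settings only: 8192-byte pages and 10000 cache pages.
budget = total_cache_bytes(8192, 10000)
budget_mib = budget / (1024 * 1024)  # the same bound expressed in MiB
```

With these sample numbers the bound is 81,920,000 bytes (about 78 MiB). Because the cache is not pre-allocated, this is a ceiling rather than the memory a short run will actually use.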
109. ckup lt file name gt Options g get Print current parameter setting Sy raset Set parameter value S stats Print statistic summary or specific statistic t timers Print timer summary or specific statistic c close Close epmem database and commit to disk if applicable v viz Print episode in graphviz format p print Print episode in user readable format b backup Creates a backup of the episodic database on disk Description The epmem command is used to change all behaviors of the episodic memory module except for watch output which is controlled by the watch epmem command Parameters Due to the large number of parameters the epmem command uses the get set lt parameter gt lt value gt convention rather than individual switches for each parameter Run ning epmem without any switches displays a summary of the parameter settings 8 4 CONFIGURING SOAR S RUNTIME PARAMETERS 151 Parameter Description Possible values Default balance Linear weight of 0 1 1 match cardinality 1 vs working memory activation 0 used in calcu lating match score cache size Number of memory 1 2 10000 pages used in the SQLite cache database Database storage file memory memory method exclusions Toggle the exclu any string epmem smem sion of an attribute string constant force Forces episode en ignore
110. command (if one exists) is processed. 7.3.1 Cue-Based Retrievals: Cue-based retrieval commands are used to search for an episode in the store that best matches an agent-supplied cue, while adhering to optional modifiers. A cue is composed of WMEs that partially describe a top-state of working memory in the retrieved episode. All cue-based retrieval requests must contain a single ^query cue and, optionally, a single ^neg-query cue: <s> ^epmem.command.query <required-cue> and <s> ^epmem.command.neg-query <optional-negative-cue>. A ^query cue describes structures desired in the retrieved episode, whereas a ^neg-query cue describes non-desired structures. For example, the following Soar production creates a ^query cue consisting of a particular state name and a copy of a current value on the input-link structure: sp {epmem*sample*query (state <s> ^epmem.command <ec> ^io.input-link.foo <bar>) --> (<ec> ^query <q>) (<q> ^name my-state-name ^io.input-link.foo <bar>) } As detailed below, multiple prior episodes may equally match the structure and contents of an agent's cue. Nuxoll has produced initial evidence that in some tasks, retrieval quality improves when using activation of cue WMEs as a form of feature weighting. Thus, episodic memory supports integration with working memory activation (see Section 8.4 on page 180). For a theoretical discussion of the Soar implemen
111. cription This command prints formatted help for the given command name Issue alone to see what topics have help available 8 1 BASIC COMMANDS FOR RUNNING SOAR 115 init soar Empties working memory and resets run time statistics Synopsis init soar Default Aliases init init soar is init soar Description The init soar command initializes Soar It removes all elements from working memory wiping out the goal stack and resets all runtime statistics The firing counts for all productions are reset to zero The init soar command allows a Soar program that has been halted to be reset and start its execution from the beginning init soar does not remove any productions from production memory to do this use the excise command Note however that all justifications will be removed because they will no longer be supported See Also excise run Begin Soar s execution cycle Synopsis run dlelolp g luln s count i elpldlo Default Aliases d run d 1 e run e 1 step run 1 116 CHAPTER 8 THE SOAR USER INTERFACE Options d decision Run Soar for count decision cycles e elaboration Run Soar for count elaboration cycles 0 output Run Soar until the nth time output is generated by the agent Limited by the value of max nil output cycles p phase Run Soar by phases A phase is either an input phase proposal pha
112. ction sp gt write lt x gt crlf lt y gt 3 3 6 9 Mathematical functions The expressions described in this section can be nested to any depth For all of the functions in this section missing or non numeric arguments result in an error 3 3 PRODUCTION MEMORY 61 These symbols provide prefix notation mathematical functions These symbols work similarly to C functions They will take either integer or real number arguments The first three functions return an integer when all arguments are integers and otherwise return a real number and the last two functions always return a real number The symbol is also a unary function which given a single argument returns the product of the argument and 1 The symbol is also a unary function which given a single argument returns the reciprocal of the argument 1 x sp gt lt s gt sum lt x gt lt y gt product sum lt v gt lt w gt lt x gt lt y gt pig sum lt x gt lt y gt lt z gt 402 negative x lt x gt div mod These symbols provide prefix notation binary mathematical functions they each take two arguments These symbols work similarly to C functions They will take only integer arguments using reals results in an error and return an integer div takes two integers and returns their integer quotient mod returns their remainder sp gt lt s gt quotient div lt x gt lt y gt
113. ction. This method is more general than using the gp command or rule templates, and is useful if the environment state consists of arbitrarily complex relational structures that cannot be enumerated. Chapter 6: Semantic Memory. Soar's semantic memory is a repository for long-term declarative knowledge, supplementing what is contained in short-term working memory (and production memory). Episodic memory, which contains memories of the agent's experiences, is described in Chapter 7. The knowledge encoded in episodic memory is organized temporally, and specific information is embedded within the context of when it was experienced, whereas knowledge in semantic memory is independent of any specific context, representing more general facts about the world. This chapter is organized as follows: semantic memory structures in working memory (6.1); representation of knowledge in semantic memory (6.2); storing semantic knowledge (6.3); retrieving semantic knowledge (6.4); and a discussion of performance (6.5). The detailed behavior of semantic memory is determined by numerous parameters that can be controlled and configured via the smem command. Please refer to the documentation for that command in Section 8.4 on page 174. 6.1 Working Memory Structure: Upon creation of a new state in working memory (see Section 2.6.1 on page 24; Section 3.4 on page 66), the architecture creates the following augmentations to facilitate agent interaction with semantic
114. ctions loaded is high. gds-print is useful for examining the goal dependency set when subgoals seem to be disappearing unexpectedly. default-wme-depth is related to the print command. internal-symbols is not often used, but is helpful when debugging Soar extensions or trying to locate memory leaks. default-wme-depth: Set the level of detail used to print WMEs. Synopsis: default-wme-depth depth. Default Aliases: set-default-depth (default-wme-depth). Options: depth: A non-negative integer. Description: The default-wme-depth command reflects the default depth used when working memory elements are printed using the print command. The default value is 1. When the command is issued with no arguments, default-wme-depth returns the current value of the default depth. When followed by an integer value, default-wme-depth sets the default depth to the specified value. This default depth can be overridden on any particular call to the print command by explicitly using the --depth flag, e.g. print --depth 10 args. By default, the print command prints objects in working memory, not just the individual working memory element. To limit the output to individual working memory elements, the --internal flag must also be specified in the print command. Thus, when the print depth is 0, by default Soar prints the entire object, which is the same behavior as when the print depth is 1. But if
115. cture is removed from working memory. Possible approaches for resolving specific types of impasses are listed below. Tie impasse: A tie impasse can be resolved by productions that create preferences that prefer one option (better, best, require), eliminate alternatives (worse, worst, reject, prohibit), or make all of the objects indifferent (indifferent). Conflict impasse: A conflict impasse can be resolved by productions that create preferences to require one option (require), or eliminate the alternatives (reject, prohibit). Constraint-failure impasse: A constraint-failure impasse cannot be resolved by additional preferences, but may be prevented by changing productions so that they create fewer require or prohibit preferences. State no-change impasse: A state no-change impasse can be resolved by productions that create acceptable or require preferences for operators. Operator no-change impasse: An operator no-change impasse can be resolved by productions that apply the operator, changing the state so the operator proposal no longer matches or other operators are proposed and preferred. Eliminating Impasses: An impasse is resolved when results are created that allow progress to be made in the state where the impasse arose. In Soar, an impasse can be eliminated (but not resolved) when a higher-level impasse is resolved, eliminated, or regenerated. In these cases, the impasse becomes irrelevant because higher-level
116. d deleted from working memory. Figure 2.8: A detailed illustration of Soar's decision cycle. [...] out of date. During the processing of these phases, it is possible that the preferences that resulted in the selection of the current operator could change. Whenever operator preferences change, the preferences are re-evaluated, and if a different operator selection would be made, then the current operator augmentation of the state is immediately removed. However, a new operator is not selected until the next decision phase, when all knowledge has had a chance to be retrieved.
Soar
    while (HALT not true) Cycle;
Cycle
    InputPhase;
    ProposalPhase;
    DecisionPhase;
    ApplicationPhase;
    OutputPhase;
ProposalPhase
    while (some I-supported productions are waiting to fire or retract)
        FireNewlyMatchedProductions;
        RetractNewlyUnmatchedProductions;
DecisionPhase
    for (each state in the stack, starting with the top-level state)
    until (a new decision is reached)
        EvaluateOperatorPreferences; /* for the state being considered */
        if (one operator preferred after preference evaluation)
            SelectNewOperator;
        else /* could be no operator available or */
            CreateNewSubstate; /* unable to decide between more than one */
ApplicationPhase
    while (some productions are waiting to fire or retract)
        FireNewlyMatchedProductions;
        RetractNewlyUnmatchedProductions;
Figure 2.9: A simplified version of the Soar algorithm. 2.6 Impasses and Su
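The outer loop of the simplified algorithm in Figure 2.9 (result 116) can be sketched in Python. Everything named here is hypothetical scaffolding: the phase callbacks, the `is_halted` predicate, and the `max_cycles` safety bound are assumptions for illustration, and the sketch models only the phase ordering and the run-to-quiescence loops, not preference evaluation.

```python
PHASE_ORDER = ("input", "proposal", "decision", "application", "output")


def run_cycles(phases, is_halted, max_cycles):
    """Run decision cycles until halted; return the ordered phase trace.

    phases     -- dict mapping each name in PHASE_ORDER to a callable
    is_halted  -- callable checked once per cycle, as in 'while HALT not true'
    max_cycles -- safety bound added for this sketch (not in the figure)
    """
    trace = []
    cycles = 0
    while not is_halted() and cycles < max_cycles:
        for name in PHASE_ORDER:
            phases[name]()
            trace.append(name)
        cycles += 1
    return trace


def run_to_quiescence(fire_and_retract):
    """Repeat one fire/retract wave until nothing is waiting.

    fire_and_retract -- callable that performs one wave and returns True
                        if any production fired or retracted.
    Returns the number of waves that did work.
    """
    waves = 0
    while fire_and_retract():
        waves += 1
    return waves
```

Note that the halt condition is tested between cycles, so a halt raised during one phase still lets the remaining phases of that cycle run in this sketch; whether the real architecture stops sooner is not something the figure specifies.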
117. d production name print the production named production name How to print the productions f fulll When printing productions print the whole production This is the default when printing a named production F filename also prints the name of the file that contains the production i internal items should be printed in their internal form For productions this means leaving conditions in their reordered rete net form n name When printing productions print only the name and not the whole production This is the default when printing any category of produc tions as opposed to a named production Printing items in working memory 8 2 EXAMINING MEMORY 131 d depth n This option overrides the default printing depth see the default wme depth command for more detail e exact Print only the wmes that match the pattern i internal items should be printed in their internal form For working memory this means printing the individual elements with their timetags and activation rather than the objects t tree wmes should be printed in in a tree form one wme per line v varprint Print identifiers enclosed in angle brackets identifier print the object identifier identifier must be a valid Soar symbol such as S1 pattern print the object whose working memory elements matching the given pattern See Description for more information o
118. d On: 17 Jul 1996 16:35:14. Soar Version 7 7. Description: A new, simpler implementation of the blocks world with just three blocks being moved at random. Notes: CBC 6 27 Converted to Tcl syntax; CBC 6 27 Added extensive comments. Create the initial state with blocks A, B, and C on the table. This is the first production that will fire. Soar creates the initial state as an architectural function (in the zeroth decision cycle), which will match against this production. This production does a lot of work because it is creating (preferences for) all the structure for the initial state: 1. The state has a problem space named blocks. The problem space limits the operators that will be selected for a task. In this simple problem, it isn't really necessary (there is only one operator), but it's a programming convention that you should get used to. The state has four things: three blocks and the table. The state has three ontop relations. Each of the things has substructure: their type and their names. Note that the fourth thing is actually a table. Each of the ontop relations has substructure: the top thing and the bottom thing. Finally, the production writes a message for the user
119. decision cycle (default: end of Output phase) during which episodic memory stores episodes and processes commands. The value of the trigger parameter indicates to the architecture the event that concludes an episode: adding a new augmentation to the output-link (default) or each decision cycle. For debugging purposes, the force parameter allows the user to manually request that an episode be recorded (or not) during the current decision cycle. Behavior is as follows: The value of the force parameter is initialized to off every decision cycle. During the phase of episodic storage, episodic memory tests the value of the force parameter; if it has a value other than off, episodic memory follows the forced policy irrespective of the value of the trigger parameter. 7.2.1 Episode Contents: When episodic memory stores a new episode, it captures the entire top-state of working memory. There are currently two exceptions to this policy: (1) Episodic memory only supports WMEs whose attribute is a constant. Behavior is currently undefined when attempting to store a WME that has an attribute that is an identifier. (2) The exclusions parameter allows the user to specify a set of attributes for which Soar will not store WMEs. The storage process currently walks the top-state of working memory in a breadth-first manner, and any WME that is not reachable other than via an excluded WME will not be stored. By default, episodic memory excludes the epmem and
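The breadth-first walk with exclusions described in result 119 can be sketched as below. This is an assumption-laden illustration, not Soar's storage code: WMEs are modeled as (identifier, attribute, value) tuples, a value is treated as an identifier only if some WME uses it as its identifier, and the function name is hypothetical.

```python
from collections import deque


def reachable_wmes(wmes, top_state, excluded_attributes):
    """Return the WMEs a breadth-first walk from top_state would store.

    WMEs whose attribute is excluded are skipped, so anything reachable
    only through them is never visited.
    """
    by_id = {}
    for wme in wmes:
        by_id.setdefault(wme[0], []).append(wme)

    stored = []
    seen = {top_state}
    queue = deque([top_state])
    while queue:
        ident = queue.popleft()
        for (i, attr, value) in by_id.get(ident, []):
            if attr in excluded_attributes:
                continue  # do not store, and do not follow the link
            stored.append((i, attr, value))
            if value in by_id and value not in seen:
                seen.add(value)
                queue.append(value)
    return stored


sample = [
    ("S1", "epmem", "E1"),
    ("E1", "command", "C1"),
    ("S1", "io", "I1"),
    ("S1", "name", "blocks"),
    ("I1", "input-link", "I2"),
    ("I2", "foo", 5),
]
stored = reachable_wmes(sample, "S1", {"epmem", "smem"})
```

In the sample, the `epmem` link and the `command` WME beneath it are both left out, because the only path to `E1` runs through an excluded attribute.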
120. difiers to a single CB retrieval. If no episode satisfies the cue(s) and optional modifiers, an error is returned: <s> ^epmem.result.failure <query> <optional-neg-query>. If an episode is returned, there is additional meta-data supplied (7.3.4). 7.3.2 Absolute Non-Cue-Based Retrieval: At time of storage, each episode is attributed a unique time. This is the current value of the time statistic and is provided as the memory-id meta-data item of retrieved episodes (7.3.4). An absolute non-cue-based retrieval is one that requests an episode by time. An agent issues an absolute non-cue-based retrieval by creating a WME on the command structure with attribute retrieve and value equal to the desired time: <s> ^epmem.command.retrieve <time>. Supplying an invalid value for the retrieve command will result in an error. The time of the first episode in an episodic store will have value 1, and each subsequent episode's time will increase by 1. Thus, the desired time may be the mathematical result of operations performed on a known episode's time. The current episodic memory implementation does not implement any episodic store dynamics, such as forgetting. Thus, any integer time greater than 0 and less than the current value of the time statistic will be valid. However, if forgetting is implemented in future versions, no such guarantee will be made. 7.3.3 Relative Non-Cue-Based Retrieval: Episodic memory supports the ability for an
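The validity rule for absolute retrieval times stated in result 120 (an integer greater than 0 and less than the current value of the time statistic) can be written as a small check. The function name and arguments are hypothetical; the sketch only restates the rule and assumes no forgetting, as the excerpt does.

```python
def is_valid_retrieve_time(requested, current_time):
    """True if `requested` names a stored episode under the stated rule.

    requested    -- candidate episode time supplied by the agent
    current_time -- current value of the time statistic
    """
    # bool is a subclass of int in Python, so rule it out explicitly.
    if isinstance(requested, bool) or not isinstance(requested, int):
        return False
    return 0 < requested < current_time


# Times can be derived arithmetically from a known episode's time.
known_episode = 7
previous_ok = is_valid_retrieve_time(known_episode - 1, current_time=10)
too_late = is_valid_retrieve_time(known_episode + 3, current_time=10)
```

Here `previous_ok` is true and `too_late` is false, since a requested time equal to the current value of the time statistic falls outside the stated range.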
121. do any one of the following not all possibilities listed watch level 1 wmes watch 1 1 w watch decisions wmes watch d wmes watch w decisions watch w d To turn on printing of decisions productions and wmes and turns phases off do any one of the following not all possibilities listed 8 3 CONFIGURING TRACE INFORMATION AND DEBUGGING 147 watch level 4 phases remove watch l 4 p remove watch l 4 p 0 watch d P w p remove To watch the firing and retraction of decisions and only user productions do any one of the following not all possibilities listed watch 1l 1 u watch d u To watch decisions phases and all productions except user productions and justifications and to see full wmes do any one of the following not all possibilities listed watch decisions phases productions user remove justifications remove fullwn watch d p P f u remove j 0 watch f 1 3 u 0 j 0 See Also epmem pwatch print run watch wmes watch wmes Print information about wmes matching a certain pattern as they are added and removed Synopsis watch wmes alr t type pattern watch wmes 1 R t type Options a add filter Add a filter to print wmes that meet the type and pattern cri teria r remove filter Delete filters for printing wmes that match the type and pattern criteria 1l list filter List the filters of this type currently in use Does not use
122. e production find lt s gt name lt j gt volume 3 production find rhs lt j gt lt volume gt See Also sp 8 3 Configuring Trace Information and Debugging This section describes the commands used primarily for debugging or to configure the trace output printed by Soar as it runs Users may specify the content of the runtime trace output ask that they be alerted when specific productions fire and retract or request details on Soar s performance The specific commands described in this section are Summary chunk name format Specify format of the name to use for new chunks firing counts Print the number of times productions have fired pwatch Trace firings and retractions of specific productions stats Print information on Soar s runtime statistics verbose Control detailed information printed as Soar runs warnings Toggle whether or not warnings are printed watch Control the information printed as Soar runs watch wmes Print information about wmes that match a certain pattern as they are added and removed Of these commands watch is the most often used and the most complex pwatch is related to watch but applies only to specific named productions firing counts and stats are useful for understanding how much work Soar is doing chunk name format is less frequently used but allows for detailed control of Soar s chunk naming chunk name format Specify format of the name to
123. e
srand: Seed the random number generator.
time: Uses a default system clock timer to record the wall time required while executing a command.
unalias: Remove an existing alias.
version: Returns version number of Soar kernel.
alias
Define a new alias, or command, using existing commands and arguments.
Synopsis: alias name cmd args
Default Aliases: a (alias)
Description: This command defines new aliases by creating Soar procedures with the given name. The new procedure can then take an arbitrary number of arguments, which are appended to the given definition, and then that entire string is executed as a command. The definition must be a single command; multiple commands are not allowed. The alias procedure checks to see if the name already exists, and does not destroy existing procedures or aliases by the same name. Existing aliases can be removed by using the unalias command. With no arguments, alias returns the list of defined aliases. With only the name given, alias returns the current definition.
Examples: The alias wmes is defined as:
alias wmes print -i
If the user executes a command such as:
wmes superstate nil
it is as if the user had typed this command:
print -i superstate nil
To check what a specific alias is defined as, you would type:
alias wmes
See Also: unalias
allocate
Allocate additional 32 kilobyte blocks of memory for a specif
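The alias behaviour described in the excerpt above (arguments are appended to the stored definition, and an existing name is never overwritten) can be mimicked with a small Python sketch. This is a hypothetical model, not the Soar implementation: the table `aliases` and the functions `define_alias` and `expand` are names invented for illustration, and the `-i` flag in the usage comment assumes the dash was lost from the excerpt's "print i".

```python
aliases = {}  # hypothetical alias table: name -> list of words in the definition


def define_alias(name, definition):
    """Store a single-command definition; refuse to clobber an existing name."""
    if name in aliases:
        return False  # existing aliases are not destroyed
    aliases[name] = definition.split()
    return True


def expand(command_line):
    """Append the user's arguments to the alias definition, if one exists."""
    words = command_line.split()
    if words and words[0] in aliases:
        return " ".join(aliases[words[0]] + words[1:])
    return command_line


# Usage: define_alias("wmes", "print -i"), then
# expand("wmes superstate nil") gives "print -i superstate nil".
```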
124. e start point now query_pos_start_point Interval search for positivecue start point points Visualization When debugging agents using episodic memory it is often useful to inspect the contents of individual episodes Running epmem viz lt episode id gt will output the contents of an episode in graphviz format For more information on this format and vi sualization tools see http www graphviz org The epmem print option has the same syntax but outputs text that is similar to using the print command to get the substructure of an identifier in working memory which is possibly more useful for interactive debugging 8 4 CONFIGURING SOAR S RUNTIME PARAMETERS 155 See Also watch wma explain backtraces Print information about chunk and justification backtraces Synopsis explain backtraces options prod_name Default Aliases eb explain backtraces Options prod_name List all conditions and grounds for the chunk or justification c condition Explain why condition number n is in the chunk or justification f full Print the full backtrace for the named production Description This command provides some interpretation of backtraces generated during chunking The two most useful variants are explain backtraces prodname explain backtraces c n prodname The first variant prints a numbered list of all the conditions for the named chunk or jus tification and the ground w
125. e GDS in agent design The GDS places some design time constraints on operator implementation These constraints are e Operator actions that are used to remember a previous state situation should be as serted in the top state e All operator elaborations should be i supported e Any operator with local actions should be designed to be re entrant This section describes these issues 226 APPENDIX E A GOAL DEPENDENCY SET PRIMER Soar says any operator effect is o supported regardless of whether that assertion is entailed by the current situation or whether it reflects an assumption about it The GDS adds additional needed constraint Because any context dependencies for subgoal o supported assertions will be added to the GDS the developer must decide if an o supported element should be represented in a substate or the top state This decision is straightforward if the functional role of the persistent element is considered Four important capabilities that require persistence are 1 Reasoning hypothetically Some assertions may need to reflect hypothetical states Such assertions are assumptions because a hypothetical inference cannot always be grounded in the current context In other problem solvers with truth main tenance only assumptions are persistent 2 Reasoning non monotonically Sometimes the result of an inference changes one of the assertions on which the inference is dependent As an example consider the task
126. e collectively called an object in working memory. The individual working memory elements that make up an object are often called augmentations, because they augment the object. A template for an object in working memory is:
(identifier ^attribute-1 value-1 ^attribute-2 value-2 ^attribute-3 value-3 ... ^attribute-n value-n)
For example, if you run Soar with the example blocks-world program described in Appendix A, after one elaboration cycle you can look at the top-level state by using the print command:
soar> print s1
(S1 ^io I1 ^ontop O2 ^ontop O3 ^ontop O1 ^problem-space blocks ^superstate nil ^thing B3 ^thing T1 ^thing B1 ^thing B2 ^type state)
The attributes of an object are printed in alphabetical order to make it easier to find a specific attribute. Working memory is a set, so that at any time there are never duplicate versions of working memory elements. However, it is possible for several working memory elements to share the same identifier and attribute but have different values. Such attributes are called multi-valued attributes or multi-attributes. For example, state S1 above has two attributes that are multi-valued: thing and ontop.
3.1.3 Timetags
When a working memory element is created, Soar assigns it a unique integer timetag. The timetag is a part of the working memory element, and therefore WMEs are actually quadruples rather than triples. However
127. e is success and the value is the value of the store command above:
<s> ^smem.result.success <identifier>
Any short-term identifiers that compose the stored WMEs will be converted to long-term identifiers. If a long-term identifier is the value of a store command, the stored WMEs replace those associated with the LTI in semantic memory. It should be noted that between issuing store commands, it is possible that the augmentations of a long-term identifier in working memory are inconsistent with those in semantic memory.
6.3.1 User-Initiated Storage
Semantic memory provides agent designers the ability to store semantic knowledge via the add switch of the smem command (see Section 8.4 on page 174). The format of the command is nearly identical to the working memory manipulation components of the RHS of a production (i.e. no RHS functions; see Section 3.3.6 on page 56). For instance:
smem --add {
(<arithmetic> ^add10-facts <a01> <a02> <a03>)
(<a01> ^digit1 1 ^digit-10 11)
(<a02> ^digit1 2 ^digit-10 12)
(<a03> ^digit1 3 ^digit-10 13)
}
Unlike agent storage, declarative storage is automatically recursive. Thus this command instance will add a new long-term identifier (represented by the temporary arithmetic variable) with three augmentations. The value of each augmentation will each become an LTI with two constant attribute-value pairs
128. e issued per state in a single decision cycle. Malformed commands, including attempts at multiple retrieval types, will result in an error:
<s> ^smem.result.bad-cmd <smem-c>
where the smem-c variable refers to the command structure of the state. After a command has been processed, semantic memory will ignore it until some aspect of the command structure changes (via addition or removal of WMEs). When this occurs, the result structure is cleared and the new command (if one exists) is processed.
6.4.1 Non-Cue-Based Retrievals
A non-cue-based retrieval is a request by the agent to reflect in working memory the current augmentations of a long-term identifier in semantic memory. The command WME has a retrieve attribute and a long-term identifier value:
<s> ^smem.command.retrieve <lti>
If the value of the command is not a long-term identifier, an error will result:
<s> ^smem.result.failure <lti>
Otherwise, two new WMEs will be placed on the result structure:
<s> ^smem.result.success <lti>
<s> ^smem.result.retrieved <lti>
All augmentations of the long-term identifier in semantic memory will be created as new WMEs in working memory.
6.4.2 Cue-Based Retrievals
A cue-based retrieval performs a search for a long-term identifier in semantic memory whose augmentations exactly match an agent-supplied cue, as well as optional cue modifiers
129. e state is the other role that Soar long term knowledge may fulfill Such elaboration knowledge can simplify the encoding of operators because entailments of a set of core features of a state do not have to be explicitly included in application of the operator In Soar these inferences will be automatically retracted when the situation changes such that the inference no longer holds For instance our example blocks world task uses an elaboration to keep track of whether or not a block is clear The elaboration tests for the absence of a block that is on top of a particular block if there is no such on top the block is clear When an operator application creates a new on top the corresponding elaboration retracts and the block is no longer clear 2 1 9 Problem Spaces If we were to construct a Soar system that worked on a large number of different types of problems we would need to include large numbers of operators in our Soar program For a specific problem and a particular stage in problem solving only a subset of all possible operators are actually relevant For example if our goal is to count the blocks on the table operators having to do with moving blocks are probably not important although they may still be legal The operators that are relevant to current problem solving activity define the space of possible states that might be considered in solving a problem that is they define the problem
130. Contents (continued)
2.1.7 Applying the operator
2.1.8 Making inferences about the state
2.1.9 Problem Spaces
2.2 Working memory: The Current Situation
2.3 Production Memory: Long-term Knowledge
2.3.1 The structure of a production
2.3.2 Architectural roles of productions
2.3.3 Production Actions and Persistence
2.4 Preference memory: Selection Knowledge
2.4.1 Preference semantics
2.5 Soar's Execution Cycle: Without Substates
2.6 Impasses and Substates
2.6.1 Impasse Types
2.6.2 Creating New States
2.6.3 Results
2.6.4 Removal of Substates: Impasse Resolution
2.6.5 Soar's Cycle: With Substates
2.7 Learning
2.8 Input and Output
3 The Syntax of Soar Programs
3.1 Working Memory
3.1.1 Symbols
3.1.2 Objects
131. ed Rules matching the following requirement are flagged upon being created sourced a rule is a Soar RL rule if and only if its right hand side RHS consists of a single numeric preference and it is not a template rule see FLAGs below This format exists to ease technical requirements of identifying updating Soar RL rules as well as to make it easy for the agent programmer to add maintain RL capabilities within an agent See the Soar RL Manual for further details RULE FLAGS The optional FLAGs are given below Note that these switches are pre ceeded by a colon instead of a dash this is a Soar parser convention o support specifies that all the RHS actions are to be given o support when the production fires 8 1 BASIC COMMANDS FOR RUNNING SOAR 119 no support default chunk interrupt template specifies that all the RHS actions are only to be given i support when the production fires specifies that this production is a default production this matters for excise task and watch task specifies that this production is a chunk this matters for learn trace specifies that Soar should stop running when this production matches but before it fires this is a useful debugging tool specifies that this production should be used to generate new reinforcement learning rules by filling in those variables that match constants in working memory Multiple flags may be used but not both of o support and no support
132. Contents (continued)
3.1.3 Timetags
3.1.4 Acceptable preferences in working memory
3.1.5 Working Memory as a Graph
3.2 Preference Memory
3.3 Production Memory
3.3.1 Production Names
3.3.2 Documentation string (optional)
3.3.3 Production type (optional)
3.3.4 Comments (optional)
3.3.5 The condition side of productions (or LHS)
3.3.6 The action side of productions (or RHS)
3.4 Impasses in Working Memory and in Productions
3.4.1 Impasses in working memory
3.4.2 Testing for impasses in productions
3.5 Soar I/O: Input and Output in Soar
3.5.1 Overview of Soar I/O
3.5.2 Input and output in working memory
3.5.3 Input and output in production memory
4 Chunking
4.1 Chunk Creation
4.2 Determining Conditions and Actions
4.2.1 Determining a chunk's actions
4.2.2 Tracing the creation and reference of working memory elements
4.2.3 Determining a chunk's conditions
133. editor to this production Synopsis edit production production_name Options production name The name of the production to edit Description If an editor currently limited to Visual Soar is open and connected to Soar this command causes the editor to open the file containing this production and move the cursor to the start of the production If there is no editor connected to Soar the command does nothing In order to connect Visual Soar to Soar launch Visual Soar and choose Connect from the Soar Runtime menu Then open the Visual Soar project that you re working on At that point you re set up and edit production will start to work Examples edit production my production name See Also sp load library Load a shared library into the local client for the purpose of e g providing custom event handling Synopsis load library library_name arguments 202 CHAPTER 8 THE SOAR USER INTERFACE Options library name The root name of the library without the dll or so extension this is added for you depending on your platform arguments Whatever arguments the library s initialization function is expecting if any Description Sometimes a user will want to extend an existing environment For example the person may want to provide custom RHS functions or register for print events for the purpose of logging trace information If modifying the existing environmen
134. eference resolution: all operator preferences are input to the resolution procedure; each step may add or remove some operator candidates; only some steps may exit.
[Figure D.1: An illustration of the preference resolution process. There are eight steps; only five of these provide exits from the resolution process. Steps legible in the figure: RequireTest, AcceptableCollect, ProhibitFilter, RejectFilter, BestFilter, WorstFilter, IndifferentTest.]
• Otherwise: If there exists a required candidate that is also prohibited, a constraint-failure impasse with the required/prohibited value is recognized and preference semantics terminates.
• Otherwise: The candidates are passed to AcceptableCollect.
AcceptableCollect: This operation builds a list of op
135. el which is substructure of the state When Soar is doing internal problem solving it must know how to modify the state descrip tions appropriately when an operator is being applied If it is solving the problem in an external environment it must know what possible motor commands it can issue in order to affect its environment The example blocks world task described here does not interact with an external environ ment Therefore the Soar program directly makes changes to the state when operators are applied There are four changes that may need to be made when a block is moved in our task 1 The block that is being moved is no longer where it was it is no longer on top of the same thing 2 The block that is being moved is now in a new location it is on top of a new thing 3 The place that the block used to be is now clear 4 The place that the block is moving to is no longer clear unless it is the table which is always considered clear t In this blocks world task the table always has room for another block so it is represented as always being clear 12 CHAPTER 2 THE SOAR ARCHITECTURE The blocks world task could also be implemented using an external simulator In this case the Soar program does not update all the on top and clear relations the updated state description comes from the simulator 2 1 8 Making inferences about the state Making monotonic inferences about th
136. elected before the impasse, $r_2$ the reward received in the decision cycle immediately following, and $O_n$ the first operator selected after the impasse, then $O_1$ is updated with
$\delta = \alpha \left[ r_2 + \gamma Q(s_n, O_n) - Q(s_1, O_1) \right]$
If an RL operator is selected in a substate immediately prior to the state's retraction, the RL rules will be updated based only on the reward signals present and not on the Q-values of future operators. This point is not covered in traditional RL theory. The retraction of a substate corresponds to a suspension of the RL task in that state rather than its termination, so the last update assumes the lack of information about future rewards rather than the discontinuation of future rewards. To handle this case, the numeric-indifferent preference value of each RL rule is stored as two separate values: the expected current reward (ECR) and the expected future reward (EFR). The ECR is an estimate of the expected immediate reward signal for executing the corresponding RL operator. The EFR is an estimate of the time-discounted Q-value of the next RL operator. Normal updates correspond to traditional RL theory (showing the Sarsa case for simplicity):
$\delta_{ECR} = \alpha \left[ r_t - ECR(s_t, a_t) \right]$
$\delta_{EFR} = \alpha \left[ \gamma Q(s_{t+1}, a_{t+1}) - EFR(s_t, a_t) \right]$
$\delta_t = \delta_{ECR} + \delta_{EFR} = \alpha \left[ r_t + \gamma Q(s_{t+1}, a_{t+1}) - ECR(s_t, a_t) - EFR(s_t, a_t) \right] = \alpha \left[ r_t + \gamma Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]$
During substate retraction, only the ECR is updated based on the
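The split of a rule's value into ECR and EFR, as described in the excerpt above, can be illustrated with a short Python sketch. It is not Soar's implementation: the function name `rl_update`, its parameters, and the default learning rate and discount are hypothetical. It assumes the Sarsa-style update stated in the text, with Q(s_t, a_t) = ECR + EFR, and that only the ECR changes when the substate is being retracted.

```python
def rl_update(ecr, efr, reward, next_q, alpha=0.3, gamma=0.9, retracting=False):
    """Return the updated (ECR, EFR) pair for one RL rule.

    ecr, efr   -- current expected current/future reward of the rule
    reward     -- reward signal r_t
    next_q     -- Q(s_{t+1}, a_{t+1}) of the next RL operator (unused on retraction)
    retracting -- True when the substate is retracted: only the ECR is updated
    """
    delta_ecr = alpha * (reward - ecr)
    if retracting:
        # No information about future operators: leave the EFR untouched.
        return ecr + delta_ecr, efr
    delta_efr = alpha * (gamma * next_q - efr)
    return ecr + delta_ecr, efr + delta_efr
```

In the normal case the two deltas sum to alpha * (reward + gamma * next_q - (ecr + efr)), which is the ordinary Sarsa update on the combined Q-value.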
137. element to working memory Synopsis add wme id Jattribute value Default Aliases aw add wme Options id Must be an existing identifier Leading on attribute is optional attribute Attribute can be any Soar symbol Use to have Soar create a new identifier value Value can be any soar symbol Use to have Soar create a new identifier If the optional preference is specified its value must be acceptable Description Manually add an element to working memory add wme is often used by an input function to update Soar s information about the state of the external world add wme adds a new wme with the given id attribute value and optional preference The given id must be an existing identifier The attribute and value fields can be any Soar symbol If is given in the attribute or value field Soar creates a new identifier symbol for that field If the preference is given it can only have the value to indicate that an acceptable preference should be created for this wme Note that because the id must already exist in working memory the WME that you are adding will be attached directly or indirectly to the top level state As with other WME s any WME added via a call to add wme will automatically be removed from working memory once it is no longer attached to the top level state 8 6 SOAR I O COMMANDS 195 Examples This example adds the attribute value
138. emory is:
(identifier ^operator value +)
For example, if you run Soar with the example blocks-world program described in Appendix A, after the first operator has been selected you can again look at the top-level state using the wmes command:
soar> wmes s1
(3: S1 ^io I1)
(9: S1 ^ontop O3)
(10: S1 ^ontop O2)
(11: S1 ^ontop O1)
(48: S1 ^operator O4 +)
(49: S1 ^operator O5 +)
(50: S1 ^operator O6 +)
(51: S1 ^operator O7 +)
(54: S1 ^operator O7)
(52: S1 ^operator O8 +)
(53: S1 ^operator O9 +)
(4: S1 ^problem-space blocks)
(2: S1 ^superstate nil)
(5: S1 ^thing T1)
(8: S1 ^thing B1)
(6: S1 ^thing B3)
(7: S1 ^thing B2)
(1: S1 ^type state)
The state S1 has six augmentations of acceptable preferences for different operators (O4 through O9). These have plus signs following the value to denote that they are acceptable preferences. The state has exactly one operator, O7. This state corresponds to the illustration of working memory in Figure 2.4.
3.1.5 Working Memory as a Graph
Not only is working memory a set, it is also a graph structure where the identifiers are nodes, attributes are links, and constants are terminal nodes. Working memory is not an arbitrary graph, but a graph rooted in the states. Therefore all WMEs are linked either directly or
139. en added in to the chunk 4 4 Ordering Conditions Since the efficiency of the Rete matcher depends heavily upon the order of a production s conditions the chunking mechanism attempts to write the chunk s conditions in the most favorable order At each stage the condition ordering algorithm tries to determine which eligible condition if placed next will lead to the fewest number of partial instantiations when the chunk is matched A condition that matches an object with a multi valued attribute will lead to multiple partial instantiations so it is generally more efficient to place these conditions later in the ordering This is the same process that internally reorders the conditions in user defined productions as mentioned briefly in Section 2 3 1 4 5 Inhibition of Chunks When a chunk is built it may be able to match immediately with the same working memory elements that participated in its creation If the production s actions include preferences for new operators the production would immediately fire and create a preference for a new operator which duplicates the operator preference that was the original result of the subgoal To prevent this inhibition is used This means that each production that is built during chunking is considered to have already fired with the instantiation of the exact set of working memory elements used to create it This does not prevent a newly learned chunk from matching other working memory elements
140. enced during a decision. The cache is composed of double variables (i.e. 64 bits, currently), and the number of cache items is computed as follows:
$e^{(\text{decay\_thresh} - \ln(\text{max\_refs})) / \text{decay\_rate}}$
With the current default parameter values, this will incur about 1.04MB of memory. Holding the decay-rate constant, reasonable changes to decay-thresh (i.e. 5) do not greatly change this value. However, small changes to decay-rate will dramatically change this profile. For instance, keeping everything else constant, a decay-rate of 0.3 requires 2.7GB and 0.2 requires 50TB. Thus the max-pow-cache parameter serves to allow you to control the space vs. time tradeoff by capping the maximum amount of memory used by this cache. If max-pow-cache is much smaller than the result of the equation above, you may experience somewhat degraded performance due to relatively frequent system calls to pow. If forget-wme is lti and forgetting is on, only those WMEs whose id is a long-term identifier at the decision of forgetting will be removed from working memory. If, for instance, the id is stored to semantic memory after the decision of forgetting, the WME will not be removed till some time after the next WME reference (such as testing or creation by a rule).
Statistics
Working memory activation tracks statistics over the lifetime of the agent. These can be accessed using wma --stats <statistic>. Running wma --stats with
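The memory figures quoted in the excerpt above can be checked numerically. The sketch below is a hedged reconstruction: it reads the flattened formula as e^((decay_thresh - ln(max_refs)) / decay_rate) and multiplies by 8 bytes per double. The parameter values decay_thresh = -2.0, decay_rate = -0.5 and max_refs = 50 are assumptions chosen because they reproduce the quoted 1.04MB; they are not stated in this excerpt, and the function names are invented for the example.

```python
import math

BYTES_PER_DOUBLE = 8  # the cache holds 64-bit doubles


def pow_cache_items(decay_thresh, decay_rate, max_refs):
    """Number of cache items under the reconstructed formula (an assumption)."""
    return math.exp((decay_thresh - math.log(max_refs)) / decay_rate)


def pow_cache_bytes(decay_thresh=-2.0, decay_rate=-0.5, max_refs=50):
    """Approximate cache size in bytes for the assumed parameter values."""
    return pow_cache_items(decay_thresh, decay_rate, max_refs) * BYTES_PER_DOUBLE


if __name__ == "__main__":
    print(pow_cache_bytes() / 2**20, "MiB")                  # assumed defaults
    print(pow_cache_bytes(decay_rate=-0.3) / 2**30, "GiB")   # smaller decay rate
    print(pow_cache_bytes(decay_rate=-0.2) / 2**40, "TiB")   # smaller still
```

Under these assumptions, varying only the decay rate to -0.3 and -0.2 lands close to the 2.7GB and 50TB figures, which suggests that those two figures describe changes to the decay rate rather than to the threshold.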
141. ented situations have shown improved performance from decreasing cache pages to increase memory locality This is of greater concern when 98 CHAPTER 6 SEMANTIC MEMORY dealing with file based databases versus in memory The size of each page however may be important whether databases are disk or memory based This setting can have far reaching consequences such as index B tree depth While this setting can be dependent upon a particular situation a good heuristic is that short simple runs should use small values of the page size 1k 2k 4k whereas longer more complicated runs will benefit from larger values 8k 16k 32k 64k The episodic memory chapter see Section 7 4 on page 105 has some further empirical evidence to assist in setting these parameters for very large stores The next parameter is optimization The safety parameter setting will use SQLite default settings If data integrity is of importance this setting is ideal The performance setting will make use of lesser data consistency guarantees for significantly greater perfor mance First writes are no longer synchronous with the OS synchronous pragma thus semantic memory won t wait for writes to complete before continuing execution Second transaction journaling is turned off journal_mode pragma thus groups of modifications to the semantic store are not atomic and thus interruptions due to application os hardware failure could lead to inconsistent database state
142. epends on the machine and implementation you're using, but it is at least [-2 billion .. +2 billion].
• Floating-point constants (numbers). The range depends on the machine and implementation you're using.
• Symbolic constants. These are symbols with arbitrary names. A constant can use any combination of letters, digits, or $%&*+-/:<=>?_ Other characters, such as blank spaces, can be included by surrounding the complete constant name with vertical bars: |This is a constant|. (The vertical bars aren't part of the name; they're just notation.) A vertical bar can be included by prefacing it with a backslash inside surrounding vertical bars: |Odd-symbol\|name|
Identifiers should not be confused with constants, although they may look the same; identifiers are generated (by the Soar architecture) at runtime and will not necessarily be the same for repeated runs of the same program. Constants are specified in the Soar program and will be the same for repeated runs. Even when a constant looks like an identifier, it will not act like an identifier in terms of matching. A constant is printed surrounded by vertical bars whenever there is a possibility of confusing it with an identifier: |G37| is a constant while G37 is an identifier. To avoid possible confusion, you should not use letter-number combinations as constants or for production names.
3.1.2 Objects
Recall from Section 2.2 that all WMEs that share an identifier ar
143. er matches the situation has changed making the preference no longer relevant Soar automatically removes the preferences in such cases These preferences are said to have support for instantiation support Similarly state elaborations are simple inferences that are valid only so long as the production matches Working memory elements created as state elaborations also have I support and remain in working memory only as long as the production instantiation that created them continues to match working memory For example the set of relevant operators changes as the state changes thus the proposal of operators is done with I supported preferences This way the operator proposals will be retracted when they no longer apply to the current situation However the actions of productions that apply an operator either by adding or removing elements from working memory need to persist even after the operator is no longer selected and operator application production instantiation no longer matches For example in placing a block on another block a condition is that the second block be clear However the action of placing the first block removes the fact that the second block is clear so the condition will no longer be satisfied Thus operator application productions do not retract their actions even if they no longer match working memory This is called O support for operator support Working memory elements that participate in the a
144. er than an impasse The waitsnc command allows the user to switch to a mode where a state no change that would normally generate an impasse 180 CHAPTER 8 THE SOAR USER INTERFACE and subgoaling instead generates a wait state At a wait state the decision cycle will repeat and the decision cycle count is incremented but no state no change impasse and therefore no substate will be generated When issued with no arguments waitsnc returns its current setting wma Control the behavior of working memory activation Synopsis wma wma g get lt parameter gt wma s set lt parameter gt lt value gt wma S stats lt statistic gt wma t timers lt timer gt wma h history lt timetag gt Options g get Print current parameter setting s set Set parameter value S stats Print statistic summary or specific statistic t timers Print timer summary or specific timer h history Print reference history of a WME Description The wma command changes the behavior of and displays information about working memory activation To get the activation of individual WMEs use print i To get the reference history of an individual WME use wma h history lt timetag gt For example print internal s1 4000016 S1 ct 1000000 3 6 4 S1 epmem E1 1 11 S1 io I1 1 20 S1 max 1000000 3 4 18 S1 name ct 3 4 4000018 S1 operator 0100000
145. erators for which there is an acceptable preference in preference memory. This list of candidate operators is passed to the ProhibitFilter. ProhibitFilter: This filter removes the candidates that have prohibit preferences in memory. The rest of the candidates are passed to the RejectFilter. RejectFilter: This filter removes the candidates that have reject preferences in memory. • At this point, if the set of remaining candidates is either empty or has one member, preference semantics terminates and this set is returned. • Otherwise, the remaining candidates are passed to the BetterWorseFilter. BetterWorseFilter (>, <): This filter removes any candidates that are worse than another candidate. • If the set of remaining candidates is empty, a conflict impasse is created, returning the set of conflicted operators (all candidates passed to this filter). • Otherwise, pass any remaining candidates to the BestFilter. BestFilter (>): If some remaining candidate has a best preference, this filter removes any candidates that do not have a best preference. If there are no best preferences for any of the current candidates, the filter has no effect. The remaining candidates are passed to the WorstFilter. WorstFilter (<): If all remaining candidates have worst preferences, this filter has no effect. Otherwise, the filter removes any candidates that have a worst preference. • Once again, if the set of remaining candid
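The filter sequence in this entry can be sketched in Python. This is a minimal illustration under stated assumptions, not Soar's implementation: the `Preferences` container, the `run_filters` function, and the encoding of better/worse knowledge as (worse, better) pairs are hypothetical names chosen for the sketch. Only the five filters named in the passage are covered; steps that come before or after them are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Preferences:
    """Hypothetical container for the preferences relevant to one decision."""
    prohibit: set = field(default_factory=set)
    reject: set = field(default_factory=set)
    worse_than: set = field(default_factory=set)  # pairs (worse, better)
    best: set = field(default_factory=set)
    worst: set = field(default_factory=set)


def run_filters(candidates, prefs):
    """Return (remaining candidates, impasse label or None)."""
    # ProhibitFilter, then RejectFilter.
    remaining = set(candidates) - prefs.prohibit
    remaining -= prefs.reject
    if len(remaining) <= 1:
        # Early exit: empty set or a single member is returned as is.
        return remaining, None

    # BetterWorseFilter: drop candidates that are worse than another candidate.
    passed_in = set(remaining)
    worse = {w for (w, b) in prefs.worse_than if w in passed_in and b in passed_in}
    remaining = passed_in - worse
    if not remaining:
        # Conflict: return all candidates that were passed to this filter.
        return passed_in, "conflict"

    # BestFilter: if any candidate is best, keep only the best ones.
    best = remaining & prefs.best
    if best:
        remaining = best

    # WorstFilter: no effect if every remaining candidate is worst.
    if not remaining <= prefs.worst:
        remaining -= prefs.worst

    return remaining, None
```

For instance, with candidates A, B, C, D, a prohibit preference for D, B worse than A, and a worst preference for C, the sketch leaves only A.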
146. erences represent constraints on the legal selections that can be made for a decision, and if they conflict, no progress can be made from the current situation and the impasse cannot be resolved by additional preferences. No-change impasse: A no-change impasse arises if a new operator is not selected during the decision procedure. There are two types of no-change impasses: state no-change and operator no-change. State no-change impasse: A state no-change impasse occurs when there are no acceptable (or require) preferences to suggest operators for the current state (or all the acceptable values have also been rejected). The decision procedure cannot select a new operator. Operator no-change impasse: An operator no-change impasse occurs when either a new operator is selected for the current state but no additional productions match during the application phase, or a new operator is not selected during the next decision phase. There can be only one type of impasse at a given level of subgoaling at a time. Given the semantics of the preferences, it is possible to have a tie or conflict impasse and a constraint-failure impasse at the same time. In these cases, Soar detects only the constraint-failure impasse. The impasse is detected during the selection of the operator, but happens because one of the other four problem-solving functions was incomplete. 2.6.2 Creating New States: Soar handles these inconsistencies by creating a new state
147. erent preferences created by specially formulated productions called RL rules. RL rules are identified by syntax. A production is an RL rule if and only if its left-hand side tests for a proposed operator, its right-hand side creates a single numeric-indifferent preference, and it is not a template rule (see 5.4.2). These constraints ease the technical requirements of identifying/updating RL rules and make it easy for the agent programmer to add/maintain RL capabilities within an agent. We define an RL operator as an operator with numeric-indifferent preferences created by RL rules. The following is an RL rule: sp {rl*3*12*left (state <s> ^name task-name ^x 3 ^y 12 ^operator <o> +) (<o> ^name move ^direction left) --> (<s> ^operator <o> = 1.5)} [Footnote 1: In this context, the term "state" refers to the state of the task or environment, not a state identifier. For the rest of this chapter, bold capital letter names such as S1 will refer to identifiers and italic lowercase names such as s1 will refer to task states.] CHAPTER 5. REINFORCEMENT LEARNING. Note that the LHS of the rule can test for anything as long as it contains a test for a proposed operator. The RHS is constrained to exactly one action: creating a numeric-indifferent preference for the proposed operator. The following are not RL rules: sp {multiple*preferences (state <s> ^operator <o> +) --> (<s
148. es state values [Figure 3.1: A semantic net illustration of four objects in working memory.] indirectly to a state. The impact of this constraint is that all WMEs created by actions are linked to WMEs tested in the conditions. The link is one-way, from the identifier to the value. Less commonly, the attribute of a WME may be an identifier. Figure 3.1 illustrates four objects in working memory; the object with identifier X44 has been linked to the object with identifier O43, using the attribute as the link, rather than the value. The objects in working memory illustrated by this figure are: (O43 ^isa apple ^color red ^inside O53 ^size small ^X44 200) (O87 ^isa ball ^color red ^inside O53 ^size big) (O53 ^isa box ^size large ^color orange ^contains O43 O87) (X44 ^unit grams ^property mass). In this example, object O43 and object O87 are both linked to object O53 through (O53 ^contains O43) and (O53 ^contains O87), respectively (the contains attribute is a multi-valued attribute). Likewise, object O53 is linked to object O43 through (O43 ^inside O53) and linked to object O87 through (O87 ^inside O53). Object X44 is linked to object O43 through (O43 ^X44 200). Links are transitive, so that X44 is linked to O53 (because O43 is linked to O53 and X44 is linked to O43). However, since links are not symmetric, O53 is not linked to X44. CHAPTER 3. THE SYNTAX OF SOAR PROGRAMS. 3.2 Preference Memory
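The one-way, transitive linkage described in this entry amounts to graph reachability, which can be sketched in Python. This is an illustration only: the triple encoding, the `WMES` list, and the `is_linked` helper are assumptions made for the sketch, and the identifiers are written with a leading letter O (O43, O53, O87) as Soar identifiers are.

```python
from collections import deque

# (identifier, attribute, value) triples for the four example objects.
WMES = [
    ("O43", "isa", "apple"), ("O43", "color", "red"),
    ("O43", "inside", "O53"), ("O43", "size", "small"), ("O43", "X44", 200),
    ("O87", "isa", "ball"), ("O87", "color", "red"),
    ("O87", "inside", "O53"), ("O87", "size", "big"),
    ("O53", "isa", "box"), ("O53", "size", "large"),
    ("O53", "color", "orange"),
    ("O53", "contains", "O43"), ("O53", "contains", "O87"),
    ("X44", "unit", "grams"), ("X44", "property", "mass"),
]


def is_linked(target, source, wmes=WMES):
    """True if `target` is linked to `source`, i.e. reachable from `source`
    by following WMEs from an identifier to its attribute or value."""
    identifiers = {ident for (ident, _, _) in wmes}
    seen = {source}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for ident, attr, value in wmes:
            if ident != current:
                continue
            # The attribute can also act as the link (as with ^X44 200).
            for nxt in (attr, value):
                if nxt == target:
                    return True
                if nxt in identifiers and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False
```

Here `is_linked("X44", "O53")` holds because O53 reaches O43 through `contains` and O43 reaches X44 through its attribute, while `is_linked("O53", "X44")` does not hold because X44 has no augmentation that leads back to O53.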
149. es the firing and retraction of rules starting from those matching the oldest substate to the newest. Whenever a production fires or retracts, changes are made to working memory and preference memory, possibly changing which productions will match at the lower levels (productions firing within a given level are fired in parallel, simulated). Production firings at higher levels can resolve impasses and thus eliminate lower states before the productions at the lower level ever fire. Thus, whenever a level in the state stack is reached, all production activity is guaranteed to be consistent with any processing that has occurred at higher levels. 2.7 Learning: When an operator impasse is resolved, it means that Soar has, through problem solving, gained access to knowledge that was not readily available before. Therefore, when an impasse is resolved, Soar has an opportunity to learn, by summarizing and generalizing the processing in the substate. One of Soar's learning mechanisms is called chunking; it attempts to create a new production, called a chunk. The conditions of the chunk are the elements of the state that (through some chain of production firings) allowed the impasse to be resolved; the action of the production is the working memory element or preference that resolved the impasse (the result of the impasse). The conditions and action are variablized so that this new production may match in a similar situation in the future and prevent a
150. et of the candidate operators either the empty set a set consisting of a sin gle winning candidate or a larger set of candidates that may be conflicting tied or indifferent 2 an impasse type possibly NONE_IMPASSE_TYPE The procedure has several potential exit points Some occur when the procedure has detected a particular type of impasse The others occur when the number of candidates has been reduced to one necessarily the winner or zero a no change impasse Each step in Figure D 1 is described below RequireTest This test checks for required candidates in preference memory and also constraint failure impasses involving require preferences see Section 2 6 on page 23 e If there is exactly one candidate operator with a require preference and that candidate does not have a prohibit preference then that candidate is the winner and preference semantics terminates e Otherwise If there is more than one required candidate then a constraint failure impasse is recognized and preference semantics terminates by returning the set of required candidates 217 218 all nonprohibited candidates are passed on all nonrejected candidates are passed on pass along only candidates that are not worse pass along only candidates that are best if none pass on all candidates all acceptable candidates are passed on all nonworst candidates are passed on APPENDIX D THE RESOLUTION OF OPERATOR PREFERENCES Pr
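The two RequireTest outcomes spelled out in this entry can be sketched in Python. The function name `require_test`, the outcome labels, and the set-based inputs are hypothetical and chosen only for illustration. Any case the passage does not describe is reported as "not-covered" rather than guessed at.

```python
def require_test(required, prohibited):
    """Classify a decision using only the two cases the passage describes.

    required   -- candidates that have a require preference
    prohibited -- candidates that have a prohibit preference
    Returns (outcome label, set of candidates).
    """
    required = set(required)
    prohibited = set(prohibited)

    if len(required) == 1:
        (only,) = required
        if only not in prohibited:
            # Exactly one required candidate, not prohibited: it wins.
            return "winner", {only}
    elif len(required) > 1:
        # More than one required candidate: constraint-failure impasse,
        # returning the set of required candidates.
        return "constraint-failure", required

    # Everything else is outside what this sketch covers.
    return "not-covered", set()
```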
151. example when you are watching Soar run and looking at the specific productions that are firing and retracting Since Soar uses white space to delimit components of a production if whitespace inadvertently occurs in the production name Soar will complain that an open parenthesis was expected to start the first condition 3 3 2 Documentation string optional A production may contain an optional documentation string The syntax for a documenta tion string is that it is enclosed in double quotes and appears after the name of the production and before the first condition and may carry over to multiple lines The documentation string allows the inclusion of internal documentation about the production it will be printed out when the production is printed using the print command 3 3 3 Production type optional A production may also include an optional production type which may specify that the production should be considered a default production default or a chunk chunk or may specify that a production should be given O support o support or I support i support Users are discouraged from using these types These types are described in Section 8 1 which begins on Page 117 There is one additional flag interrupt which can be placed at this location in a produc tion However this flag does not specify a production type but is a signal that the production should be marked for special debugging capabilities For more information
152. fferent will be examined to determine the selection. Note that if a value (that is not rejected or prohibited) is better than a best value, the better value will be selected. (This result is counter-intuitive, but allows explicit knowledge about the relative worth of two values to dominate knowledge of only a single value. A require preference should be used when a value must be selected for the goal to be achieved.) Worst (<): A worst preference states that the value should be selected only if there are no alternatives. It allows for a simple type of default specification. The semantics of the worst preference are similar to those for the best preference. Indifferent (=): An indifferent preference states that there is positive knowledge that it does not matter which value is selected. This may be a binary preference, to say that two values are mutually indifferent, or a unary preference, to say that a single value is as good or as bad a choice as other expected alternatives. When indifferent preferences are used to signal that it does not matter which operator is selected, by default, Soar chooses randomly from among the alternatives. (The indifferent-selection function can be used to change this behavior, as described on page 157 in Chapter 8.) Numeric-Indifferent (= number): A numeric-indifferent preference is used to bias the random selection from mutually indifferent values. This preference includes a unary indifferent preference, so an opera
153. fficient preferences have been generated so that one of the operators for state S2 can be selected When state S3 is removed operator 09 will also be removed as will the acceptable preferences for 07 08 and 09 and the impasse attribute and choices augmentations of state 3 These working memory elements are removed because they are no longer linked to the subgoal stack The acceptable preferences for operators 04 05 and 06 remain in working memory They were linked to state 3 but since they are also linked to state S2 they will stay in working memory until S2 is removed or until they are retracted or rejected 30 CHAPTER 2 THE SOAR ARCHITECTURE 2 6 5 Soar s Cycle With Substates When there are multiple substates Soar s cycle remains basically the same but has a few minor changes The first change is that during the decision procedure Soar will detect impasses and create new substates For example following the proposal phase the decision phase will detect if a decision cannot be made given the current preferences If an impasse arises a new substate is created and added to working memory The second change when there are multiple substates is that at each phase Soar goes through the substates from oldest highest to newest lowest completing any necessary processing at that level for that phase before doing any processing in the next substate When firing productions for the proposal or application phases Soar process
154. for WMEs that are added or retracted as productions fire and retract Note that detailed information about WME s will be printed only for productions that are being watched Option Flag Description n nowmes When watching productions do not print any information about matching wmes t timetags When watching productions print only the timetags for matching wmes f fullwmes When watching productions print the full matching wmes Watching Learning Option Flag Argument to Option Description L learning noprint print or fullprint Controls the printing of see table below chunks justifications as they are created As Soar is running it may create justifications and chunks which are added to production memory The watch command allows users to monitor when chunks and justifications are created by specifying one of the following arguments to the learning command 8 3 CONFIGURING TRACE INFORMATION AND DEBUGGING 145 Argument Alias Effect noprint 0 Print nothing about new chunks or justifications default print 1 Print the names of new chunks and justifications when created fullprint 2 Print entire chunks and justifications when created Watching other Functions Option Flag Argument to Option Description a wma remove optional Print log of working mem ory activation events b backtracing remove optional Print back
155. g algorithm In Soar 8 whenever an o supported WME is created in the local state the superstate dependencies of that new feature are determined and added to the goal dependency set GDS of that state Conceptually speaking whenever a working memory change occurs the dependency sets for every state in the context hierarchy are compared to working memory changes If a removed element is found in a GDS the state is removed from memory along with all existing substructure The dependency set includes only dependencies for o supported features For example in Figure E 2 at time tg because only i supported features have been created in the subgoal the dependency set is empty Three types of features can be tested in the creation of an o supported feature Each requires a slightly different type of update to the dependency set Elements in the superstate WMEs in the superstate are added directly to the goal s dependency set In Figure E 2 the persistent subgoal item 3 is dependent upon A and D These superstate WMEs are added to the subgoal s dependency set when 3 is added to working memory at time t It does not matter that A is i supported and D o supported Local I Supported Features Local i supported features are not added to the goal depen dency set Instead the superstate WMEs that led to the creation of the i supported feature are determined and added to the GDS In the example when 4 is created A B and C must be added to
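The bookkeeping described in this entry can be sketched in Python. This is a simplified illustration, not Soar's data structures: the `Substate` class, its method names, and the feature names in the usage note are hypothetical. Only the two cases the passage spells out are handled (superstate WMEs, and local i-supported features traced back to superstate WMEs).

```python
class Substate:
    """Hypothetical substate that tracks its goal dependency set (GDS)."""

    def __init__(self, name, superstate_wmes):
        self.name = name
        self.superstate_wmes = set(superstate_wmes)
        self.gds = set()
        # Local i-supported feature -> superstate WMEs that led to it.
        self._i_sources = {}

    def _superstate_dependencies(self, tested):
        deps = set()
        for wme in tested:
            if wme in self.superstate_wmes:
                # Superstate WMEs are added directly.
                deps.add(wme)
            else:
                # Local i-supported features contribute the superstate
                # WMEs behind them, not themselves.
                deps |= self._i_sources.get(wme, set())
        return deps

    def add_i_supported(self, feature, tested):
        # I-supported features never change the GDS by themselves.
        self._i_sources[feature] = self._superstate_dependencies(tested)

    def add_o_supported(self, feature, tested):
        # Creating an o-supported feature extends the GDS.
        self.gds |= self._superstate_dependencies(tested)


def states_removed_by(stack, removed_wme):
    """Given states ordered oldest to newest, return the states removed when
    `removed_wme` leaves working memory: the first state whose GDS contains
    it, plus everything below that state."""
    for depth, state in enumerate(stack):
        if removed_wme in state.gds:
            return stack[depth:]
    return []
```

As a usage sketch with made-up features: an i-supported feature f1 that tested A and B leaves the GDS empty; an o-supported feature that tested A and D makes the GDS {A, D}; a later o-supported feature that tested f1 and C adds A, B and C. Removing B from working memory then removes the substate.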
156. g the value This allows a production to test the existence of a candidate operator and its properties and possibly create a preference for it before it is selected In the example below operator lt o gt matches the acceptable preference for the operator augmentation of the state This does not test that operator lt o gt has been selected as the current operator sp blocks example production conditions state operator lt o gt table lt t gt lt o gt name move block gt T In the example below the production tests the state for acceptable preferences for two different operators and also tests that these operators move different blocks sp blocks example production conditions state operator lt 01 gt lt 02 gt table lt t gt lt 01 gt name move block moving block lt m1 gt destination lt d1 gt lt o2 gt name move block moving block lt m2 gt lt gt lt m1 gt destination lt d2 gt gt 3 3 PRODUCTION MEMORY 49 3 3 5 10 Attribute tests The previous examples applied all of the different test to the values of working memory elements All of the tests that can be used for values can also be used for attributes and identifiers except those including constants Variables in attributes Variables may be used with attributes as in sp blocks example production conditions state lt s gt operator lt o gt thing lt t gt lt gt lt t g
157. gt volume 5 contents 3 gt OON OA KBPWNH HE ererrrerrrr rererere oono AUNEBEO 20 21 22 lt s3 gt operator lt o1 gt state lt s2 gt name water jug state lt s1 gt name water jug lt si gt desired lt d1 gt lt s2 gt desired lt d1 gt lt s1 gt operator lt 01 gt lt 01 gt name pour lt 01 gt into lt n1 gt lt ni gt volume 3 lt n1 gt contents 0 lt si gt jug lt n1 gt lt si gt problem space lt p1 gt lt p1 gt name water jug lt s2 gt jug lt n4 gt lt n4 gt volume 3 lt n4 gt contents 3 lt s2 gt jug lt n3 gt lt n3 gt volume 5 lt n3 gt contents 0 lt si gt jug lt n2 gt lt n2 gt volume 5 lt n2 gt contents 3 lt ol gt jug lt n2 gt Further examining condition 21 Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground Ground S3 name water jug S5 name water jug S5 desired D1 S3 desired D1 S5 operator 018 018 name pour 018 into N3 N3 volume 3 N3 contents 0 S5 jug N3 S5 problem space P3 P3 name water jug S3 jug N1 N1 volume 3 N1 contents 3 S3 jug N2 N2 volume 5 N2 contents 0
158. gure E 4 The GDS and WME data structures 230 APPENDIX E A GOAL DEPENDENCY SET PRIMER Index 1 58 217 amp 58 48 58 219 45 58 219 00 lt 43 58 219 lt lt gt gt 44 49 lt 43 lt gt 43 lt gt 43 43 58 219 gt see best preference 43 58 219 gt 43 58 carat symbol 33 58 219 29 acceptable preference 48 219 action side of production 56 action side grammar 214 add wme 194 alias 198 allocate 199 arithmetic operations 60 attribute 8 13 14 33 34 multi valued attribute 35 augmentation see working memory element backtracing 75 76 best preference see best preference 219 better preference 219 bottom up chunking 74 capitalize symbol 64 capture input 195 carriage return line feed 60 cd 184 chunk 30 overgeneral 27 chunk name format 134 chunking see learning 73 actions 75 bottom up 74 conditions 76 77 creation 73 determining actions 75 determining conditions 76 duplicate chunks 74 incorrect chunks 78 negated conditions 75 79 ordering conditions 77 overgeneral 78 refractory inhibition 77 variablization 77 when active 73 clog 185 cmd 65 command to file 186 comments 40 compute 60 concat 64 condition acceptable preference 48 condition side 41 condition side grammar 213 Conditions 41 conflict impasse 24 67 conjunctive conditions 44 negation 46 constant 34 214 constraint
159. hat memory is placed here. This WME is an identifier that is treated as the root of the state that was used to create the episodic memory. If the retrieve command was issued with an invalid time, the value of this WME will be no-memory. • success <query> <optional neg-query>: If the cue-based retrieval was successful, the WME will have the status as the attribute and the value of the identifier of the query (and neg-query, if applicable). • match-score: This WME is created whenever an episode is successfully retrieved from a cue-based retrieval command. The WME value is a decimal indicating the raw match score for that episode with respect to the cue(s). • cue-size: This WME is created whenever an episode is successfully retrieved from a cue-based retrieval command. The WME value is an integer indicating the number of leaf WMEs in the cue(s). • normalized-match-score: This WME is created whenever an episode is successfully retrieved from a cue-based retrieval command. The WME value is the decimal result of dividing the raw match score by the cue size. It can hypothetically be used as a measure of episodic memory's relative confidence in the retrieval. • match-cardinality: This WME is created whenever an episode is successfully retrieved from a cue-based retrieval command. The WME value is an integer indicating the number of leaf WMEs matched in the query cue minus those matched in the neg-query cue. • memory-id 7.4 PE
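The two derived values in this entry are simple arithmetic, shown here as a small Python sketch. The function and parameter names are hypothetical; only the formulas (raw match score divided by cue size, and query matches minus neg-query matches) come from the text.

```python
def normalized_match_score(match_score, cue_size):
    """Raw match score divided by the number of leaf WMEs in the cue(s)."""
    if cue_size <= 0:
        raise ValueError("cue_size must be a positive integer")
    return match_score / cue_size


def match_cardinality(query_leaves_matched, neg_query_leaves_matched=0):
    """Leaf WMEs matched in the query cue minus those matched in the neg-query cue."""
    return query_leaves_matched - neg_query_leaves_matched
```

For example, a raw match score of 3.0 against a cue with 4 leaf WMEs normalizes to 0.75, and 5 query matches with 2 neg-query matches give a cardinality of 3.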
160. he preferences and working memory elements in its actions are considered to be created in the most recent of those states, and are not considered to have been created in the other states. The architecture automatically detects if a preference or working memory element created in a substate is also linked to a superstate. These working memory elements and preferences will not be removed when the impasse is resolved because they are still linked to a superstate, and therefore they are called the results of the subgoal. A result has either I-support or O-support; the determination of support is described below. A working memory element or preference will be a result if its identifier is already linked to a superstate. A working memory element or preference can also become a result indirectly [Footnote 3: The original state is the "top" of the stack because as Soar runs, this state (created first) will be at the top of the computer screen, and substates will appear on the screen below the top-level state.] [Figure residue; legend: state and operator objects, other objects, operator decisions that have not yet been made, acceptable preferences for operators; labels: top-level state, no-change, attribute operator, superstate, choices none, tie; annotations: "This subgoal was created because Soar didn't know how to apply operator O2 in state S1", "No operator has been selected yet for S2".]
161. he Dependency Set in Soar 8 2 a 225 The algorithm for determining members of the GDS 225 The GDS and WME data structures 2 22 0000 229 vi LIST OF FIGURES Chapter 1 Introduction Soar has been developed to be an architecture for constructing general intelligent systems It has been in use since 1983 and has evolved through many different versions This manual documents the most current of these Soar version 9 3 2 Our goals for Soar include that it is to be an architecture that can be used to build systems that work on the full range of tasks expected of an intelligent agent from highly routine to extremely difficult open ended problems represent and use appropriate forms of knowledge such as procedural declarative episodic and possibly iconic employ the full range of problem solving methods interact with the outside world and learn about all aspects of the tasks and its performance on those tasks In other words our intention is for Soar to support all the capabilities required of a general intelligent agent Below are the major principles that are the cornerstones of Soar s design ds The number of distinct architectural mechanisms should be minimized Classically Soar had a single representation of permanent knowledge productions a single representa tion of temporary knowledge objects with attributes and values a single mechanism for generating goals automatic subgo
162. he basis of the chunk s actions 4 2 2 Tracing the creation and reference of working memory ele ments Soar automatically maintains information on the creation of each working memory element in every state When a production fires a trace of the production is saved with the appro priate state A trace is a list of the working memory elements matched by the production s conditions together with the actions created by the production The appropriate state is the most recently created state i e the state lowest in the subgoal hierarchy that occurs in the production s matched working memory elements Recall that when a subgoal is created the item augmentation lists all values that lead to the impasse Chunking is complicated by the fact that the item augmentation of the substate is created by the architecture and not by productions Backtracing cannot determine the cause of these substate augmentations in the same way as other working memory elements To overcome this Soar maps these augmentations onto the acceptable preferences for the operators in the item augmentations Negated conditions Negated conditions are included in a trace in the following way when a production fires its negated conditions are fully instantiated with its variables appropriate values This instan tiation is based on the working memory elements that matched the production s positive conditions If the variable is not used in any positive condit
163. he new directory execute the command and then pop back to the current working directory from which the command was issued After the source completes the number of productions sourced and excised is printed agent gt source demos mac mac soar BEA k k kkk kkk k Total 18 productions sourced Source finished agent gt source demos mac mac soar HEK HEK HE HK HK HEK HEK HEK HK Hk Hk HE Hk Hk Hk Hk Hk Hk Total 18 productions sourced 18 productions excised Source finished This can be disabled by using the d flag agent gt source demos mac mac soar d FR A A ACA kk K k K K KKK KK Source finished agent gt source demos mac mac soar d HIHHH HkHkHkHkHxHkHkHkHkHxHxH kHH Source finished A list of excised productions is available using the v flag agent gt source demos mac mac soar v HxH HKHK HKH Hk Hk Hk Hk Hk HH HH HHH Total 18 productions sourced 18 productions excised Excised productions macxdetectxstatexsuccess macxevaluatexstatexfailurexmore xcannibals monitor move boat monitor state left A separate summary for each file sourced can be enabled using the a flag 8 6 SOAR I O COMMANDS 193 agent gt source demos mac mac soar a _firstload soar 0 productions sourced all_source soar 0 productions sourced goal test soar 2 productions sourced 40K monitor soar 3 productions sourced kkk search control soar 4 productions sourced top state soar 0 productions sourced elaborations_
164. he retrieval mechanism implements some basic query optimization: statistics are maintained about all stored knowledge. When a query is issued, semantic memory re-orders the cue so as to minimize expected query time. Because only perfect matches are acceptable, and there is no symbol variablization, semantic memory retrievals do not contend with the same combinatorial search space as the rete. Preliminary empirical study shows that semantic memory maintains sub-millisecond retrieval time for a large class of queries, even in very large stores (millions of nodes/edges). Once the number of long-term identifiers overcomes initial overhead (about 1000 WMEs), initial empirical study shows that semantic storage requires far less than 1KB per stored WME. 6.5.1 Performance Tweaking: When using a database stored to disk, several parameters become crucial to performance. The first is lazy-commit, which controls when database changes are written to disk. The default setting (on) will keep all writes in memory and only commit to disk upon re-initialization (quitting the agent or issuing the init command). The off setting will write each change to disk and thus incurs massive I/O delay. The next parameter is thresh. This has to do with the locality of storing/updating activation information with semantic augmentations. By default, all WME augmentations are incrementally sorted by activation, such that cue-based retrievals need not sort large number of
165. hed an impasse in problem solving, and a new substate is created. Impasses are discussed in Section 2.6. In our blocks world example, the second case holds, and Soar can select one of the operators randomly. 2.1.7 Applying the operator: An operator applies by making changes to the state; the specific changes that are appropriate depend on the operator and the current state. There are two primary approaches to modifying the state: indirect and direct. Indirect changes are used in Soar programs that interact with an external environment: the Soar program sends motor commands to the external environment and monitors the external environment for changes. The changes are reflected in an updated state description, garnered from sensors. Soar may also make direct changes to the state; these correspond to Soar doing problem solving "in its head". Soar programs that do not interact with an external environment can make only direct changes to the state. Internal and external problem solving should not be viewed as mutually exclusive activities in Soar. Soar programs that interact with an external environment will generally have operators that make direct and indirect changes to the state: the motor command is represented as substructure of the state and it is a command to the environment. Also, a Soar program may maintain an internal model of how it expects an external operator will modify the world; if so, the operator must update the internal mod
166. hen the current warnings status is printed At startup warnings are initially enabled If warnings are disabled using this command then some warnings may still be printed since some are considered too important to ignore The warnings that are printed apply to the syntax of the productions to notify the user when they are not in the correct syntax When a lefthand side error is discovered such as conditions that are not linked to a common state or impasse object the production is generally loaded into production memory anyway although this production may never match or may seriously slow down the matching process In this case a warning would be printed only if warnings were on Righthand side errors such as preferences that are not linked to the state usually result in the production not being loaded and a warning regardless of the warnings setting watch Control the run time tracing of Soar Synopsis watch options watch level Default Aliases w watch Options When appropriate a specific option may be turned off using the remove argument This argument has a numeric alias you can use 0 for remove A mix of formats is acceptable even in the same command line 8 3 CONFIGURING TRACE INFORMATION AND DEBUGGING Basic Watch Settings 143 Option Flag Argument to Option Description 1 level 0 to 5 see Watch Levels below This flag is optional but recom mended Set a specif
167. hich resulted in inclusion in the chunk justification A ground is a working memory element WME which was tested in the supergoal Just knowing which WME was tested may be enough to explain why the chunk justification exists If not the second variant where n is the condition of interest can be used to obtain a list of the productions which fired to obtain this condition in the chunk justification and the crucial WMEs tested along the way save backtraces mode must be on when a chunk or justification is created or no explanation will be available Calling explain backtraces with no argument prints a list of all chunks and justifications for which backtracing information is available Use with no arguments to list all productions that can be explained 156 Examples CHAPTER 8 THE SOAR USER INTERFACE Examining the chunk chunk 65 d13 tie 2 generated in a water jug task soar gt explain backtraces chunk 65 d13 tie 2 sp chunk 65 d13 tie 2 state lt s2 gt name water jug jug lt n4 gt jug lt n3 gt state lt s1 gt name water jug desired lt d1 gt operator lt ol gt jug lt n1 gt jug lt n2 gt lt s2 gt desired lt d1 gt lt ol gt name pour into lt n1 gt jug lt n2 gt lt ni gt volume 3 contents 0 lt s1 gt problem space lt p1 gt lt p1 gt name water jug lt n4 gt volume 3 contents 3 lt n3 gt volume 5 contents 0 lt n2
168. ic watch level using an integer 0 to 5; this is an inclusive operation. -N, --none (no argument): Turns off all printing about Soar's internals; equivalent to --level 0. -d, --decisions (remove, optional): Controls whether state and operator decisions are printed as they are made. -p, --phases (remove, optional): Controls whether decision cycle phase names are printed as Soar executes. -g, --gds (remove, optional): Controls printing of warnings about wme changes to GDS. -P, --productions (remove, optional): Controls whether the names of productions are printed as they fire and retract; equivalent to -Dujc. -w, --wmes (remove, optional): Controls the printing of working memory elements that are added and deleted as productions are fired and retracted. -r, --preferences (remove, optional): Controls whether the preferences generated by the traced productions are printed when those productions fire or retract. Watch Levels: Use of the --level (-l) flag is optional but recommended. 0: watch nothing; equivalent to -N. 1: watch decisions; equivalent to -d. 2: watch phases, gds, and decisions; equivalent to -dpg. 3: watch productions, phases, and decisions; equivalent to -dpgP. 4: watch wmes, productions, phases, and decisions; equivalent to -dpgPw. 5: watch preferences, wmes, productions, phases, and decisions; equivalent to -dpgPwr. It
169. ied memory pool without running Soar.

Synopsis

allocate [pool blocks]

Description

Soar allocates blocks of memory for its memory pools as it is needed during a run (or during other actions like loading productions). Unfortunately, this behavior translates to an increased run time for the first run of a memory-intensive agent. To mitigate this, blocks can be allocated before a run by using this command. Issuing the command with no parameters lists current pool usage, exactly like the stats command's memory flag. Issuing the command with part of a pool's name and a positive integer will allocate that many additional blocks for the specified pool. Only the first few letters of the pool's name are necessary. If more than one pool starts with the given letters, which pool will be chosen is unspecified. Memory pool block size in this context is approximately 32 kilobytes, the exact size determined during agent initialization.

See Also

stats

echo-commands

Set whether or not commands are echoed to other connected debuggers.

Synopsis

echo-commands [-yn]

Options

  -n, --no   Do not echo commands.
  -y, --yes  Do echo commands.

Description

Setting this on will echo typed commands to other connected debuggers. Otherwise, the output is displayed without the initiating command, and this can be confusing.

edit-production

Move focus in an
170. ilable heuristics are depth-first search (dfs) and most constrained variable (mcv). It is advised that you attempt these heuristics to improve performance if the query_graph_match timer reveals that graph matching is dominating retrieval time.

The merge parameter controls how the augmentations of retrieved long-term identifiers (LTIs) interact with an existing LTI in working memory. If the LTI is not in working memory or has no augmentations in working memory, this parameter has no effect. If the LTI is in working memory and has augmentations, by default (none), episodic memory will not augment the LTI. If the parameter is set to add, then any augmentations that augmented the LTI in a retrieved episode are added to working memory.

Statistics

Episodic memory tracks statistics over the lifetime of the agent. These can be accessed using epmem --stats <statistic>. Running epmem --stats without a statistic will list the values of all statistics. Unlike timers, statistics will always be updated. Available statistics are:

  Name            Label             Description
  time            Time              Current episode ID
  db-lib-version  SQLite Version    SQLite library version
  mem-usage       Memory Usage      Current SQLite memory usage in bytes
  mem-high        Memory Highwater  High SQLite memory usage watermark in bytes
  queries         Queries           Number of times the query command has been processed
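The behaviour of the merge parameter described above can be sketched as a small function over sets of augmentations. This is a hedged model, not Soar's implementation: the function name merge_retrieved_lti and the representation of augmentations as (attribute, value) pairs are hypothetical, and the handling of an LTI that is absent or bare in working memory (the retrieved augmentations are simply installed) is an assumption drawn from the statement that the parameter "has no effect" in that case.

```python
def merge_retrieved_lti(wm_augs, retrieved_augs, merge="none"):
    """Return the LTI's augmentations in working memory after a retrieval.

    wm_augs: set of (attribute, value) pairs already on the LTI in working
             memory, or None if the LTI is not in working memory.
    retrieved_augs: set of (attribute, value) pairs the LTI had in the
             retrieved episode.
    merge: "none" (the default) or "add".
    """
    if merge not in ("none", "add"):
        raise ValueError("merge must be 'none' or 'add'")
    # LTI absent or without augmentations: the parameter has no effect.
    # ASSUMPTION: the retrieved augmentations are installed as-is.
    if not wm_augs:
        return set(retrieved_augs)
    if merge == "none":
        # Existing augmentations are left alone; nothing is added.
        return set(wm_augs)
    # merge == "add": retrieved augmentations are added to the existing ones.
    return set(wm_augs) | set(retrieved_augs)
```

With an LTI that already carries ("color", "red") and a retrieved episode containing ("size", "big"), the default leaves only the color augmentation, while merge="add" yields both.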
171. in the vision system, which would later be reported on the input-link. Input and output are viewed from Soar's perspective. An input function adds or deletes augmentations of the input-link, providing Soar with information about some occurrence external to Soar. An output function responds to substructure of the output-link produced by production firings, and causes some occurrence external to Soar. Input and output occur through the io attribute of the top-level state exclusively. Structures placed on the input-link by an input function remain there until removed by an input function. During this time, the structure continues to provide support for any production that has matched against it. The structure does not cause the production to rematch and fire again on each cycle as long as it remains in working memory; to get the production to refire, the structure must be removed and added again.

3.5.3 Input and output in production memory

Productions involved in input will test for specific attributes and values on the input-link, while productions involved in output will create preferences for specific attributes and values on the output-link. For example, a simplified production that responds to the vision input for the blocks task might look like this:

sp {blocks-world*elaborate*input
   (state <s> ^io.input-link <in>)
   (<in> ^block <ib1>)
   (<ib1> ^x-location <x1>
172. ing Soar's Runtime Parameters
  8.5 File System I/O Commands
  8.6 Soar I/O Commands
  8.7 Miscellaneous
Appendices
A The Blocks-World Program
B Grammars for production syntax
  B.1 Grammar of Soar productions
    B.1.1 Grammar for Condition Side
    B.1.2 Grammar for Action Side
C The Calculation of O-Support
D The Resolution of Operator Preferences
E A Goal Dependency Set Primer
Index
Summary of Soar Aliases, Variables, and Functions

List of Figures
  2.1 Soar is continually trying to select and apply operators.
  2.2 The initial state and goal of the blocks-world task.
  2.3 An abstract illustration of the initial state of the blocks world as working memory objects. At this stage of problem solving, no operators have been proposed or selected.
  2.4 An abstract illustration of working memory in the blocks world after the first operator has been selected.
  2.5 The six operators proposed for the initial state of the blocks world each move one block to a new location.
  2.6 The p
173. ing input wmes or context/impasse wmes may have unexpected side effects. You've been warned.

See Also

add-wme

replay-input

Load input wmes for each decision cycle from a file.

Synopsis

replay-input --open filename
replay-input --query
replay-input --close

Options

  filename     Open filename and load input and random seed.
  -o, --open   Reads captured input from file into memory and seeds the random number generator.
  -q, --query  Returns open if input replay is active or closed if not active.
  -c, --close  Stop replaying input.

Description

Replays input stored using the capture-input command. The replay file also includes a random number generator seed and seeds the generator with that.

See Also

capture-input

8.7 Miscellaneous

The specific commands described in this section are:

Summary

  alias            Define a new alias, or command, using existing commands and arguments.
  allocate         Allocate additional 32 kilobyte blocks of memory for a specified memory pool without running Soar.
  echo-commands    Set whether or not commands are echoed to other connected debuggers.
  edit-production  Fire event to move focus in an open editor to this production.
  load-library     Load a shared library into the local client.
  port             Returns the port the kernel instance is listening on.
  rand             Generate a random number.
  soarnews         Prints information about the current releas
174. ing predicate. There are six predicates that can be used: <>, <=>, <, <=, >=, >.

  Predicate  Semantics of Predicate
  <>         Not equal. Matches anything except the value immediately following it.
  <=>        Same type. Matches any symbol that is the same type (identifier, integer, floating-point, non-numeric constant) as the value immediately following it.
  <          Numerically less than the value immediately following it.
  <=         Numerically less than or equal to the value immediately following it.
  >=         Numerically greater than or equal to the value immediately following it.
  >          Numerically greater than the value immediately following it.

The following table shows examples of legal and illegal predicates:

  Legal predicates  Illegal predicates
  > <valuex>        > > <valuey>
  < 1               1 >
  <=> <y>           = 10

Example Production:

sp {propose-operator*to-show-example-predicate
   (state <s> ^car <c>)
   (<c> ^style convertible ^color <> rust)
-->
   (<s> ^operator <o>)
   (<o> ^name drive-car ^car <c>)}

In this production, there must be a color attribute for the working memory object that matches <c>, and the value of that attribute must not be rust.

3.3.5.4 Disjunctions of values

A test for an identifier, attribute, or value may also be for a
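The semantics of the six value predicates described above (not equal, same type, and the four numeric comparisons) can be modelled in a few lines. This is a sketch under stated assumptions rather than Soar's matcher: the Identifier class, soar_type and predicate_matches are hypothetical names, Python ints and floats stand in for Soar integers and floating-point numbers, and any other value is treated as a non-numeric constant.

```python
import operator


class Identifier(str):
    """Stand-in for a Soar identifier such as S1 (illustrative only)."""


def soar_type(value):
    """Classify a value into one of the four symbol types named in the text."""
    if isinstance(value, Identifier):
        return "identifier"
    if isinstance(value, bool):
        raise TypeError("booleans are not used as symbols in this sketch")
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "floating-point"
    return "constant"


# The four numeric predicates map directly onto comparison operators.
NUMERIC = {"<": operator.lt, "<=": operator.le, ">=": operator.ge, ">": operator.gt}


def predicate_matches(pred, wme_value, test_value):
    """Return True if wme_value satisfies `pred test_value`."""
    if pred == "<>":
        return wme_value != test_value
    if pred == "<=>":
        return soar_type(wme_value) == soar_type(test_value)
    if pred in NUMERIC:
        numeric = ("integer", "floating-point")
        if soar_type(wme_value) not in numeric or soar_type(test_value) not in numeric:
            return False  # numeric predicates never match non-numeric symbols
        return NUMERIC[pred](wme_value, test_value)
    raise ValueError(f"unknown predicate: {pred}")
```

In the example production, the color test corresponds to predicate_matches("<>", value, "rust"): it succeeds for a value of red and fails for a value of rust.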
175. instantiations

In the example production above, the names of the blocks are hardcoded, that is, they are named specifically. In Soar productions, variables are used so that a production can apply to a wider range of situations. The variables are bound to specific symbols in working memory elements by Soar's matching process. A production along with a specific and consistent set of variable bindings is called an instantiation. A production instantiation is consistent only if every occurrence of a variable is bound to the same value. Since the same production may match multiple times, each with different variable bindings, several instantiations of the same production may match at the same time and, therefore, fire at the same time. If blocks A and B are clear, the first production (without variables) will suggest one operator. However, if a production was created that used variables to test the names, this second production will be instantiated twice and therefore suggest two operators: one operator to move block A ontop of block B, and a second operator to move block B ontop of block A.

Because the identifiers of objects are determined at runtime, literal identifiers cannot appear in productions. Since identifiers occur in every working memory element, variables must be used to test for identifiers, and using the same variables across multiple occurrences is what links conditions together. Just as the elements of working memory must be linked
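The two-block case described above (blocks A and B both clear, one production with variables, two instantiations) can be enumerated directly. This is a toy model of variable binding, not Soar's matcher; the function propose_moves and the variable names <block1> and <block2> are used here purely for illustration.

```python
from itertools import permutations


def propose_moves(clear_blocks):
    """Enumerate one instantiation per consistent binding of two distinct blocks.

    Each instantiation is a dict from variable name to the block it is bound to.
    """
    return [
        {"<block1>": moving, "<block2>": destination}
        for moving, destination in permutations(clear_blocks, 2)
    ]


# With A and B clear, the same production is instantiated twice.
instantiations = propose_moves(["A", "B"])
```

The result holds one binding that moves A ontop of B and one that moves B ontop of A, matching the two operators described in the text. A hardcoded production corresponds to a single fixed binding.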
176. --internal is also specified, then a depth of 0 prints just the individual WME, while a depth of 1 prints all WMEs which share that same identifier. This is true when printing timetags, identifiers, or WME patterns. When the depth is greater than 1, the identifier links from the specified WMEs will be followed, so that additional substructure is printed. For example, a depth of 2 means that the object specified by the identifier, wme pattern, or timetag will be printed, along with all other objects whose identifiers appear as values of the first object. This may result in multiple copies of the same object being printed out. If --internal is also specified, then individual WMEs and their timetags will be printed instead of the full objects.

See Also

print

gds-print

Print the WMEs in the goal dependency set for each goal.

Synopsis

gds-print

Default Aliases

  gds_print  gds-print

Description

The Goal Dependency Set (GDS) is described in an appendix of the Soar manual. This command is a debugging command for examining the GDS for each goal in the stack. First it steps through all the working memory elements in the rete, looking for any that are included in any goal dependency set, and prints each one. Then it also lists each goal in the stack and prints the wmes in the goal dependency set for that particular goal. This command is useful when trying to determine why subgoals are disappearing unex
177. ion

Description

The sp command creates a new production and loads it into production memory. production_body is a single argument parsed by the Soar kernel, so it should be enclosed in curly braces to avoid being parsed by other scripting languages that might be in the same process. The overall syntax of a rule is as follows:

  name
  ["documentation-string"]
  [FLAG*]
  LHS
  -->
  RHS

The first element of a rule is its name. Conventions for names are given in the Soar User's Manual. If given, the documentation-string must be enclosed in double quotes. Optional flags define the type of rule and the form of support its right-hand side assertions will receive. The specific flags are listed in a separate section below. The LHS defines the left-hand side of the production and specifies the conditions under which the rule can be fired. Its syntax is given in detail in a subsequent section. The --> symbol serves to separate the LHS and RHS portions. The RHS defines the right-hand side of the production and specifies the assertions to be made and the actions to be performed when the rule fires. The syntax of the allowable right-hand side actions is given in a later section. The Soar User's Manual gives an elaborate discussion of the design and coding of productions. Please see that reference for tutorial information about productions. If the name of the new production is the same as an existing one, the old production will be overwritten (excis
178. ion

Save agent commands issued from the input cycle function in a file for reloading later. This command may be used to start and stop the recording of input function commands as created by an external simulation. Commands are recorded decision cycle by decision cycle. Use the command replay-input to replay the sequence. Note that this command seeds the random number generator and writes the seed to the capture file.

See Also

replay-input

remove-wme

Manually remove an element from working memory.

Synopsis

remove-wme timetag

Default Aliases

  rw  remove-wme

Options

  timetag  A positive integer matching the timetag of an existing working memory element.

Description

The remove-wme command removes the working memory element with the given timetag. This command is provided primarily for use in Soar input functions; although there is no programming enforcement, remove-wme should only be called from registered input functions to delete working memory elements on Soar's input link. Beware of weird side effects, including system crashes.

Warnings

remove-wme should never be called from the RHS: if you try to match a wme on the LHS of a production, and then remove the matched wme on the RHS, Soar will crash. If used other than by input and output functions interfaced with Soar, this command may have weird side effects (possibly even including system crashes). Remov
179. ion, just list the partial match counts.
  -t, --timetags     Also print the timetags of the wmes at the first failing condition.
  -w, --wmes         Also print the full wmes, not just the timetags, at the first failing condition.
  -a, --assertions   List only productions about to fire.
  -r, --retractions  List only productions about to retract.

Description

The matches command prints a list of productions that have instantiations in the match set, i.e., those productions that will retract or fire in the next Propose or Apply phase. It also will print partial match information for a single, named production.

Printing the match set

When printing the match set (i.e., no production name is specified), the default action prints only the names of the productions which are about to fire or retract. If there are multiple instantiations of a production, the total number of instantiations of that production is printed after the production name, unless --timetags or --wmes are specified, in which case each instantiation is printed on a separate line. When printing the match set, the --assertions and --retractions arguments may be specified to restrict the output to print only the assertions or retractions.

Printing partial matches for productions

In addition to printing the current match set, the matches command can be used to print information about partial matches for a named production. In this case, the cond
180. ion cycle, and are processed in response to changes to specific output structures in working memory. An output function is called only if changes have been made to the output-link structures in working memory. The structures for manipulating input and output in Soar are linked to a predefined attribute of the top-level state, called the io attribute. The io attribute has substructure to represent sensor inputs from the environment, called input links; because these are represented in working memory, Soar productions can match against input links to respond to an external situation. Likewise, the io attribute has substructure to represent motor commands, called output links. Functions that execute motor commands in the environment use the values on the output links to determine when and how they should execute an action. Generally, input functions create and remove elements on the input link to update Soar's perception of the environment. Output functions respond to values of working memory elements that appear on Soar's output link structure.

3.5.2 Input and output in working memory

All input and output is represented in working memory as substructure of the io attribute of the top-level state. By default, the architecture creates an input-link attribute of the io object and an output-link attribute of the io object. The values of the input-link and output-link attributes are identifiers whose augmentations are the complete set of input and output wo
181. ional tests (described in later sections) or by multiple occurrences in conditions. If a variable occurs more than once in the condition of a production, the production will match only if the variables match the same identifier or constant. However, there is no restriction that prevents different variables from binding to the same identifier or constant. Because identifiers are generated by Soar at run time, it is impossible to include tests for specific identifiers in conditions. Therefore, variables are used in conditions whenever an identifier is to be matched. Variables also provide a mechanism for passing identifiers and constants which match in conditions to the action side of a rule. Syntactically, a variable is a symbol that begins with a left angle-bracket (i.e., <), ends with a right angle-bracket (i.e., >), and contains at least one alphanumeric symbol in between. In the example production in Figure 3.2, there are seven variables: <s>, <clear1>, <clear2>, <ontop>, <block1>, <block2>, and <o>. The following table gives examples of legal and illegal variable names.

  Legal variables  Illegal variables
  <s>              <>
  <1>              <1
  <variable1>      variable>
  <abc1>           <a b>

3.3.5.3 Predicates for values

A test for an identifier, attribute, or value in a condition (whether constant or variable) can be modified by a preced
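The syntactic rule for variables stated above (starts with a left angle-bracket, ends with a right angle-bracket, at least one alphanumeric symbol in between) can be checked with a regular expression. This is a minimal sketch: the name is_legal_variable is hypothetical, and the exclusion of whitespace and nested angle brackets inside the name is an assumption inferred from the illegal example containing a space, not a rule quoted from the manual.

```python
import re

# "<", then any run of non-space, non-bracket characters containing at least
# one ASCII letter or digit, then ">".
# ASSUMPTION: whitespace and extra angle brackets are not allowed inside.
VARIABLE_RE = re.compile(r"<[^\s<>]*[A-Za-z0-9][^\s<>]*>")


def is_legal_variable(token: str) -> bool:
    """Return True if token looks like a legal Soar variable under this sketch."""
    return VARIABLE_RE.fullmatch(token) is not None


legal = ["<s>", "<1>", "<variable1>", "<abc1>"]
illegal = ["<>", "<1", "variable>", "<a b>"]
```

Every entry in the legal column of the table passes the check and every entry in the illegal column fails it.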
182. ions (such as in a conjunctive negation), a dummy variable is used that will later become a variable in a chunk. If the identifier used to instantiate a negated condition's identifier field is linked to the super-state, then the instantiated negated condition is added to the trace as a negated condition. In all other cases, the negated condition is ignored because the system cannot determine why a working memory element was not produced in the subgoal and thus allowed the production to fire. Ignoring these negations of conditions internal to the subgoal may lead to overgeneralization in chunking (see Section 4.6 on page 78).

4.2.3 Determining a chunk's conditions

The conditions of a chunk are determined by a dependency analysis of production traces, a process called backtracing. For each instantiated production that creates a subgoal result, backtracing examines the production trace to determine which working memory elements were matched. If a matched working memory element is linked to a superstate, it is included in the chunk's conditions. If it is not linked to a superstate, then backtracing recursively examines the trace of the production that created the working memory element. Thus, backtracing begins with a subgoal result, traces backwards through all working memory elements that were used to produce that result, and collects all of the working memory elements that are linked to a superstate. This method ign
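The backtracing procedure described above can be sketched as a recursive walk over production traces. This is a simplified model, not Soar's chunking code: the function name backtrace, the representation of WMEs as plain strings, and the created_by dictionary standing in for production traces are all hypothetical, and negated conditions are left out entirely.

```python
def backtrace(result, created_by, superstate_linked):
    """Collect the superstate-linked WMEs that a subgoal result depends on.

    result: the WME returned as a subgoal result.
    created_by: dict mapping a WME to the list of WMEs matched by the
        production instantiation that created it (its trace).
    superstate_linked: set of WMEs that are linked to a superstate.
    """
    conditions = set()
    visited = set()

    def visit(wme):
        if wme in visited:
            return  # already traced; also guards against cyclic traces
        visited.add(wme)
        for matched in created_by.get(wme, []):
            if matched in superstate_linked:
                conditions.add(matched)  # becomes a chunk condition
            else:
                visit(matched)  # local WME: trace the production that made it

    visit(result)
    return conditions


# Toy trace: result R1 was built from local WME L1 and superstate WME S1, etc.
traces = {"R1": ["L1", "S1"], "L1": ["S2", "L2"], "L2": ["S3"]}
chunk_conditions = backtrace("R1", traces, {"S1", "S2", "S3", "S4"})
```

In the toy trace the collected conditions are S1, S2 and S3. S4 is linked to the superstate but was never matched on the path to the result, so it is not included.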
183. ire in parallel (and all retractions occur in parallel), and matching and firing continues until there are no more additional complete matches or retractions of productions (quiescence).
  3. Decision: A new operator is selected, or an impasse is detected and a new state is created.
  4. Application: Productions fire to apply the operator (operator application). The actions of these productions will be O-supported. Because of changes from operator application productions, other productions with I-supported actions may also match or retract. Just as during proposal, productions fire and retract in parallel until quiescence.
  5. Output: Output commands are sent to the external environment.

The cycles continue until the halt action is issued from the Soar program (as the action of a production) or until Soar is interrupted by the user.

[Figure: the decision cycle, shown as a sequence of decisions, each made up of an elaboration phase that runs to quiescence and a decision phase. An elaboration cycle consists of a preference phase and a working memory phase.]
184. is applied recursively, so that all items that become unlinked are removed. The reject should be used with an action that will be o-supported. If reject is attempted with I-support, the working memory element will reappear if the reject loses I-support and the element still has support.

3.3.6.4 The syntax of preferences

Below are the eleven types of preferences as they can appear in the actions of a production for the selection of operators:

  RHS preferences                  Semantics
  (id ^operator value)             acceptable
  (id ^operator value +)           acceptable
  (id ^operator value !)           require
  (id ^operator value ~)           prohibit
  (id ^operator value -)           reject
  (id ^operator value > value2)    better
  (id ^operator value < value2)    worse
  (id ^operator value >)           best
  (id ^operator value <)           worst
  (id ^operator value =)           unary indifferent
  (id ^operator value = value2)    binary indifferent
  (id ^operator value = number)    numeric indifferent

The identifier and value will always be variables, such as (<s1> ^operator <o1> > <o2>). The preference notation appears similar to the predicate tests that appear on the left-hand side of productions, but has very different meaning. Predicates cannot be used on the right-hand side of a production, and you cannot restrict the bindings of variables on the right-hand side of a production. (Such restrictions can happen only i
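The preference table above distinguishes unary forms (a symbol alone) from binary forms (a symbol followed by a second value or a number). A short classifier makes that distinction explicit. It is an illustrative sketch using Soar's standard preference symbols (+ ! ~ - > < =); the function classify_preference and its argument names are hypothetical, and a missing symbol is treated as acceptable, as in the first row of the table.

```python
# Unary forms: the symbol stands alone after the value.
UNARY = {
    "+": "acceptable",
    "!": "require",
    "~": "prohibit",
    "-": "reject",
    ">": "best",
    "<": "worst",
    "=": "unary indifferent",
}

# Binary forms: the symbol is followed by a second operator value.
BINARY = {">": "better", "<": "worse"}


def classify_preference(symbol="+", second=None):
    """Name the preference written as (id ^operator value symbol [second])."""
    if second is None:
        if symbol not in UNARY:
            raise ValueError(f"unknown preference symbol: {symbol}")
        return UNARY[symbol]
    if symbol == "=":
        # A number after "=" is numeric indifferent; a value is binary indifferent.
        is_number = isinstance(second, (int, float)) and not isinstance(second, bool)
        return "numeric indifferent" if is_number else "binary indifferent"
    if symbol not in BINARY:
        raise ValueError(f"symbol {symbol} does not take a second value")
    return BINARY[symbol]
```

For the example in the text, classify_preference(">", "<o2>") names a better preference, while classify_preference(">") names a best preference.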
185. is different from the block moved to
#   The block moved must be type block
#   The block moved must not already be ontop the block being moved to
# The actions:
#   1. create an acceptable preference for an operator
#   2. create acceptable preferences for the substructure of the operator (its name, its moving-block, and the destination)

sp {blocks-world*propose*move-block
   (state <s> ^problem-space blocks
              ^thing <thing1> {<> <thing1> <thing2>}
              ^ontop <ontop>)
   (<thing1> ^type block ^clear yes)
   (<thing2> ^clear yes)
   (<ontop> ^top-block <thing1>
            ^bottom-block <> <thing2>)
-->
   (<s> ^operator <o> +)
   (<o> ^name move-block
        ^moving-block <thing1>
        ^destination <thing2>)}

################################################################
# Make all acceptable move-block operators also indifferent
# The conditions establish that:
#   1. the state has an acceptable preference for an operator
#   2. the operator is named move-block
# The actions:
#   1. create an indifferent preference for the operator

sp {blocks-world*compare*move-block*indifferent
   (state <s> ^operator <o> +)
   (<o> ^name move-block)
-->
   (<s> ^operator <o> =)}

################################################################
186. ists of individual actions that can:
  • Add new elements to working memory.
  • Remove elements from working memory.
  • Create preferences.
  • Perform other actions.

When the conditions of a production match working memory, the production is said to be instantiated, and the production will fire during the next elaboration cycle. Firing the production involves performing the actions using the same variable bindings that formed the instantiation.

3.3.6.1 Variables in Actions

Variables can be used in actions. A variable that appeared in the condition side will be replaced with the value that it was bound to in the condition. A variable that appears only in the action side will be bound to a new identifier that begins with the first letter of that variable (e.g., <o> might be bound to O234). This symbol is guaranteed to be unique and it will be used for all occurrences of the variable in the action side, appearing in all working memory elements and preferences that are created by the production action.

3.3.6.2 Creating Working Memory Elements

An element is created in working memory by specifying it as an action. Multiple augmentations of an object can be combined into a single action, using the same syntax as in conditions, including path notation and multi-valued attributes.

-->
(<s> ^block.color red
     ^thing <t1> <t2>)

The action above is expanded to be:

-->
(<s> ^block <b>)
(<b>
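The expansion of path notation and multi-valued attributes in actions, as described above, can be sketched as a function that turns one compact action into single-attribute actions. This is an illustrative model only: the function expand_action, the tuple representation of actions, and the naming scheme for intermediate variables (first letter of the attribute plus a counter) are hypothetical and do not reproduce the names Soar generates.

```python
from itertools import count


def expand_action(identifier, path, values, fresh=None):
    """Expand (identifier ^a.b v1 v2 ...) into (id, attribute, value) triples.

    Each dot in the path introduces an intermediate variable; each value
    produces its own triple on the final attribute.
    """
    if fresh is None:
        fresh = count(1)  # counter used to make intermediate names distinct
    attrs = path.split(".")
    triples = []
    current = identifier
    for attr in attrs[:-1]:
        variable = f"<{attr[0]}{next(fresh)}>"  # hypothetical naming scheme
        triples.append((current, attr, variable))
        current = variable
    for value in values:
        triples.append((current, attrs[-1], value))
    return triples


color_action = expand_action("<s>", "block.color", ["red"])
thing_action = expand_action("<s>", "thing", ["<t1>", "<t2>"])
```

Here color_action contains a triple linking <s> to an intermediate block variable and a second triple giving that variable the color red, while thing_action contains one triple per value of the multi-valued thing attribute.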
187. ition -(<p1> ^name john ^type father ^spouse <p2>) would match only if there is no object in working memory that matches all three attribute-value tests.

Example Production:

sp {default*evaluate-object
   (state <ss> ^operator <so>)
   (<so> ^type evaluation
         ^superproblem-space <p>)
  -(<p> ^default-state-copy no)
-->
   (<so> ^default-state-copy yes)}

Notes: One use of negated conditions to avoid is testing for the absence of the working memory element that a production creates with I-support; this would lead to an infinite loop in your Soar program, as Soar would repeatedly fire and retract the production.

3.3.5.7 Negated conjunctions of conditions

Conditions can be grouped into conjunctive sets by surrounding the set of conditions with { and }. The production compiler groups the tests in these conditions together. This grouping allows for negated tests of more than one working memory element at a time. In the example below, the state is tested to ensure that it does not have an object on the table.

sp {blocks*negated-conjunction-example
   (state <s> ^name top-state)
  -{(<s> ^ontop <on>)
    (<on> ^bottom-object <bo>)
    (<bo> ^type table)}
-->
   (<s> ^nothing-ontop-table true)}

When using negated conjunctions of conditions, the production has nested curly braces. One set of curly braces delimi
188. itions of the production are listed, each preceded by the number of currently active matches for that condition. If a condition is negated, it is preceded by a minus sign. The pointer >>>> before a condition indicates that this is the first condition that failed to match. When printing partial matches, the default action is to print only the counts of the number of WMEs that match, and is a handy tool for determining which condition failed to match for a production that you thought should have fired. At levels --timetags and --wmes the matches command displays the WMEs immediately after the first condition that failed to match, temporarily interrupting the printing of the production conditions themselves.

Notes

When printing partial match information, some of the matches displayed by this command may have already fired, depending on when in the execution cycle this command is called. To check for the matches that are about to fire, use the matches command without a named production. In Soar 8, the execution cycle (decision cycle) is input, propose, decide, apply, output; it no longer stops for user input after the decision phase when running by decision cycles (run -d 1). If a user wishes to print the match set immediately after the decision phase and before the apply phase, then the user must run Soar by phases (run -p 1).

Examples

This example prints the productions which are about to fire and the wmes that match the productions
189. ively expensive in Soar. Thus, these should be enabled with caution and understanding of their limitations. First, they will affect performance, depending on the level (set via the timers parameter). A level of three, for instance, times every step in the cue-based retrieval candidate episode search. Furthermore, because these iterations are relatively cheap (typically a single step in the linked-list of a b-tree), timer values are typically unreliable (depending upon the system, resolution is 1 microsecond or more).

Chapter 8

The Soar User Interface

This chapter describes the set of user interface commands for Soar. All commands and examples are presented as if they are being entered at the Soar command prompt. This chapter is organized into 7 sections:
  1. Basic Commands for Running Soar
  2. Examining Memory
  3. Configuring Trace Information and Debugging
  4. Configuring Soar's Run-Time Parameters
  5. File System I/O Commands
  6. Soar I/O commands
  7. Miscellaneous Commands

Each section begins with a summary description of the commands covered in that section, including the role of the command and its importance to the user. Command syntax and usage are then described fully, in alphabetical order. The following pages were automatically generated from the wiki version at http://code.google.com/p/soar/wiki/CommandIndex on the date listed on the title page of this manual. Please consult the wiki directly f
190. k. These parameters trade off safety in the case of a program crash with database performance. When optimization is set to performance, the agent will have an exclusive lock on the database, meaning it cannot be opened concurrently by another SQLite process such as SQLiteMan. The lock can be relinquished by issuing epmem --close or shutting down the Soar kernel. The epmem --backup command can be used to make a copy of the current state of the database, whether in memory or on disk. This command will commit all outstanding changes before initiating the copy. When the path parameter is set to a non-empty value for the first time, the value of the database parameter is automatically changed to file.

The balance parameter sets the linear weight of match cardinality vs. cue activation. As a performance optimization, when the value is 1 (default), activation is not computed. If this value is not 1 (even close, such as 0.99), and working memory activation is enabled, this value will be computed for each leaf WME, which may incur a noticeable cost, depending upon the overall complexity of the retrieval.

The graph-match-ordering parameter sets the heuristic by which identifiers are ordered during graph match (assuming graph-match is on). The default, undefined, does not enforce any order and may be sufficient for small cues. For more complex cues, there will be a one-time sorting cost, during each retrieval, if the parameter value is changed. The currently ava
191. k per-cycle stats then print them out using default sort:

  stats --track
  run
  stop
  stats --cycle

Print out per-cycle stats sorting by decision cycle time:

  stats --cycle --sort 2

Print out per-cycle stats sorting by firing counts, descending:

  stats --cycle --sort -4

Save per-cycle stats to file stats.csv:

  ctf stats.csv stats --cycle-csv

See Also

timers, init-soar, command-to-file

verbose

Control detailed information printed as Soar runs.

Synopsis

verbose [-ed]

Options

  -d, --disable, --off  Turn verbosity off (default).
  -e, --enable, --on    Turn verbosity on.

Description

The verbose command enables tracing of a number of low-level Soar execution details during a run. The details printed by verbose are usually only valuable to developers debugging Soar implementation details. Invoke with no arguments to query the current setting.

See Also

watch

warnings

Enable or disable the printing of warning messages from the Soar kernel.

Synopsis

warnings [options]

Options

  -e, --enable, --on    Default. Print all warning messages from the kernel.
  -d, --disable, --off  Disable all, except most critical, warning messages.

Description

Enables and disables the printing of warning messages. If an argument is specified, then the warnings are set to that state. If no argument is given t
192. knowledge level but will necessarily always fall short. We can informally think of the way in which Soar falls short of the knowledge level as its peculiar psychology. Those interested in using Soar to model human psychology would like Soar's psychology to approximate human psychology. Those using Soar to create agent systems would like to make Soar's processing approximate the knowledge level as closely as possible. However, Soar 7 had a number of symbol-level quirks that appeared inconsistent with human psychology and that made building large-scale, knowledge-based systems in Soar more difficult than necessary. Bob Wray's thesis addressed many of these symbol-level problems in Soar, among them logical inconsistency in symbol manipulations, non-contemporaneous constraints in chunks, race conditions in rule firings and in the decision process, and contention between original task knowledge and learned knowledge.

The Goal Dependency Set implements a solution to logical inconsistencies between persistent (o-supported) working memory elements (WMEs) in a substate and its context. The context consists of all the WMEs in any superstates above the local goal or state. In Soar, any action (application) of an operator receives an o-support preference. This preference

[1] A preliminary draft by Robert Wray, contact at wrayre@acm.org. Robert E. Wray. Ensuring Reasoning Consistency in Hierarchical Architectures. PhD thesis
193. l arise during the decision phase conflict is returned If predict determines a constraint failure will occur constraint is returned Otherwise predict will return the id of the operator to be chosen If operator selection will require probabilistic selection and no alterations to the probabilities are made between the call to predict and decision phase predict will manipulate the random number generator to enforce its prediction See Also select rl Control how numeric indifferent preference values in RL rules are updated via reinforcement learning Synopsis rl g get <parameter> rl s set <parameter> <value> rl S stats <statistic> Options g get Print current parameter setting s set Set parameter value S stats Print statistic summary or specific statistic Description The rl command sets parameters and displays information related to reinforcement learning The print and watch commands display additional RL related information not covered by this command Parameters Due to the large number of parameters the rl command uses the get set <parameter> <value> convention rather than individual switches for each parameter Running rl without any switches displays a summary of the parameter settings apoptosis Automatic excising
194. le Description The max chunks command is used to limit the maximum number of chunks that may be created during a decision cycle The initial value of this variable is 50 allowable settings are any integer greater than 0 The chunking process will end after max chunks chunks have been created even if there are more results that have not been backtraced through to create chunks and Soar will proceed to the next phase A warning message is printed to notify the user that the limit has been reached This limit is included in Soar to prevent getting stuck in an infinite loop during the chunking process This could conceivably happen because newly built chunks may match immediately and are fired immediately when this happens this can in turn lead to additional chunks being formed etc If you see this warning something is seriously wrong Soar is unable to guarantee consistency of its internal structures You should not continue execution of the Soar program in this situation stop and determine whether your program needs to build more chunks or whether you ve discovered a bug in your program or in Soar itself max dc time Set a wall clock time limit such that the agent will be interrupted when a single decision cycle exceeds this limit Synopsis max dc time seconds n max dc time d Options n Maximum decision cycle time in microseconds d disable Disable this in
195. lection and application of operators is illustrated in Figure 2 1 [Figure 2 1 Soar is continually trying to select and apply operators] Soar has separate memories and different representations for descriptions of its current situation and its long term knowledge In Soar the current situation including data from sensors results of intermediate inferences active goals and active operators is held in working memory Working memory is organized as objects Objects are described in terms of their attributes the values of the attributes may correspond to sub objects so the description of the state can have a hierarchical organization This need not be a strict hierarchy for example there s nothing to prevent two objects from being substructure of each other The long term knowledge which specifies how to respond to different situations in working memory can be thought of as the program for Soar The Soar architecture cannot solve any problems without the addition of long term knowledge Note the distinction between the Soar architecture and the Soar program The former refers to the system described in this manual common to all users and the latter refers to knowledge added to the architecture A Soar program contains the knowledge to be used for solving a specific task or set of tasks including information
196. long with its augmentation property foo becomes a result and a chunk is formed Now if a rule in the subgoal adds another augmentation to the thing identifier property bar say that augmentation will also be a result as it is linked to an identifier which is linked to a superstate However if that rule matches the identifier through the substate the chunking process cannot determine how it is linked to the superstate and a chunk cannot be created Solution If the substructure of a result must be revised the rules that modify it should match the result through the superstate not through the local state Chapter 5 Reinforcement Learning Soar has a reinforcement learning RL mechanism that tunes operator selection knowledge based on a given reward function This chapter describes the RL mechanism and how it is integrated with production memory the decision cycle and the state stack We assume that the reader is familiar with basic reinforcement learning concepts and notation If not we recommend first reading Reinforcement Learning An Introduction 1998 by Richard S Sutton and Andrew G Barto The detailed behavior of the RL mechanism is determined by numerous parameters that can be controlled and configured via the rl command Please refer to the documentation for that command in section 8 4 on page 169 5 1 RL Rules Soar s RL mechanism learns Q values for state operator pairs Q values are stored as numeric indiff
197. m With bottom up learning chunks are learned only in states in which no subgoal has yet generated a chunk In this mode chunks are learned only for the bottom of the subgoal hierarchy and not the intermediate levels With experience the subgoals at the bottom will be replaced by the chunks allowing higher level subgoals to be chunked Similarly enable through local negations and disable through local negations are orthogonal to the rest of the learn options These options control whether or not chunks can be created that are derived from rules that check for negated WMEs on the substate local negations Chunking through local negations can result in overgeneral chunks but disabling this ability will reduce the number of chunks formed The default is to enable chunking through local negations If chunking through local negations is disabled to see when chunks are discarded and why set watch learning print see watch command Learning can be turned on or off at any point during a run Examples To enable learning only at the lowest subgoal level learn e b To see all the force learn and dont learn states registered by RHS actions learn l See Also watch explain backtraces save backtraces max chunks Limit the number of chunks created during a decision cycle Synopsis max chunks n Options n Maximum number of chunks allowed during a decision cyc
198. ment to working memory
alias | Define a new alias or command using existing commands and arguments
allocate | Allocate additional 32 kilobyte blocks of memory for a specified memory pool without running Soar
capture input | Store the input wmes in a file for reloading later
cd | Change directory
chunk name format | Specify format of the name to use for new chunks
clog | Record all user interface input and output to a file
command to file | Dump the printed output and results of a command to a file
default wme depth | Set the level of detail used to print WMEs
dirs | List the directory stack
echo commands | Set whether or not commands are echoed to other connected debuggers
echo | Print a string to the current output device
edit production | Move focus in an editor to this production
epmem | Control the behavior of episodic memory
excise | Delete Soar productions from production memory
explain backtraces | Print information about chunk and justification backtraces
firing counts | Print the number of times each production has fired
gds print | Print the WMEs in the goal dependency set for each goal
gp max | Set the upper limit to the number of productions generated by the gp command
gp | Generate productions according to a specified pattern
help | Provide formatted usage information about Soar commands
indifferent selection | Controls indifferent prefe
199. mmand has been issued stores Stores Number of times the store command has been issued Timers Semantic memory also has a set of internal timers that record the durations of certain operations Because fine grained timing can incur runtime costs semantic memory timers are off by default Timers of different levels of detail can be turned on by issuing smem set timers <level> where the levels can be off one two or three three being most detailed and resulting in all timers being turned on Note that none of the semantic memory statistics nor timing information is reported by the stats command All timer values are reported in seconds Level one total Total smem operations Level two smem_api Agent command validation smem_hash Hashing symbols smem_init Semantic store initialization smem_ncb_retrieval Adding concepts and children to working memory smem_query Cue based queries smem_storage Concept storage Level three three_activation Recency information maintenance Manual Storage Concepts can be manually added to the semantic store using the smem add <concept> command The format for specifying the concept is similar to that of adding WMEs to working memory on the RHS of productions For example smem add <arithmetic> add10 facts <a01> <a02> <a03> <a01
200. model Note that we do implement the Petrov 2006 approximation with a history size set as a compile time parameter default 10 The base update policy sets the frequency with which activation is recomputed The default stable only recomputes activation when a memory is referenced through storage or retrieval The naive setting will update the entire candidate set of memories defined as those that match the most constraining cue WME during a retrieval which has severe performance detriment and should be used for experimentation or those agents that require high fidelity retrievals The incremental policy updates a constant number of memories those with last access ages defined by the base incremental threshes set The smem backup command can be used to make a copy of the current state of the database whether in memory or on disk This command will commit all outstanding changes before initiating the copy The merge parameter controls how the augmentations of retrieved long term identifiers LTIs interact with an existing LTI in working memory If the LTI is not in working memory or has no augmentations in working memory this parameter has no effect If the LTI is in working memory and has augmentations then by default add semantic memory will add to working memory any augmentations of the retrieved LTI If the parameter is set to none then semantic memory will not augment the LTI Note this i
201. mory The size of each page however may be important whether databases are disk or memory based This setting can have far reaching consequences such as index B tree depth While this setting can be dependent upon a particular situation a good heuristic is that short simple runs should use small values of the page size 1k 2k 4k whereas longer more complicated runs will benefit from larger values 8k 16k 32k 64k One known situation of concern is that as indexed tables accumulate many rows millions insertion time of new rows can suffer an infrequent but linearly increasing burst of computation In episodic memory this situation will typically arise with many episodes and or many working memory changes Increasing the page size will reduce the intensity of the spikes at the cost of increasing disk I O and average total time for episode storage Thus the setting of page size for long complicated runs establishes the desired balance of reactivity i e max computation and average speed To ground this discussion Figure 7 1 depicts maximum and average episodic storage time the value of the epmem_storage timer converted to milliseconds with different page sizes after 10 million decisions 1 episode decision of a very basic agent i e very few working memory changes per episode running on a 2 8GHz Core i7 with Mac OS X 10 6 5 While only a single use case the cross point of these data forms the basis for the decision to default
202. mples To move to the relative directory named home soar agents cd home soar agents See Also dirs ls pushd popd source pwd clog Record all user interface input and output to a file Synopsis clog Ae filename clog a string clog cdoq Options filename Open filename and begin logging c close o off d disable Stop logging close the file a add string Add the given string to the open log file q query Returns open if logging is active or closed if logging is not active A append e existing Opens existing log file named filename and logging is added at the end of the file Description The clog command allows users to save all user interface input and output to a file When Soar is logging to a file everything typed by the user and everything printed by Soar is written to the file in addition to the screen Invoke clog with no arguments or with q to query the current logging status Pass a filename to start logging to that file relative to the command line interface s home directory see the home command Use the close option to stop logging Examples To initiate logging and place the record in foo log clog foo log To append log data to an existing foo log file clog A foo log To terminate logging and close the open log file clog c Known Issues Does not log everything when structu
203. n impasse from arising Chunks are very similar to justifications in that they are both formed via the backtracing process and both create a result in their actions However there are some important distinctions 1 Chunks are productions and are added to production memory Justifications do not appear in production memory 2 Justifications disappear as soon as the working memory element or preference they provide support for is removed 3 Chunks contain variables so that they may match working memory in other situations justifications are similar to an instantiated chunk 2 8 Input and Output Many Soar users will want their programs to interact with a real or simulated environment For example Soar programs may control a robot receiving sensory inputs and sending command outputs Soar programs may also interact with simulated environments such as a flight simulator Input is viewed as Soar s perception and output is viewed as Soar s motor abilities When Soar interacts with an external environment it must make use of mechanisms that allow it to receive input from that environment and to effect changes in that environment the mechanisms provided in Soar are called input functions and output functions Input functions add and delete elements from working memory in response to changes in the external environment Output functions attempt to effect changes in the external environment Input is proce
204. n printing objects matching a specific pattern timetag print the object in working memory with the given timetag Printing the current subgoal stack s stack Specifies that the Soar goal stack should be printed By default this includes both states and operators o operators When printing the stack print only operators S states When printing the stack print only states Description The print command is used to print items from production memory or working memory It can take several kinds of arguments When printing items from working memory the Soar objects are printed unless the internal flag is used in which case the wmes themselves are printed identifier attribute value activation The activation value is only printed if activation is turned on See wma The pattern is surrounded by parentheses The identifier attribute and value must be valid Soar symbols or the wildcard symbol which matches all occurrences The optional symbol restricts pattern matches to acceptable preferences If wildcards are included an object will be printed for each pattern match even if this results in the same object being printed multiple times Examples Print the objects in working memory and their timetags which have wmes with identifier s1 and value v2 note this will print the entire s1 object for each match found print internal s1 v2
205. n production s This number is a function of the number of elements in working memory that match each production Therefore this command will not provide useful information at the beginning of a Soar run when working memory is empty and should be called in the middle or at the end of a Soar run The memories command is used to find the productions that are using the most memory and therefore may be taking the longest time to match this is only a heuristic By identifying these productions you may be able to rewrite your program so that it will run more quickly Note that memory usage is just a heuristic measure of the match time A production might not use much memory relative to others but may still be time consuming to match and excising a production that uses a large number of tokens may not speed up your program because the Rete matcher shares common structure among different productions As a rule of thumb numbers less than 100 mean that the production is using a small amount of memory numbers above 1000 mean that the production is using a large amount of memory and numbers above 10 000 mean that the production is using a very large amount of memory See Also matches preferences Examine details about the preferences that support the specified identifier and attribute Synopsis preferences options identifier attribute Default Aliases pref preferences
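The rule of thumb given for the memories command can be written down as a small helper. This is only an illustrative sketch in Python, not part of Soar: the function name `classify_token_count` is hypothetical, and the label for counts between 100 and 1000 is an assumption because the manual text does not name that range.

```python
def classify_token_count(tokens: int) -> str:
    """Map a per-production token count to the manual's rough memory bands."""
    if tokens < 100:
        return "small"
    if tokens > 10_000:
        return "very large"
    if tokens > 1000:
        return "large"
    # Assumption: the manual does not classify counts from 100 to 1000.
    return "unclassified"


if __name__ == "__main__":
    for count in (42, 500, 2500, 50_000):
        print(count, classify_token_count(count))
```

As the surrounding text stresses, these bands are a heuristic for match cost only; a production in the small band can still be expensive to match.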
206. n the conditions Also notice that the + symbol is optional when specifying acceptable preferences in the actions of a production although using this symbol will make the semantics of your productions clearer in many instances The + symbol will always appear when you inspect preference memory with the preferences command Productions are never needed to delete preferences because preferences will be retracted when the production no longer matches Preferences should never be created by operator application rules and they should always be created by rules that will give only I support to their actions 3 3 6 5 Shorthand notations for preference creation There are a few shorthand notations allowed for the creation of operator preferences on the right hand side of productions Acceptable preferences do not need to be specified with a + symbol <s> operator <op1> is assumed to mean <s> operator <op1> + Ambiguity can easily arise when using a preference that can be either binary or unary > < = The default assumption is that if a value follows the preference then the preference is binary It will be unary if a caret up arrow a closing parenthesis another preference or a comma follows it Below are four examples of legal although unrealistic actions that have the same effect <s> operator <o1> <o2> <o2> < <o1> <o3> <o4> <s> ope
207. n the processing in the impasse state Consider the operator trace in Figure 5 1 [Figure 5 1 Example Soar substate operator trace] At decision cycle 1 RL operator O1 is selected in S1 and causes an operator no change impasse for three decision cycles In the substate S2 operators O2 O3 and O4 are selected and applied sequentially Meanwhile in S1 reward values r2 r3 and r4 are put on the reward link sequentially Finally the impasse is resolved by O4 the proposal for O1 is retracted and RL operator O5 is selected in S1 In this scenario only the RL update for $Q(s_1, O_1)$ will be different from the ordinary case Its value depends on the setting of the hrl discount parameter of the rl command When this parameter is set to the default value on the rewards on S1 and the Q value of O5 are discounted by the number of decision cycles they are removed from the selection of O1 In this case the update for $Q(s_1, O_1)$ is $\delta_1 = \alpha \left[ r_2 + \gamma r_3 + \gamma^2 r_4 + \gamma^3 Q(s_5, O_5) - Q(s_1, O_1) \right]$ which is equivalent to having a three decision gap separating O1 and O5 When hrl discount is set to off the number of cycles O1 has been impassed will be ignored Thus the update would be $\delta_1 = \alpha \left[ r_2 + r_3 + r_4 + \gamma Q(s_5, O_5) - Q(s_1, O_1) \right]$ For impasses other than operator no change RL acts as if the impasse hadn t occurred If O1 is the last RL operator s
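The two updates described for the Figure 5 1 scenario differ only in how the rewards and the next Q value are discounted. The following Python sketch works through both cases numerically. It is an illustration under stated assumptions, not Soar's implementation: the function name `rl_update` and the sample numbers are hypothetical, and only the two formulas themselves come from the text.

```python
def rl_update(alpha, gamma, rewards, q_next, q_current, hrl_discount=True):
    """Return the update delta for Q(s1, O1) after an operator no-change impasse.

    rewards is the sequence r2, r3, r4 collected on S1 while O1 was impassed.
    """
    if hrl_discount:
        # Each reward is discounted by its distance in decision cycles from
        # the selection of O1, and Q(s5, O5) by the full length of the gap.
        reward_term = sum(gamma ** i * r for i, r in enumerate(rewards))
        bootstrap = gamma ** len(rewards) * q_next
    else:
        # The impassed cycles are ignored: rewards are summed undiscounted
        # and Q(s5, O5) is discounted as if O5 directly followed O1.
        reward_term = sum(rewards)
        bootstrap = gamma * q_next
    return alpha * (reward_term + bootstrap - q_current)


if __name__ == "__main__":
    # Hypothetical values chosen only to make the arithmetic easy to follow.
    args = dict(alpha=0.3, gamma=0.9, rewards=[1.0, 1.0, 1.0],
                q_next=2.0, q_current=0.5)
    print(rl_update(hrl_discount=True, **args))   # about 1.1004
    print(rl_update(hrl_discount=False, **args))  # about 1.29
```

With these numbers the discounted case gives 0.3 × (1 + 0.9 + 0.81 + 0.729 × 2 − 0.5) and the undiscounted case gives 0.3 × (3 + 0.9 × 2 − 0.5).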
208. ndition 1 is that operator augmentations should always receive i support Soar has been written to recognize augmentations directly off the operator i e <o> augmentation value and to attempt to give them i support However there was some confusion about what to do about a production that simultaneously tests an operator doesn t propose an operator adds an operator augmentation and adds a non operator augmentation such as sp operator augmentation application state <s> task test support operator <o> --> <o> new augmentation <s> new augmentation In o support mode 3 both RHS actions receive o support in o support mode 4 both receive i support In either case Soar will print a warning on firing this production because this is considered bad coding style Appendix D The Resolution of Operator Preferences During the decision phase operator preferences are evaluated in a sequence of eight steps in an effort to select a single operator Each step handles a specific type of preference as illustrated in Figure D 1 The figure should be read starting at the top where all the operator preferences are collected and passed into the procedure At each step the procedure either exits through an arrow to the right or passes to the next step through an arrow to the left Input to the procedure are the set of current operator preferences and the output consists of 1 a subs
209. nging to it Synopsis pushd directory Description Maintain a stack of working directories and push the directory on to the stack Can be relative path name or fully specified See Also cd dirs home ls popd source pwd pwd Print the current working directory Synopsis pwd Default Aliases topd pwd Description Prints the current working directory of Soar rete net Save the current Rete net or restore a previous one Synopsis rete net s l filename Default Aliases rn rete net Options s save Save the Rete net in the named file Cannot be saved when there are justifications present Use excise j l r load restore Load the named file into the Rete network working memory and production memory must both be empty Use excise filename The name of the file to save or load Description The rete net command saves the current Rete net to a file or restores a Rete net previously saved The Rete net is Soar s internal representation of production and working memory the conditions of productions are reordered and common substructures are shared across different productions This command provides a fast method of saving and loading productions since a special format is used and no parsing is necessary Rete net files are portable across platforms that support Soar
210. ning wma without any switches displays a summary of the parameter settings
Parameter | Description | Possible values | Default
activation | Enable working memory activation | on off | off
decay rate | WME decay factor | 0 to 1 | 0.5
decay thresh | Forgetting threshold | 0 to inf | 2.0
forgetting | Enable removal of WMEs with low activation values | on off | off
forget wme | If lti only remove WMEs with a long term id | all lti | all
max pow cache | Maximum size in MB for the internal pow cache | 1 2 ... | 10
petrov approx | Enables the Petrov 2006 long tail approximation | on off | off
timers | Timer granularity | off one | off
The decay rate and decay thresh parameters are entered as positive decimals but are internally converted to and printed out as negative The petrov approx may provide additional validity to the activation value but comes at a significant computational cost as the model includes unbounded positive exponential computations which cannot be reasonably cached When activation is enabled the system produces a cache of results of calls to the pow function as these can be expensive during runtime The size of the cache is based upon three run time parameters decay rate decay thresh and max pow cache and one compile time parameter WMA REFERENCES PER DECISION default value of 50 which estimates the maximum number of times a WME will be refer
211. nism is enabled 5 4 Automatic Generation of RL Rules The number of RL rules required for an agent to accurately approximate operator Q values is usually infeasibly large to write by hand even for small domains Therefore several methods exist to automate this 5 4 1 The gp Command The gp command can be used to generate productions based on simple patterns This is useful if the states and operators of the environment can be distinguished by a fixed number of dimensions with finite domains An example is a grid world where the states are described by integer row column coordinates and the available operators are to move north south east or west In this case a single gp command will generate all necessary RL rules gp {gen*rl*rules (state <s> ^name gridworld ^operator <o> + ^row [ 1 2 3 4 ] ^col [ 1 2 3 4 ]) (<o> ^name move ^direction [ north south east west ]) --> (<s> ^operator <o> = 0.0) } For more information see the documentation for this command on page 112 5 4 2 Rule Templates Rule templates allow Soar to dynamically generate new RL rules based on a predefined pattern as the agent encounters novel states This is useful when either the domains of environment dimensions are not known ahead of time or when the enumerable state space of the environment is too large to capture in its entirety using gp but the agent will only encounter a
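To make the size of the generated rule set concrete, the grid world pattern in the gp example can be enumerated outside Soar. The Python sketch below is an illustration only: it assumes that gp produces one production per combination of the listed values, and the function name `expand_pattern` and the dictionary layout are hypothetical stand-ins for the generated rules.

```python
from itertools import product


def expand_pattern(rows, cols, directions, initial_value=0.0):
    """Enumerate one rule description per (row, col, direction) combination."""
    return [
        {"row": r, "col": c, "direction": d, "value": initial_value}
        for r, c, d in product(rows, cols, directions)
    ]


if __name__ == "__main__":
    rules = expand_pattern(
        rows=[1, 2, 3, 4],
        cols=[1, 2, 3, 4],
        directions=["north", "south", "east", "west"],
    )
    print(len(rules))  # 4 rows x 4 columns x 4 directions = 64
    print(rules[0])
```

Even this small domain yields 64 rules, which illustrates why writing RL rules by hand quickly becomes impractical as dimensions are added.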
212. not the first result is included in the chunk for the second result depends on the links that were used to match the first result in the subgoal If the elements are linked to the superstate they are included as conditions If the elements are not linked to the superstate then the result is traced through In some cases there may be more than one set of links so it is possible for a result to be both backtraced through and included as a condition 4 3 Variablizing Identifiers Chunks are constructed by examining the traces which include working memory elements and operator preferences To achieve any useful generality in chunks identifiers of actual objects must be replaced by variables when the chunk is created otherwise chunks will only ever fire when the exact same objects are matched However a constant value is never variablized the actual value always appears directly in the chunk When a chunk is built all occurrences of the same identifier are replaced with the same variable This can lead to an overspecific chunk when two variables are forced to be the same in the chunk even though distinct variables in the original productions just happened to match the same identifier A chunk s conditions are also constrained by any not equal <> tests for pairs of identifiers used in the conditions of productions that are included in the chunk These tests are saved in the production traces and th
213. nt blocks world propose move block dont do this sp blocks world propose move block dont do this state <s> problem space blocks thing <thing2> thing <> <thing2> <thing1> ontop <o*1> ontop <o*2> <thing2> clear yes <thing1> clear yes type block <o*1> top block <thing1> <o*2> bottom block <> <thing2> <b*1> --> <s> operator <o> <o> name move block moving block <thing1> destination <thing2> Soar has expanded the production into the longer form and created two distinctive variables <o*1> and <o*2> to represent the ontop attribute These two variables will not necessarily bind to the same identifiers in working memory Negated multi valued attributes and attribute path notation Negations of multi valued attributes can be combined with attribute path notation However it is very easy to make mistakes when using negated multi valued attributes with attribute path notation Although it is possible to do it correctly we strongly discourage its use For example sp blocks negated conjunction example state <s> name top state ontop bottom object name table A --> <s> nothing ontop A or table true gets expanded to sp blocks negated conjunction example state <s> name top
214. ntation that signifies the current operator and is created based on preferences The specific attributes that the Soar architecture automatically creates are listed in Section 3 4 Productions may create any other attributes for states Preferences are held in a separate preference memory where they cannot be tested by productions however acceptable preferences are held in both preference memory and in working memory By making the acceptable preferences available in working memory the acceptable preferences can be tested for in productions allowing the candidate operators to be compared before they are selected 2 3 Production Memory Long term Knowledge Soar represents long term knowledge as productions that are stored in production memory illustrated in Figure 2 7 [Figure 2 7 An abstract view of production memory The productions are not related to one another] Each production has a set of conditions and a
215. of counting Each newly counted item replaces the old value of the count 3 Remembering Agents oftentimes need to remember an external situation or stimulus even when that perception is no longer available 4 Avoiding Expensive Computations In some situations an agent may have the information needed to assert some belief in a new world state but the expense of performing the computation necessary for the assertion given what is already known makes the computation avoidable For example in dynamic complex domains determining when to make an expensive calculation is often formulated as an explicit agent task When remembering or avoiding an expensive computation the agent designer is making a commitment to retain something even though it might not be supported in the current context In Soar 8 these WMEs should be asserted in the top state For many Soar systems especially those focused on execution in a dynamic environment most o supported elements will need to be stored on the top state For any kind of local non monotonic reasoning about the context counting projection planning features should be stored locally When a dependent context change occurs the GDS interrupts the processing by removing the state While this may seem like a severe over reaction formal and empirical analysis have suggested that this solution is less computationally expensive than attempting to identify the specific dependent assumption Operato
216. of productions via base level decay | none chunks rl chunks | none
apoptosis decay | Base level decay parameter | 0 to 1 | 0.5
apoptosis thresh | Base level threshold parameter negates supplied value | 0 to inf
chunk stop | If enabled chunking does not create duplicate RL rules that differ only in numeric indifferent preference value | on off | on
discount rate | Temporal discount gamma | 0 to 1 | 0.9
eligibility trace decay rate | Eligibility trace decay factor lambda | 0 to 1
eligibility trace tolerance | Smallest eligibility trace value not considered 0 | 0 to inf | 0.001
hrl discount | Discounting of RL updates over time in impassed states | on off | off
learning | Reinforcement learning enabled | on off | off
learning rate | Learning rate alpha | 0 to 1 | 0.3
learning policy | Value update policy | sarsa q learning | sarsa
temporal discount | Discount RL updates over gaps | on off | on
temporal extension | Propagation of RL updates over gaps | on off | on
Apoptosis is a process to automatically excise chunks via the base level decay model where rule firings are the activation events A value of chunks has this apply to any chunk whereas rl chunks means only chunks that are also RL rules can be forgotten
217. of the destination. This production is part of the application of a move-block operator.
The conditions establish that:
(1) An operator has been selected for the current state
(a) the operator is named move-block
(b) the operator has a moving-block and a destination
The actions:
(1) create an acceptable preference for a new ontop relation
(2) create acceptable preferences for the substructure of the ontop relation (the top block and the bottom block)
sp {blocks-world*apply*move-block*add-new-ontop (state <s> ^operator <o>) (<o> ^name move-block ^moving-block <block1> ^destination <block2>) --> (<s> ^ontop <ontop>) (<ontop> ^top-block <block1> ^bottom-block <block2>)}
Detect that the goal has been achieved.
The conditions establish that:
(1) The state has a problem space named blocks
(2) The state has three ontop relations
(a) a block named A is ontop a block named B
(b) a block named B is ontop a block named C
(c) a block named C is ontop a block named TABLE
The actions:
(1) print a message for the user that the A B C tower has been built
(2) halt Soar
sp {blocks-world*detect*goal (state <s> ^problem-space
218. on cycle When set to dc new episodes will be encoded every decision cycle The exclusions parameter can be used to prevent episodic memory from encoding parts of working memory into new episodes The value of exclusions is a list of string constants During encoding episodic memory will walk working memory starting from the top state identifier If it encounters a WME whose attribute is a member of the exclusions list episodic memory will ignore that WME and abort walking the children of that WME and they will not be included in the encoded episode Note that if the children of the excluded WME can be reached from top state via an alternative non excluded path they will still be included in the encoded episode The exclusions parameter behaves differently from other parameters in that issuing epmem set exclusions lt val gt does not set its value to lt val gt Instead it will toggle the membership of lt val gt in the exclusions list Various runtime behaviors for the SQLite database backing episodic memory can be set via the parameters The most commonly used is path which specifies the file system path the database is stored in When path is empty the database is stored in main memory and will be lost when the agent exits When path is set to a valid file system path the SQLite database is written to that path When this is the case the commit and optimization parameters control how often cached database changes are written to dis
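The encoding walk and the toggle behavior of the exclusions parameter described above can be sketched as follows. This is an illustrative Python model under stated assumptions, not the episodic memory implementation: working memory is represented as plain (identifier, attribute, value) tuples, and the names `toggle_exclusion` and `encode_episode` are hypothetical.

```python
def toggle_exclusion(exclusions, value):
    """Toggle membership of value in the exclusions set, mirroring the
    described behavior of issuing the exclusions setting repeatedly."""
    if value in exclusions:
        exclusions.discard(value)
    else:
        exclusions.add(value)


def encode_episode(wmes, top_state, exclusions):
    """Return the set of WMEs reachable from top_state without crossing
    a WME whose attribute is in exclusions.

    wmes is an iterable of (identifier, attribute, value) tuples.
    """
    children = {}
    for wme in wmes:
        children.setdefault(wme[0], []).append(wme)

    episode, visited, stack = set(), set(), [top_state]
    while stack:
        ident = stack.pop()
        if ident in visited:
            continue
        visited.add(ident)
        for wme in children.get(ident, []):
            _, attribute, value = wme
            if attribute in exclusions:
                # Ignore this WME and do not walk its children via this path.
                continue
            episode.add(wme)
            # Constants simply have no children, so pushing them is harmless.
            stack.append(value)
    return episode
```

Because the walk only skips the excluded path, a structure that is also reachable through a non-excluded attribute is still encoded, which matches the note about alternative paths.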
219. on their left hand sides matches assertions wmes This example prints the wme timetags for a single production matches t my first production 126 CHAPTER 8 THE SOAR USER INTERFACE memories Print memory usage for partial matches Synopsis memories options number memories production_name Options c chunks Print memory usage of chunks d default Print memory usage of default productions j justifications Print memory usage of justifications u user Print memory usage of user defined productions production_name Print memory usage for a specific production number Number of productions to print sorted by those that use the most memory T template Print memory usage of Soar RL templates Description The memories command prints out the internal memory usage for full and partial matches of production instantiations with the productions using the most memory printed first With no arguments the memories command prints memory usage for all productions If a production_name is specified memory usage will be printed only for that production If a positive integer number is given only number productions will be printed the number productions that use the most memory Output may be restricted to print memory usage for particular types of productions using the command options Memory usage is recorded according to the tokens that are allocated in the rete network for the give
220. on will also fire once and never retract.
sp {elaborate*table*clear (state <s> ^problem-space blocks ^thing <table>) (<table> ^type table) --> (<table> ^clear yes)}
Calculate whether a block is clear.
The conditions establish that:
(1) The state has a problem space named blocks
(2) The state has a thing of type block
(3) There is no ontop relation having the block as its bottom block
The action:
(1) create an acceptable preference for an attribute-value pair asserting the block is clear
This production will retract whenever an ontop relation for the given block is created. Since the (<block> ^clear yes) wme only has i-support, it will be removed from working memory automatically when the production retracts.
sp {elaborate*block*clear (state <s> ^problem-space blocks ^thing <block>) (<block> ^type block) -(<ontop> ^bottom-block <block>) --> (<block> ^clear yes)}
Suggest MOVE-BLOCK operators. This production proposes operators that move one block ontop of another block.
The conditions establish that:
The state has a problem space named blocks
The block moved and the block moved TO must both be clear
The block moved
221. once inconsistency arises the problem being solved in the subgoal may no longer be the problem that actually needs to be solved Luckily not all changes to a superstate lead to inconsistency In order to detect inconsistencies Soar maintains a dependency set for every subgoal substate The dependency set consists of all working memory elements that were tested in the condi tions of productions that created O supported working memory elements that are directly or indirectly linked to the substate Thus whenever such an O supported working memory element is created Soar records which working memory elements that exist in a superstate were tested directly or indirectly in creating that working memory element dependency set Whenever any of the working memory elements in the dependency set of a substate change the substate is regenerated Note that the creation of I supported structures in a subgoal does not increase the depen dency set nor do O supported results Thus only subgoals that involve the creation of internal O support working memory elements risk regeneration and then only when the basis for the creation of those elements changes Substate Removal Whenever a substate is removed all working memory elements and preferences that were created in the substate that are not results are removed from working memory In Figure 2 10 state S3 will be removed from working memory when the impasse that created it is resolved that is when su
222. ong term identifier satisfies ALL of these requirements an error is returned lt s gt smem result failure lt cue gt Otherwise two WMEs are added lt s gt smem result success lt cue gt lt s gt smem result retrieved lt retrieved lti gt During a cue based retrieval it is possible that the retrieved long term identifier is not in working memory If this is the case semantic memory will add the long term identifier to working memory with letter number pair as was originally stored As with non cue based retrievals all of the augmentations of the long term identifier in se mantic memory are added as new WMEs to working memory It is possible that multiple long term identifiers match the cue equally well In this case se mantic memory will retrieve the long term identifier that was most recently stored retrieved The cue based retrieval process can be further tempered using optional modifiers e The prohibit command requires that the retrieved long term identifier is not equal to a supplied long term identifier lt s gt smem command prohibit lt bad 1ti gt Multiple prohibit command WMEs may be issued as modifiers to a single cue based retrieval This method can be used to iterate over all matching long term identifiers 6 5 PERFORMANCE 97 6 5 Performance Initial empirical results with toy agents show that semantic memory queries carry up to a 40 overhead as compared to comparable rete matching However t
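The use of prohibit to iterate over all long-term identifiers that match a cue can be sketched in Python. This is a simplified model, not the semantic memory implementation: the store is assumed to be a list of (identifier, augmentations) pairs ordered from oldest to newest, recency is approximated by storage order only, multi-valued attributes are ignored, and the names `retrieve` and `iterate_matches` are hypothetical.

```python
def retrieve(store, cue, prohibited=()):
    """Return the most recently stored identifier whose augmentations
    contain every attribute/value pair in cue and which is not
    prohibited, or None when the retrieval fails."""
    for lti, augmentations in reversed(store):
        if lti in prohibited:
            continue
        if all(augmentations.get(attr) == value for attr, value in cue.items()):
            return lti
    return None


def iterate_matches(store, cue):
    """Collect every match by prohibiting each retrieved identifier in turn."""
    prohibited = []
    while True:
        lti = retrieve(store, cue, prohibited)
        if lti is None:
            return prohibited
        prohibited.append(lti)
```

Each pass adds one more prohibited identifier, so the loop ends with a failed retrieval once every match has been seen.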
223. ons. If no id is supplied, the currently selected operator (if applicable) is displayed.
id: case-insensitive operator id of the operator to be selected in the next decision phase.
Description: The select command will force the selection of an operator, whose id is supplied as an argument, during the next decision phase. If the argument is not a proposed operator in the next decision phase, an error is raised and operator selection proceeds as if the select command had not been called. Otherwise, the supplied operator will be selected as the next operator, regardless of preferences. If select is called with no id argument, the command returns the operator id currently forced for selection by a previous call to select, if one exists.
Examples: Assuming operator O2 is a valid operator, this would select it as the next operator to be selected:
select O2
After this command, issuing just select will get O2 as a return:
select
See Also: predict
set-stop-phase: Controls the phase where agents stop when running by decision.
Synopsis: set-stop-phase -[ABadiop]
Options: Options A and B are optional and mutually exclusive. If not specified, the default is B. Only one of a, d, i, o, p must be selected. With no options, reports the current stop phase.
-A, --after: Stop after specified phase
-B, --before: Stop before specified phase
224. or the most accurate and up to date information For a concise overview of the Soar interface functions see the Function Summary and Index on page 235 This index is intended to be a quick reference into the commands described in this chapter 109 110 CHAPTER 8 THE SOAR USER INTERFACE Notation The notation used to denote the syntax for each user interface command follows some general conventions e The command name itself is given in a bold font e Optional command arguments are enclosed within square brackets and e A vertical bar separates alternatives e Curly braces are used to group arguments when at least one argument from the set is required e The commandline prompt that is printed by Soar is normally the agent name followed by gt In the examples in this manual we use soar gt e Comments in the examples are preceded by a and in line comments are preceded by For many commands there is some flexibility in the order in which the arguments may be given See the online help for each command for more information We have not incorporated this flexible ordering into the syntax specified for each command because doing so complicates the specification of the command When the order of arguments will affect the output produced by a command the reader will be alerted 8 1 Basic Commands for Running Soar This section describes the commands used to start run and stop
225. or via an agent bug such as dropping inadvertently to state no change impasses Max memory usage Get the number of bytes that when exceeded by an agent will trigger the memory usage exceeded event Synopsis max memory usage n 8 4 CONFIGURING SOAR S RUNTIME PARAMETERS 165 Options n Size of limit in bytes Description The max memory usage command is used to trigger the memory usage exceeded event The initial value of this is 100MB 100 000 000 allowable settings are any integer greater than 0 The code supporting this event is not enabled by default because the test can be computationally expensive and is needed only for specific embedded applications Users may enable the test and event generation by uncommenting code in mem cpp Using the command with no arguments displays the current limit max nil output cycles Limit the maximum number of decision cycles that are executed without producing output when run is invoked with run til output args Synopsis max nil output cycles n Options n Maximum number of consecutive output cycles allowed without producing output Must be a positive integer Description This command sets and prints the maximum number of nil output cycles output cycles that put nothing on the output link allowed when running using run til output run output If n is not given this command prints the current number of nil output cycles allowed If
226. ores when the working memory elements were created, thus allowing the conditions of one chunk to test the results of a chunk learned earlier in the subgoal. The user can observe the backtracing process by setting backtracing on using the watch command (watch backtracing on; see Section 8.3 on page 142). This prints out a trace of the conditions as they are collected. Certain productions do not participate in backtracing. If a production creates only a reject preference or a desirability preference (better, worse, indifferent, or parallel), then neither the preference nor the objects that led to its creation will be included in the chunk. The exception to this is that if the desirability or reject preference is a result of a subgoal, it will be in the chunk's actions. Desirability and reject preferences should be used only as search control for choosing between legal alternatives and should not be used to guarantee the correctness of the problem solving. The argument is that such preferences should affect only the efficiency, and not the correctness, of problem solving, and therefore are not necessary to produce the results. Necessity preferences (require or prohibit) should be used to enforce the correctness of problem solving; the productions that create these preferences will be included in backtracing. Given that results can be created at any point during a subgoal, it is possible for one result to be relevant to another result. Whether or
227. ot give returns the current
-a, --auto-reduce [on|off]: Get/Set auto-reduction setting (if setting not provided, returns the current)
Description: The indifferent-selection command allows the user to set options relating to selection between operator proposals that are mutually indifferent in preference memory. The primary option is the exploration policy (each is covered below). When Soar starts, softmax is the default policy. Note: As of version 9.3.2, the architecture no longer automatically changes the policy to epsilon-greedy the first time Soar-RL is enabled. Some policies have parameters to temper behavior. The indifferent-selection command provides basic facilities to automatically reduce these parameters exponentially and linearly each decision cycle by a fixed rate. In addition to setting these policies/rates, the auto-reduce option enables the automatic reduction system (disabled by default), for which the Soar decision cycle incurs a small performance cost.
Exploration Policies:
-b, --boltzmann: Tempered softmax (uses temperature)
-g, --epsilon-greedy: Tempered greedy (uses epsilon)
-x, --softmax: Random, biased by numeric indifferent values (if a non-positive value is encountered, resorts to a uniform random selection)
-f, --first: Deterministic, first indifferent preference is selected
-l, --last: Deterministic, last indifferent preference is selected
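The tempered policies and the automatic parameter reduction described above can be illustrated with textbook definitions. The Python sketch below is an assumption-laden illustration rather than Soar's code: the function names are hypothetical, the Boltzmann form is the standard exponential weighting by temperature, and the exponential and linear reductions are assumed to multiply by, and subtract, the rate respectively.

```python
import math
import random


def boltzmann_probs(values, temperature):
    """Selection probabilities proportional to exp(value / temperature)."""
    highest = max(values)  # subtracting the max keeps exp() numerically stable
    weights = [math.exp((v - highest) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def epsilon_greedy(values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random index, else the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return max(range(len(values)), key=lambda i: values[i])


def reduce_parameter(value, rate, mode="exponential"):
    """One decision cycle of automatic reduction (assumed forms)."""
    if mode == "exponential":
        return value * rate
    if mode == "linear":
        return max(0.0, value - rate)
    raise ValueError("unknown reduction mode: " + mode)
```

Lowering the temperature or epsilon over time shifts both policies from exploration toward greedy selection, which is the effect the auto-reduce option is meant to automate.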
228. out a statistic will list the values of all statistics Unlike timers statistics will always be up dated Available statistics are Name Label Description forgotten wmes Forgotten WMEs Number of WMEs removed from working memory due to forgetting Timers Working memory activation also has a set of internal timers that record the du rations of certain operations Because fine grained timing can incur runtime costs working memory activation timers are off by default Timers of different levels of detail can be turned on by issuing wma set timers lt level gt where the levels can be off or one one being most detailed and resulting in all timers being turned on Note that none of the working memory activation statistics nor timing information is reported by the stats command All timer values are reported in seconds Level one wma_forgetting Time to process forgetting operations each cycle wma_history Time to consolidate reference histories each cycle See Also print 8 5 File System I O Commands This section describes commands which interact in one way or another with operating system input and output or file I O Users can save retrieve information to from files redirect the information printed by Soar as it runs and save and load the binary representation of productions The specific commands described in this section are Summary cd Change directory clog Record all user in
229. p to the environment to check for these flags and honor them Some use cases include run self runs one agent but not the environment run self update runs one agent and the environment run runs all agents and the environment run noupdate runs all agents but not the environment Setting an interleave size When there are multiple agents running within the same process it may be useful to keep agents more closely aligned in their execution cycle than the run increment elaboration phases decisions output specifies For instance it may be necessary to keep agents in lock step at the phase level even though the run command issued is for 5 decisions Some use cases include run d 5 i p run the agent one phase and then move to the next agent looping over agents until they have run for 5 decision cycles run o 3 i d run the agent one decision cycle and then move to the next agent When an agent generates output for the 3rd time it no longer runs even if other agents continue The interleave parameter must always be equal to or smaller than the specified run pa rameter Note If Soar has been stopped due to a halt action an init soar command must be issued before Soar can be restarted with the run command sp Define a Soar production Synopsis sp production_body 118 CHAPTER 8 THE SOAR USER INTERFACE Options production body A Soar product
230. pectedly often something has changed in the goal dependency set causing a subgoal to be regenerated prior to producing a result Warnings gds print is horribly inefficient and should not generally be used except when something is going wrong and you need to examine the Goal Dependency Set internal symbols Print information about the Soar symbol table Synopsis internal symbols Description The internal symbols command prints information about the Soar symbol table Such infor mation is typically only useful for users attempting to debug Soar by locating memory leaks or examining I O structure Example soar gt internal symbols Symbolic Constants operator accept evaluate object problem space sqrt interrupt mod 124 goal io CHAPTER 8 THE SOAR USER INTERFACE additional symbols deleted for brevity Integer Constants Floating Point Constants Identifiers Variables lt o gt lt sso gt lt to gt lt ss gt lt ts gt lt so gt lt sss gt matches Prints information about partial matches and the match set Synopsis matches options production_name matches options alr Options production_name Print partial match information for the named production n names c count For the match set print only the names of the productions that are about to fire or retract the default If printing partial matches for a product
231. pointer is set to NIL; the GDS itself is retained because we don't want to have to reallocate memory for the GDS if we need to add to it later.
APPENDIX E. A GOAL DEPENDENCY SET PRIMER
PROC create_new_assertion
  Whenever a new o-supported element is asserted, the GDS is updated to include any new context dependencies.
  A_inst <- instantiation that asserted acceptable preference for A
  IF A is an o-supported WME
    G is the goal state in which A is asserted
    G_gds <- append(G_gds, elaborate_GDS(A))
END
PROC elaborate_GDS(assertion A)
  S <- NIL
  FOR each assertion c in A_inst, the instantiation supporting A
    (1) IF GoalLevel(c) closer to top state than GoalLevel(A)
      append(c, S)  (append context dependency to GDS)
    (2) ELSEIF GoalLevel(c) same as GoalLevel(A) AND c is NOT an o-supported WME AND c has not previously been inspected
      S <- append(S, elaborate_GDS(c))  (compute GDS dependencies for c and add to goal's GDS)
      c_inspected <- true  (c's context dependencies have been added to the GDS; no need to consider it again for this GDS)
  return S  (the list of new dependencies in the GDS)
END
PROC GoalLevel(assertion A)
  Return the goal level associated with assertion A
Figure E.3: The algorithm for determining members of the GDS.
[Figure: State, gds_struct (goal, wmes_in_gds), and wme_struct (gds, gds_prev, gds_next) structures] Fi
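The algorithm of Figure E.3 can be rendered as a short runnable Python sketch. It is an illustration under stated assumptions, not the kernel code: the `Assertion` class, its fields, and the convention that a smaller `level` number is closer to the top state are all hypothetical. The sketch also marks an assertion as inspected before recursing (the figure marks it afterwards) so that it cannot loop on cyclic input.

```python
from dataclasses import dataclass, field


@dataclass
class Assertion:
    name: str
    level: int                      # 1 = top state; larger numbers are deeper substates
    o_supported: bool = False
    supports: list = field(default_factory=list)  # assertions tested by the creating instantiation
    inspected: bool = False


def elaborate_gds(a):
    """Return the new context dependencies contributed by assertion a."""
    deps = []
    for c in a.supports:
        if c.level < a.level:
            # (1) Tested in a higher context: a direct GDS member.
            deps.append(c)
        elif c.level == a.level and not c.o_supported and not c.inspected:
            # (2) Local i-supported element: trace through to what it depends on.
            c.inspected = True
            deps.extend(elaborate_gds(c))
    return deps


def create_new_assertion(goal_gds, a):
    """Extend the goal's GDS when a new o-supported element is asserted."""
    if a.o_supported:
        goal_gds.extend(elaborate_gds(a))
```

Local o-supported elements are skipped in step (2) because their own dependencies were already added when they were asserted.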
232. power reserves were still available. Of course, for this specific example, the knowledge designer can encode some knowledge to react to this inconsistency. The fundamental problem is that the knowledge designer has to consider all possible interactions between all o-supported WMEs and all contexts. Soar systems often use the architecture's impasse mechanism to realize a form of decomposition. These potential interactions mean that the knowledge developer cannot focus on individual problem spaces when creating knowledge, which makes knowledge development more difficult. Further, in all but the simplest systems, the knowledge designer will miss some potential interactions. The result is agents that were unnecessarily brittle, failing in difficult-to-understand, difficult-to-duplicate ways. The GDS also solves the problem of non-contemporaneous constraints in chunks. A non-contemporaneous constraint refers to two or more conditions that never co-occur simultaneously. An example might be a driving robot that learned a rule that attempted to match red light and green light simultaneously. Obviously, for functioning traffic lights, this rule would never fire. By ensuring that local persistent elements are always consistent with the higher-level context, non-contemporaneous constraints in chunks are guaranteed not to happen. The GDS captures context dependencies during processing, meaning the architecture will identify and respond to
233. pplication of operators are maintained throughout the existence of the state in which the operator is applied, unless explicitly removed (or if they become unlinked). Working memory elements are removed by a reject action of an operator-application rule. Whether a working memory element receives O-support or I-support is determined by the structure of the production instantiation that creates the working memory element. O-support is given only to working memory elements created by operator-application productions. An operator-application production tests the current operator of a state and modifies the state. Thus, a working memory element receives O-support if it is for an augmentation of the current state or substructure of the state, and the conditions of the instantiation that created it test augmentations of the current operator. When productions are matched, all productions that have their conditions met fire, creating or removing working memory elements. Also, working memory elements and preferences that lose I-support are removed from working memory. Thus, several new working memory elements and preferences may be created, and several existing working memory elements and preferences may be removed, at the same time. (Of course, all this doesn't happen literally at the same time, but the order of firings and retractions is unimportant, and happens in parallel from a functional perspective.)
234. pport information for all WMEs whose value is the identifier When an identifier and the object flag are specified Soar prints the preferences WME support for all WMEs comprising the specified identifier For the time being cmd _numeric_indifferent_mode numeric indifferent preferences are listed under the heading binary indifferents By default using the wmes option with a WME on the top state will only print the timetags To change this the kernel can be recompiled with DO_TOP_LEVEL_REF_CTS but this has other consequences see comments in kernel h Examples This example prints the preferences on S1 operator and the production names which created the preferences soar gt preferences S1 operator names Preferences for S1 operator acceptables 02 fill From waterjug propose fill 03 fill From waterjug propose fill unary indifferents 02 fill From waterjug propose fill 03 fill From waterjug propose fill If the current state is S1 then the above syntax is equivalent to preferences n This example shows the support for the WMEs with the jug attribute soar gt preferences s1 jug Preferences for S1 jug acceptables S1 jug I4 0 8 2 EXAMINING MEMORY 129 S1 jug J1 0 This example shows the support for the WMEs with value J1 and the productions that generated them soar gt pref J1 1 Support for 31 03 jug J1 03 7 jue J1 From water jug propose fill
235. pressing a button on a Graphical User Interface (GUI).
See Also: run
Warnings: If the graphical interface doesn't periodically do an update or flush the pending I/O, then it may not be possible to interrupt a Soar agent from the command line.
8.2 Examining Memory
This section describes the commands used to inspect production memory, working memory, and preference memory, to see what productions will match and fire in the next Propose or Apply phase, and to examine the goal dependency set. These commands are particularly useful when running or debugging Soar, as they let users see what Soar is "thinking". The specific commands described in this section are:
Summary
default-wme-depth: Set the level of detail used to print WMEs.
gds-print: Print the WMEs in the goal dependency set for each goal.
internal-symbols: Print information about the Soar symbol table.
matches: Print information about the match set and partial matches.
memories: Print memory usage for production matches.
preferences: Examine items in preference memory.
print: Print items in working memory or production memory.
production-find: Find productions that contain a given pattern.
Of these commands, print is the most often used (and the most complex), followed by matches and memories. preferences is used to examine which candidate operators have been proposed. production-find is especially useful when the number of produ
236. put system A WME is a list consisting of three symbols an identifier an attribute and a value where the entire WME is enclosed in parentheses and the attribute is preceded by an up arrow A template for a working memory element is identifier attribute value The identifier is an internal symbol generated by the Soar architecture as it runs The attribute and value can be either identifiers or constants if they are identifiers there are other working memory elements that have that identifier in their first position As the previous sentences demonstrate identifier is used to refer both to the first position of a working memory element as well as to the symbols that occupy that position 33 34 CHAPTER 3 THE SYNTAX OF SOAR PROGRAMS 3 1 1 Symbols Soar distinguishes between two types of working memory symbols identifiers and constants Identifiers An identifier is a unique symbol created at runtime when a new object is added to working memory The names of identifiers are created by Soar and consist of a single uppercase letter followed by a string of digits such as G37 or 022 The Soar user interface will also allow users to specify identifiers using lowercase letters for example when using the print command But internally they are actually uppercase letters Constants There are three types of constants integers floating point and symbolic constants e Integer constants numbers The range of values d
237. r Elaborations. Operator elaborations (i.e., placing some information on an operator WME) should be i-supported when using Soar 8, since this information is by definition temporary (not persistent, because it's located on the non-persistent operator). However, the kernel itself hasn't kept up with this change. Prior to Soar 8.5, Soar's o-support modes computed operator elaborations as o-supported, resulting in the context conditions being added to the GDS. This often leads to unwanted or unnecessary retractions. If you are using a version prior to Soar 8, you should declare any operator elaborations i-supported (i.e., using :i-support).
Kernel-level view of the Goal Dependency Set
The actual implementation of the GDS in the Soar kernel is slightly more complex than the conceptual description of the previous section, but not significantly so. Elements are added to the GDS via elaborate_gds, a procedure in decide.c that mimics the chunking backtrace function. The algorithm is shown in Figure E.3. When an o-supported preference is asserted, elaborate_gds is called. Conditions in a production instantiation that are located in a higher context can be added directly to the GDS (1). For local conditions, elaborate_gds first checks whether the tested WME is o-supported or whether it has previously been backtraced through (2). If either of these is true, the WME can be ignored because its dependencies have been added to the GD
238. r more actions on the right hand side of productions Any pattern that can appear in productions can be used in this command In addition the asterisk symbol can be used as a wildcard for an attribute or value It is important to note that the whole pattern including the parenthesis must be enclosed in curly braces for it to be parsed properly The variable names used in a call to production find do not have to match the variable names used in the productions being retrieved The production find command can also be restricted to apply to only certain types of pro ductions or to look only at the conditions or only at the actions of productions by using the flags Examples Find productions that test that some object gumby has an attribute alive with value t In addition limit the rules to only those that test an operator named foo production find lt state gt gumby lt gv gt operator name foo lt gv gt alive t Note that in the above command lt state gt does not have to match the exact variable name used in the production Find productions that propose the operator foo production find rhs lt x gt operator lt op gt lt op gt name foo Find chunks that test the attribute pokey production find chunks lt x gt pokey Examples using the water jugs demo source demos water jug water jug soar 134 CHAPTER 8 THE SOAR USER INTERFACE production find lt s gt name lt j gt volum
239. rator lt 01 gt lt 02 gt 3 3 PRODUCTION MEMORY 59 lt 02 gt lt lt ol gt lt 03 gt lt 04 gt lt s gt operator lt ol gt lt 02 gt lt 02 gt lt lt 01 gt lt 04 gt lt 03 gt lt s gt operator lt 01 gt operator lt 02 gt operator lt 02 gt lt lt ol gt operator lt 04 gt lt 03 gt Any one of those actions could be expanded to the following list of preferences lt s gt operator lt 01 gt lt s gt operator lt 02 gt lt s gt operator lt 02 gt lt lt 01 gt lt s gt operator lt 03 gt lt s gt operator lt 04 gt Note that structured value notation may not be used in the actions of productions 3 3 6 6 Righthand side Functions The fourth type of action that can occur in productions is called a righthand side function Righthand side functions allow productions to create side effects other than changing working memory The RHS functions are described below organized by the type of side effect they have 3 3 6 7 Stopping and pausing Soar halt Terminates Soar s execution and returns to the user prompt A halt action irre versibly terminates the running of a Soar program It should not be used if Soar is to be restarted see the interrupt RHS action below sp ans halt interrupt Executing this function causes Soar to stop at the end of the current phase and return to the user prompt This is similar to halt b
240. rators are proposed and the com parisons are insufficient to determine which one should be selected 3 An operator has been selected but there is insufficient knowledge to apply it In response to an impasse the Soar architecture creates a substate in which operators can be selected and applied to generate or deliberately retrieve the knowledge that was not directly available the goal in the substate is to resolve the impasse For example in a substate a Soar program may do a lookahead search to compare candidate operators if comparison knowledge is not directly available Impasses and substates are described in more detail in Section 2 6 2 1 2 An Example Task The Blocks World We will use a task called the blocks world as an example throughout this manual In the blocks world task the initial state has three blocks named A B and C on a table the operators move one block at a time to another location on top of another block or onto the table and the goal is to build a tower with A on top B in the middle and C on the bottom The initial state and the goal are illustrated in Figure 2 2 The Soar code for this task is included in Appendix A You do not need to look at the code at this point The operators in this task move a single block from its current location to a new location each operator is represented with the following information e the name of the block being moved e the current location of the block the thing
241. rators proposed for solving the subgoal may be similar to the operators in the superstate, or they may be entirely different. While problem solving in the subgoal, additional impasses may be encountered, leading to new subgoals. Thus, it is possible for Soar to have a stack of subgoals, represented as states: each state has a single superstate (except the initial state), and each state may have at most one substate. Newly created subgoals are considered to be added to the bottom of the stack; the first state is therefore called the top-level state. See Figure 2.10 for a simplified illustration of a subgoal stack. Soar continually attempts to retrieve knowledge relevant to all goals in the subgoal stack, although problem-solving activity will tend to focus on the most recently created state. However, problem solving is active at all levels, and productions that match at any level will fire. 2.6.3 Results. In order to resolve impasses, subgoals must generate results that allow the problem solving at higher levels to proceed. The results of a subgoal are the working memory elements and preferences that were created in the substate and that are also linked directly or indirectly to a superstate (any superstate in the stack). A preference or working memory element is said to be created in a state if the production that created it tested that state and this is the most recent state that the production tested. Thus, if a production tests multiple states, t
242. re existing structure is relevant to the result that terminates the subgoal. The result is dependent only on the existence of the substate within a substate. Solution: The current solution to this problem is to allow the problem solving to signal the architecture that the test for a substate is being made. The signal used by Soar is a test for the ^quiescence t augmentation of the subgoal. The chunking mechanism recognizes this test and does not build a chunk when it is found in a backtrace of a subgoal. The history of this test is maintained, so that if the result of the substate is then used to produce further results for a superstate, no higher chunks will be built. However, if the result is used as search control (it is a desirability preference), then it does not prevent the creation of chunks, because the original result is not included in the backtrace. If the ^quiescence t being tested is connected to a superstate, it will not inhibit chunking and it will be included in the conditions of the chunk. 4.6.4 Mapping multiple superstate WMEs to one local WME. An agent may have several rule instantiations that match on different structures in a superstate but create WMEs with the same attribute-value pairs in a substate. For example, there may be a rule that matches several WMEs in a superstate with the same multi-valued attribute and elaborates the local state with a WME indicating that at least one WME with that attribute exists. In these cases
243. re users may want to pay careful attention to the specific productions that are firing and retracting. The run command takes optional arguments: an integer count, which specifies how many units to run, and a units flag indicating what steps or increments to use. If count is specified but no units are specified, then Soar is run by decision cycles. If units are specified but count is unspecified, then count defaults to 1. If both are unspecified, Soar will run until either a halt is executed, an interrupt is received, or max stack depth is reached. If there are multiple Soar agents that exist in the same Soar process, then issuing a run command in any agent will cause all agents to run with the same set of parameters, unless the self flag is specified, in which case only that agent will execute. If an environment is registered for the kernel's update event, then when the event is triggered, the environment will get information about how the run was executed. If a run was executed with the update option, then the event sends a flag requesting that the environment actually update itself. If a run was executed with the noupdate option, then the event sends a flag requesting that the environment not update itself. The update option is the default when run is specified without the self option. If the self option is specified, then the noupdate option is on by default. It is u
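The defaulting rules for run described in the entry above can be modeled in a few lines. This is a minimal Python sketch for illustration only, not Soar code: the names RunPlan, resolve_run_args, and self_only are hypothetical, and the sketch covers only the count, units, self, and update defaults stated in the text.

```python
from typing import NamedTuple, Optional


class RunPlan(NamedTuple):
    count: Optional[int]   # None means "run until halted or interrupted"
    units: Optional[str]   # None means no unit limit applies
    update: bool           # whether the environment is asked to update itself


def resolve_run_args(count=None, units=None, self_only=False, update=None):
    """Apply the defaulting rules from the text (hypothetical helper)."""
    if count is not None and units is None:
        units = "decision"      # count without units: run by decision cycles
    elif count is None and units is not None:
        count = 1               # units without count: count defaults to 1
    if update is None:
        update = not self_only  # update is the default unless self is given
    return RunPlan(count, units, update)
```

For example, resolve_run_args(count=3) yields a plan of three decision cycles with the update flag set, while resolve_run_args(self_only=True) leaves count and units unset and turns the update flag off.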
244. red output is selected. See also: command-to-file. command-to-file: Dump the printed output and results of a command to a file. Synopsis: command-to-file [-a] filename command [args]. Options: -a, --append: Append if file exists. filename: The file to log the results of the command to. command: The command to log. args: Arguments for command. Description: This command logs a single command. It is almost equivalent to opening a log using clog, running the command, then closing the log; the only difference is that input isn't recorded. Running this command while a log is open is an error. There is currently no support for multiple logs in the command line interface, and this would be an instance of multiple logs. This command echoes output both to the screen and to a file, just like clog. See also: clog. dirs: List the directory stack. Synopsis: dirs. Description: This command lists the directory stack. Agents can move through a directory structure by pushing and popping directory names. The dirs command returns the stack. The command pushd places a new agent current directory on top of the directory stack and changes to it. The command popd removes the directory at the top of the directory stack and changes to the previous directory, which now appears at the top of the stack. See also: cd, home, ls, pushd, popd, source, pwd. echo: Print a string to the curren
245. remember off coding ignoring in off the next storage phase
Parameter | Description | Possible values | Default
graph-match | Graph matching enabled | on, off | on
graph-match-ordering | Ordering of identifiers during graph match | undefined, dfs, mcv | undefined
lazy-commit | Delay writing semantic store changes to file until agent exits | on, off | on
learning | Episodic memory enabled | on, off | off
merge | Controls how retrievals interact with long-term identifiers in working memory | none, add | none
optimization | Policy for committing data to disk | safety, performance | performance
page-size | Size of each memory page used in the SQLite cache | 1k, 2k, 4k, 8k, 16k, 32k, 64k | 8k
path | Location of database file | empty, some path | empty
phase | Decision cycle phase to encode new episodes and process epmem link commands | output, selection | output
timers | Timer granularity | off, one, two, three | off
trigger | How episode en | dc, output, none | output
The learning parameter turns the episodic memory module on or off. When learning is set to off, no new episodes are encoded and no commands put on the epmem link are processed. The phase parameter determines in which decision cycle phase episode encoding and retrieval will be performed. The trigger parameter controls when new episodes will be encoded. When it is set to output, new episodes will be encoded only if the agent made modifications to the output link during that decisi
246. rence arbitration
init-soar: Empties working memory and resets run-time statistics.
internal-symbols: Print information about the Soar symbol table.
learn: Set the parameters for chunking.
load-library: Load a shared library into the local client for the purpose of, e.g., providing custom event handling.
ls: List the contents of the current working directory.
matches: Prints information about partial matches and the match set.
max-chunks: Limit the number of chunks created during a decision cycle.
max-dc-time: Set a wall-clock time limit such that the agent will be interrupted when a single decision cycle exceeds this limit.
max-elaborations: Limit the maximum number of elaboration cycles in a given phase. Print a warning message if the limit is reached during a run.
max-goal-depth: Limit the sub-state stack depth.
max-memory-usage: Get the number of bytes that, when exceeded by an agent, will trigger the memory usage exceeded event.
max-nil-output-cycles: Limit the maximum number of decision cycles that are executed without producing output when run is invoked with run til output args.
memories: Print memory usage for partial matches.
multi-attributes: Declare a symbol to be multi-attributed.
numeric-indifferent-mode: Select method for combining n
o-support-mode, popd, port, predict, preferences, print, production-find, pushd, pwatch, pwd, rand, remove-wme, replay-input, rete-net
247. rent watch status (i.e., the values of each parameter). For the named arguments, including the named argument turns on only that setting. To turn off a specific setting, follow the named argument with remove or 0. The named argument productions is shorthand for the four arguments default, user, justifications, and chunks. Examples: The most common uses of watch are by using the numeric arguments, which indicate watch levels. To turn off all printing of Soar internals, do any one of the following (not all possibilities listed): watch --level 0; watch -l 0; watch -N. Although the --level flag is optional, its use is recommended: watch --level 5 (OK); watch 5 (OK, avoid). Be careful of where the level is on the command line; for example, if you want level 2 and preferences: watch -r -l 2 (Incorrect: -r flag ignored, level 2 parsed after it and overrides the setting); watch -r 2 (Syntax error: 0 or remove expected as optional argument to -r); watch 2 -r (OK, avoid); watch -l 2 -r (OK). To turn on printing of decisions, phases and productions, do any one of the following (not all possibilities listed): watch --level 3; watch -l 3; watch --decisions --phases --productions; watch -d -p -P. Individual options can be changed as well. To turn on printing of decisions and wmes, but not phases and productions
248. result that is dependent on the operator occurring after all others, this fact will not be captured in the conditions of the chunk. In both of these cases, part of the test for producing a result is implicit in search control productions. This move allows the explicit state test to be simpler, because any state to which the test is applied is guaranteed to satisfy some of the requirements for success. However, chunks created in such a problem space will be overgeneral, because the implicit parts of the state test do not appear as conditions. Solution: To avoid this problem, necessity preferences (require and prohibit) should be used whenever a control decision is being made that also incorporates goal-attainment knowledge. The necessity preferences are included in the backtrace by chunking, thereby avoiding overgenerality. 4.6.2 Testing for local negated conditions. Overgeneral chunks can be created when negated conditions test for the absence of a working memory element that, if it existed, would be local to the substate. Chunking has no mechanism for determining why a given working memory element does not exist, and thus a condition that occurred in a production in the subgoal is not included in the chunk. For example, if a production tests for the absence of a local flag, and that flag is copied down to the substate from a superstate, then the chunk should include a test that the flag in the supers
249. reward signals present at the time of retraction, and the EFR is unchanged. Soar's automatic subgoaling and RL mechanisms can be combined to naturally implement hierarchical reinforcement learning algorithms such as MAXQ and options. 5.3.3 Eligibility Traces. The RL mechanism supports eligibility traces, which can improve the speed of learning by updating RL rules across multiple sequential steps. The eligibility-trace-decay-rate and eligibility-trace-tolerance parameters control this mechanism. By setting eligibility-trace-decay-rate to 0 (default), eligibility traces are in effect disabled. When eligibility traces are enabled, the particular algorithm used is dependent upon the learning policy. For Sarsa, the eligibility trace implementation is Sarsa(λ). For Q-Learning, the eligibility trace implementation is Watkins's Q(λ). 5.3.3.1 Exploration. The indifferent-selection command (page 157) determines how operators are selected based on their numeric-indifferent preferences. Although all the indifferent-selection settings are valid regardless of how the numeric-indifferent preferences were arrived at, the epsilon-greedy and boltzmann settings are specifically designed for use with RL and correspond to the two most common exploration strategies. In an effort to maintain backwards compatibility, the default exploration policy is softmax. As a result, one should change to epsilon-greedy or boltzmann when the reinforcement learning mecha
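The two exploration strategies named in the entry above can be illustrated with their textbook definitions. The following Python sketch is an assumption-laden illustration, not Soar's implementation: the function names, the dictionary of operator values, and the tie-breaking behavior are all hypothetical, and Soar's own handling of numeric-indifferent preferences may differ in detail.

```python
import math
import random


def epsilon_greedy(values, epsilon, rng=random):
    """With probability epsilon pick a random candidate, else the best one."""
    names = list(values)
    if rng.random() < epsilon:
        return rng.choice(names)                      # explore
    return max(names, key=lambda name: values[name])  # exploit


def boltzmann_probabilities(values, temperature):
    """Selection probabilities proportional to exp(value / temperature)."""
    peak = max(values.values())  # subtract the maximum for numerical stability
    weights = {n: math.exp((v - peak) / temperature) for n, v in values.items()}
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}
```

With epsilon set to 0 the first function always returns the highest-valued operator; with a high temperature the second function approaches a uniform distribution, and with a low temperature it concentrates on the best operator.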
250. ring, but does not limit the total number of productions that can fire during elaboration. This limit is included in Soar to prevent getting stuck in infinite loops (such as a production that repeatedly fires in one elaboration cycle and retracts in the next); if you see the warning message, it may be a signal that you have a bug in your code. However, some Soar programs are designed to require a large number of elaboration cycles, so rather than a bug, you may need to increase the value of max-elaborations. max-elaborations is checked during both the Propose Phase and the Apply Phase. If Soar runs more than the max-elaborations limit in either of these phases, Soar proceeds to the next phase (either Decision or Output) even if quiescence has not been reached. Examples: The command issued with no arguments returns the max elaborations allowed: max-elaborations. To set the maximum number of elaborations in one phase to 50: max-elaborations 50. max-goal-depth: Limit the sub-state stack depth. Synopsis: max-goal-depth [n]. Options: n: Maximum depth of sub-states allowed. Description: The max-goal-depth command is used to limit the maximum depth of sub-states. The initial value of this variable is 100; allowable settings are any integer greater than 0. This limit is included in Soar to prevent getting stuck in an infinite recursive loop, which may come about due to deliberate actions
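The effect of the max-elaborations limit described in the entry above can be sketched as a bounded loop. This Python model is illustrative only: run_phase and fire_once are hypothetical names, and the sketch assumes that one call to fire_once stands for one elaboration cycle and reports whether anything fired or retracted.

```python
def run_phase(fire_once, max_elaborations):
    """Run elaboration cycles until quiescence or until the limit is reached.

    fire_once() models one elaboration cycle and returns True while
    productions are still firing or retracting.
    Returns (cycles_run, reached_quiescence).
    """
    cycles = 0
    while cycles < max_elaborations:
        if not fire_once():
            return cycles, True   # quiescence: nothing left to fire
        cycles += 1
    return cycles, False          # limit hit: proceed to the next phase anyway
```

A production that fires in one cycle and retracts in the next never reaches quiescence, so run_phase(lambda: True, 50) stops after 50 cycles and reports that quiescence was not reached, which corresponds to the situation in which the warning message is printed.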
251. rking memory elements, respectively. Some Soar systems may benefit from having multiple input and output links, or from using names which are more descriptive of the input or output function, such as vision-input-link, text-input-link, or motor-output-link. In addition to providing the default io substructure, the architecture allows users to create multiple input and output links via productions and I/O functions. Any identifiers for io substructure created by the user will be assigned at run time and are not guaranteed to be the same from run to run. Therefore, users should always employ variables when referring to input and output links in productions. Suppose a blocks-world task is implemented using a robot to move actual blocks around, with a camera creating input to Soar and a robotic arm executing command outputs. The camera image might be analyzed by a separate vision program; this program could have as its output the locations of blocks on an xy plane. The Soar input function could take the output from the vision program and create the following working memory elements on the input link (all identifiers are assigned at runtime; this is just an example of possible bindings): [Figure 3.3: An example portion of the input link for the blocks-world task.] (S1 ^io I1) (I1 ^input-link I2) (I2 ^block B
252. roblem space in the blocks world includes all operators that move blocks from one location to another and all possible configurations of the three blocks.
An abstract view of production memory. The productions are not related to one another.
A detailed illustration of Soar's decision cycle: out-of-date.
A simplified version of the Soar algorithm.
A simplified illustration of a subgoal stack.
A semantic net illustration of four objects in working memory.
An example production from the example blocks-world task.
An example portion of the input link for the blocks-world task.
An example portion of the output link for the blocks-world task.
Example Soar substate operator trace.
Example long-term identifier with four augmentations.
Example episodic memory cache setting data.
An illustration of the preference resolution process. There are eight steps; only five of these provide exits from the resolution process.
Simplified Representation of the context dependencies (above the line), local o-supported WMEs (below the line), and the generation of a result. In Soar 7, this situation led to non-contemporaneous constraints in the chunk that generates
T
253. s. Chapter 6 and Chapter 7 describe Soar's long-term declarative memory systems, semantic and episodic. Not all users will make use of these mechanisms, but it is important to know that they exist. Chapter 8 describes the Soar user interface: how the user interacts with Soar. The chapter is a catalog of user-interface commands, grouped by functionality. The most accurate and up-to-date information on the syntax of the Soar User Interface is found online, on the Soar Wiki, at http://code.google.com/p/soar. Advanced users will refer most often to Chapter 8, flipping back to Chapters 2 and 3 to answer specific questions. There are several appendices included with this manual: Appendix A contains an example Soar program for a simple version of the blocks world. This blocks-world program is used as an example throughout the manual. Appendix B provides a grammar for Soar productions. Appendix C describes the determination of o-support. Appendix D provides a detailed explanation of the preference resolution process. Appendix E provides an explanation of the Goal Dependency Set. Additional Back Matter: The appendices are followed by an index; the last pages of this manual contain a summary and index of the user-interface functions for quick reference. Not Described in This Manual: Some of the more advanced features of Soar are not described in this manual, such as how to interface with a simulator
254. s (i.e., { and }). The following table shows some examples of legal and illegal conjunctive tests:
Legal conjunctions | Illegal conjunctions
{ <= <a> >= <b> } | { <x> < <a> + <b> }
{ <x> > <y> } | { > > <b> }
{ <> <x> <y> } |
{ << A B C >> <x> } |
{ <=> <x> > <y> << 1 2 3 4 >> <z> } |
Because those examples are a bit difficult to interpret, let's go over the legal examples one by one to understand what each is doing. In the first example, the value must be less than or equal to the value bound to variable <a> and greater than or equal to the value bound to variable <b>. In the second example, the value is bound to the variable <x>, which must also be greater than the value bound to variable <y>. In the third example, the value must not be equal to the value bound to variable <x> and should be bound to variable <y>. Note the importance of order when using conjunctions with predicates: in the second example, the predicate modifies <y>, but in the third example, the predicate modifies <x>. In the fourth example, the value must be one of A, B, or C, and the second conjunctive test binds the value to variable <x>. In the fifth example, there are four conjunctive tests. First, the value must be the same type as the val
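The reading of the first legal example above (the value must be less than or equal to the value bound to a and greater than or equal to the value bound to b) can be checked with a small model. This Python sketch is an illustration under stated assumptions: it models only predicate tests against already-bound variables, the names PREDICATES and conjunctive_test are hypothetical, and it does not model variable binding, disjunctions, or the same-type predicate.

```python
import operator

# Relational predicates, keyed by the symbols used in the table above.
PREDICATES = {
    "<": operator.lt, "<=": operator.le,
    ">": operator.gt, ">=": operator.ge,
    "<>": operator.ne, "=": operator.eq,
}


def conjunctive_test(value, tests, bindings):
    """Return True only if value passes every (predicate, variable) test."""
    return all(PREDICATES[pred](value, bindings[var]) for pred, var in tests)


# First legal example: value <= <a> and value >= <b>.
first_example = [("<=", "a"), (">=", "b")]
```

With a bound to 10 and b bound to 3, a value of 5 passes the conjunction and a value of 11 fails it, since every test in a conjunction must succeed.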
255. s opposite of the value of the same parameter in episodic memory. The mirroring parameter controls a useful form of automatic encoding. If enabled (on), all changes to long-term identifiers (LTIs) in working memory are mirrored to semantic memory (assuming the LTI in working memory has at least one augmentation, i.e., no accidental clearing). The mirrors statistic is incremented for each LTI that is updated in this way. Statistics: Semantic memory tracks statistics over the lifetime of the agent. These can be accessed using smem --stats <statistic>. Running smem --stats without a statistic will list the values of all statistics. Unlike timers, statistics will always be updated. Available statistics are:
Name | Label | Description
act_updates | Activation Updates | Number of times memory activation has been calculated
db-lib-version | SQLite Version | SQLite library version
edges | Edges | Number of edges in the semantic store
mem-usage | Memory Usage | Current SQLite memory usage in bytes
mem-high | Memory Highwater | High SQLite memory usage watermark in bytes
mirrors | Mirrors | Number of LTIs that have been automatically encoded via mirroring
nodes | Nodes | Number of nodes in the semantic store
queries | Queries | Number of times the query command has been issued
retrieves | Retrieves | Number of times the retrieve co
256. s soar this is also a comment. 3.3.5 The condition side of productions (or LHS). The condition side of a production, also called the left-hand side (or LHS) of the production, is a pattern for matching one or more WMEs. When all of the conditions of a production match elements in working memory, the production is said to be instantiated, and is ready to perform its action. The following subsections describe the condition side of a production, including predicates, disjunctions, conjunctions, negations, acceptable preferences for operators, and a few advanced topics. 3.3.5.1 Conditions. The condition side of a production consists of a set of conditions. Each condition tests for the existence or absence (explained later in Section 3.3.5.6) of working memory elements. Each condition consists of an open parenthesis, followed by a test for the identifier, and the tests for augmentations of that identifier, in terms of attributes and values. The condition is terminated with a close parenthesis. Thus, a single condition might test properties of a single working memory element, or properties of multiple working memory elements that constitute an object: (identifier-test ^attribute1-test value1-test ^attribute2-test value2-test ^attribute3-test value3-test ...) The first condition in a production must match against a state in working memory. Thus, the first condition must begin with the additional
257. se, decision phase, apply phase, or output phase. -s, --self: If other agents exist within the kernel, do not run them at this time. -u, --update: Sets a flag in the update event callback requesting that an environment updates. This is the default if --self is not specified. -n, --noupdate: Sets a flag in the update event callback requesting that an environment does not update. This is the default if --self is specified. count: A single integer which specifies the number of cycles to run Soar. -i, --interleave: Support round-robin execution across agents at a finer grain than the run-size parameter (e: elaboration, p: phase, d: decision, o: output). -g, --goal: Run agent until a goal retracts. Deprecated Options (these may be reimplemented in the future): operator: Run Soar until the nth time an operator is selected. state: Run Soar until the nth time a state is selected. Description: The run command starts the Soar execution cycle or continues any execution that was temporarily stopped. The default behavior of run, with no arguments, is to cause Soar to execute until it is halted or interrupted by an action of a production, or until an external interrupt is issued by the user. The run command can also specify that Soar should run only for a specific number of Soar cycles or phases (which may also be prematurely stopped by a production action or the stop-soar command). This is helpful for debugging sessions, whe
258. small fraction of that space during its execution. For example, consider the grid-world example with 1000 rows and columns. Attempting to generate RL rules for each grid cell and action a priori will result in 1000 × 1000 × 4 = 4 × 10^6 productions. However, if most of those cells are unreachable due to walls, then the agent will never fire or update most of those productions. Templates give the programmer the convenience of the gp command without filling production memory with unnecessary rules. Rule templates have variables that are filled in to generate RL rules as the agent encounters novel combinations of variable values. A rule template is valid if and only if it is marked with the :template flag and, in all other respects, adheres to the format of an RL rule. However, whereas an RL rule may only use constants as the numeric-indifference preference value, a rule template may use a variable. Consider the following rule template: sp {sample*rule*template :template (state <s> ^operator <o> + ^value <v>) --> (<s> ^operator <o> = <v>)} During agent execution, this rule template will match working memory and create new productions by substituting all variables in the rule template that matched against constant values with the values themselves. Suppose that the LHS of the rule template matched against the state (S1 ^value 3.2) (S1 ^operator O1 +). Then
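The trade-off described in the entry above, between generating every rule a priori and creating rules only for combinations the agent actually encounters, can be sketched as follows. This Python model is illustrative only: the RuleTemplate class, its match method, and the generated rule names are hypothetical and do not reproduce Soar's internal naming or matching.

```python
ROWS, COLS, ACTIONS = 1000, 1000, 4
A_PRIORI_RULES = ROWS * COLS * ACTIONS  # 4,000,000 rules if generated up front


class RuleTemplate:
    """Creates one concrete rule per novel combination of matched values."""

    def __init__(self, name):
        self.name = name
        self.rules = {}  # matched bindings -> (rule name, initial numeric value)

    def match(self, operator_id, value):
        key = (operator_id, value)
        if key not in self.rules:
            # Hypothetical naming scheme: template name plus a running index.
            rule_name = f"{self.name}*{len(self.rules) + 1}"
            self.rules[key] = (rule_name, value)
        return self.rules[key]
```

Calling match("O1", 3.2) twice on the same template creates only one rule, so the number of stored rules grows with the combinations visited rather than with the size of the whole grid.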
259. smem structures to prevent encoding of potentially large and/or frequently changing memory retrievals. 7.2.2 Storage Location. Episodic memory uses SQLite to facilitate efficient and standardized storage and querying of episodes. The episodic store can be maintained in memory or on disk (per the database and path parameters). If the store is located on disk, users can use any standard SQLite programs/components to access/query its contents. See the later discussion on performance (7.4) for additional parameters dealing with databases on disk. 7.3 Retrieving Episodes. An agent retrieves episodes by creating an appropriate command (we detail the types of commands below) on the command link of a state's epmem structure. At the end of the phase of each decision, after episodic storage, episodic memory processes each state's epmem command structure. Results, meta-data, and errors are placed on the result structure of that state's epmem structure. Only one type of retrieval command (which may include optional modifiers) can be issued per state in a single decision cycle. Malformed commands, including attempts at multiple retrieval types, will result in an error: <s> ^epmem.result.status bad-cmd. After a command has been processed, episodic memory will ignore it until some aspect of the command structure changes (via addition/removal of WMEs). When this occurs, the result structure is cleared and the new
260. sory inputs and sending command outputs. Soar programs might also interact with simulated environments, such as a flight simulator. The mechanisms by which Soar receives inputs and sends outputs to an external process are called Soar I/O. This section describes how input and output are represented in working memory and in productions. The details of creating and registering the input and output functions for Soar are beyond the scope of this manual, but they are described in the SML Quick Start Guide. This section is provided for the sake of Soar users who will be making use of a program that has already been implemented, or for those who would simply like to understand how I/O is implemented in Soar. 3.5.1 Overview of Soar I/O. When Soar interacts with an external environment, it must make use of mechanisms that allow it to receive input from that environment and to effect changes in that environment. An external environment may be the real world or a simulation; input is usually viewed as Soar's perception and output is viewed as Soar's motor abilities. Soar I/O is accomplished via input functions and output functions. Input functions are called at the start of every execution cycle, and add elements directly to specific input structures in working memory. These changes to working memory may change the set of productions that will fire or retract. Output functions are called at the end of every execut
261. source.soar: 0 productions sourced. _readme.soar: 0 productions sourced. ** initialize-mac.soar: 2 productions sourced. ******* move-boat.soar: 7 productions sourced. mac_source.soar: 0 productions sourced. mac.soar: 0 productions sourced. Total: 18 productions sourced. Source finished. Combining the -a and -v flags adds excised production names to the output for each file. See Also: cd, dirs, home, ls, pushd, popd, pwd. 8.6 Soar I/O Commands. This section describes the commands used to manage Soar's Input/Output (I/O) system, which provides a mechanism for allowing Soar to interact with external systems, such as a computer game environment or a robot. Soar I/O functions make calls to add-wme and remove-wme to add and remove elements to the io structure of Soar's working memory. The specific commands described in this section are: Summary: add-wme: Manually add an element to working memory. capture-input: Store the input wmes in a file for reloading later. remove-wme: Manually remove an element from working memory. replay-input: Load input wmes for each decision cycle from a file. These commands are used mainly when Soar needs to interact with an external environment. Users might take advantage of these commands when debugging agents, but care should be used in adding and removing wmes this way, as they do not fall under Soar's truth maintenance system. add-wme: Manually add an
262. ssed at the beginning of each execution cycle, and output occurs at the end of each execution cycle. For instructions on how to use input and output functions with Soar, refer to the SML Quick Start Guide. Chapter 3: The Syntax of Soar Programs. This chapter describes in detail the syntax of elements in working memory, preference memory, and production memory, and how impasses and I/O are represented in working memory and in productions. Working memory elements and preferences are created as Soar runs, while productions are created by the user or through chunking. The bulk of this chapter explains the syntax for writing productions. The first section of this chapter describes the structure of working memory elements in Soar; the second section describes the structure of preferences; and the third section describes the structure of productions. The fourth section describes the structure of impasses. An overview of how input and output appear in working memory is presented in the fifth section; the full discussion of Soar I/O can be found in the SML Quick Start Guide. This chapter assumes that you understand the operating principles of Soar, as presented in Chapter 2. 3.1 Working Memory. Working memory contains working memory elements (WMEs). As described in Section 2.2, WMEs can be created by the actions of productions, the evaluation of preferences, the Soar architecture, and via the input out
263. state lt s gt ontop lt o 1 gt lt o 1 gt bottom object lt b 1 gt lt b 1 gt name A lt b 1 gt name table gt lt s gt nothing ontop A or table true This example does not refer to two different blocks with different names It tests that there is not an ontop relation with a bottom block that is named A and named table Thus this production probably should have been written as sp blocks negated conjunction example state lt s gt name top state ontop bottom object name table ontop bottom object name A gt lt s gt nothing ontop A or table true which expands to sp blocks negated conjunction example state lt s gt name top state lt s gt ontop lt o 2 gt 54 CHAPTER 3 THE SYNTAX OF SOAR PROGRAMS lt o 2 gt bottom object lt b 2 gt lt b 2 gt name a lt s gt ontop lt o 1 gt lt o 1 gt bottom object lt b 1 gt lt b 1 gt name table gt lt s gt nothing ontop a or table true Notes on attribute path notation e Attributes specified in attribute path notation may not start with a digit For example if you type foo 3 bar Soar thinks the 3 is a floating point number Attributes that don t appear in path notation can begin with a number e Attribute path notation may be used to any depth e Attribute path notation may be combined with structured values described in Section os be
264. stem flag, the stats command lists a summary of run statistics, including the following:

• Version: The Soar version number, hostname, and date of the run.
• Number of productions: The total number of productions loaded in the system, including all chunks built during problem solving and all default productions.
• Timing Information: Might be quite detailed depending on the flags set at compile time. See the note on timers below.
• Decision Cycles: The total number of decision cycles in the run and the average time per decision cycle in milliseconds.
• Elaboration cycles: The total number of elaboration cycles that were executed during the run, the average number of elaboration cycles per decision cycle, and the average time per elaboration cycle in milliseconds. This is not the total number of production firings, as productions can fire in parallel.
• Production Firings: The total number of productions that were fired.
• Working Memory Changes: The total number of changes to working memory. This includes all additions and deletions from working memory. Also prints the average match time.
• Working Memory Size: The current, mean, and maximum number of working memory elements.

The stats argument memory provides information about memory usage and Soar's memory pools, which are used to allocate space for the various data structures used in Soar. The stats argument rete provides information about
265. t ^digit1 1 ^digit-10 11)
     (<a02> ^digit1 2 ^digit-10 12)
     (<a03> ^digit1 3 ^digit-10 13)

Although not shown here, the common "dot-notation" format used in writing productions can also be used for this command. Unlike agent storage, manual storage is automatically recursive. Thus, the above example will add a new concept (represented by the temporary "arithmetic" variable) with three children. Each child will be its own concept with two constant attribute/value pairs.

Visualization

When debugging agents using semantic memory, it is often useful to inspect the contents of the semantic store. Running smem --viz <ltid> <depth> will output the concept rooted at <ltid> to depth <depth> in graphviz format, including long-term identifier activation levels. If <ltid> is omitted, the entire contents of the semantic store are output. For more information on this format and visualization tools, see http://www.graphviz.org. The smem --print option has the same syntax, but outputs text that is similar to using the print command to get the substructure of an identifier in working memory, which is possibly more useful for interactive debugging.

Reinitialization

If, for experimentation or debugging, it is necessary to reinitialize an agent including its long-term memory, semantic memory supports reinitialization. For semantic memory to be reinitialized, all references to long-term identifiers in
266. t ^name top-state -^ontop.bottom-object.type table)
     -->
     (<s> ^nothing-ontop-table true)}

Multi-valued attributes and attribute path notation

Attribute path notation may also be used with multi-valued attributes, such as:

    sp {blocks-world*propose*move-block
       (state <s> ^problem-space blocks
                  ^clear-block <block1> { <> <block1> <block2> }
                  ^ontop <ontop>)
       (<block1> ^type block)
       (<ontop> ^top-block <block1>
                ^bottom-block <> <block2>)
       -->
       (<s> ^operator <o>)
       (<o> ^name move-block
            ^moving-block <block1>
            ^destination <block2>)}

Multi-attributes and attribute path notation

Note: It would not be advisable to write the production in Figure 3.2 using attribute path notation as follows:

    sp {blocks-world*propose*move-block*dont-do-this
       (state <s> ^problem-space blocks
                  ^clear-block <block1>
                  ^clear-block { <> <block1> <block2> }
                  ^ontop.top-block <block1>
                  ^ontop.bottom-block <> <block2>)
       (<block1> ^type block)

This is not advisable because it corresponds to a different set of conditions than those in the original production (the top-block and bottom-block need not correspond to the same ontop relation). To check this, we could print the original production at the Soar prompt: soar> pri
267. t ^y-location <y1>)
     (<in> ^block { <ib2> <> <ib1> })
     (<ib2> ^x-location <x1> ^y-location { <y2> > <y1> })
     -->
     (<s> ^block <b1>)
     (<s> ^block <b2>)
     (<b1> ^x-location <x1> ^y-location <y1> ^clear no)
     (<b2> ^x-location <x1> ^y-location <y2> ^above <b1>)

This production "copies" two blocks and their locations directly to the top-level state. It also adds information about the relationship between the two blocks. The variables used for the blocks on the RHS of the production are deliberately different from the variable names used for the blocks on the input-link in the LHS of the production. If the variables were the same, the production would create a link into the structure of the input-link, rather than copy the information. The attributes x-location and y-location are assumed to be values and not identifiers, so the same variable names may be used to do the copying.

A production that creates WMEs on the output-link for the blocks task might look like this:

    sp {blocks-world*apply*move-block*send-output-command
       (state <s> ^operator <o>
                  ^io.output-link <out>)
       (<o> ^name move-block ^moving-block <b1> ^destination <b2>)
       (<b1> ^x-location <x1> ^y-location <y1>)
       (<b2> ^x-location <x2> ^y-location <y2>)
       -->
       (<out>
268. t <t2>})
     (operator <o> ^name group
                   ^by-attribute <a>
                   ^moving-block <t>
                   ^destination <t2>)
     (<t> ^type block ^<a> <x>)
     (<t2> ^type block ^<a> <x>)
     -->
     (<s> ^operator <o> >)}

This production tests that there is an acceptable operator that is trying to group blocks according to some attribute, <a>, and that blocks <t> and <t2> both have this attribute (whatever it is) and have the same value for the attribute.

Predicates in attributes

Predicates may be used with attributes, as in:

    sp {blocks*example-production-conditions
       (state ^operator <o> ^table <t>)
       (<t> ^<> type table)
       -->
       ... }

which tests that the object with its identifier bound to <t> must have an attribute whose value is table, but the name of this attribute is not type.

Disjunctions of attributes

Disjunctions may also be used with attributes, as in:

    sp {blocks*example-production-conditions
       (state ^operator <o> ^table <t>)
       (<t> ^<< type name >> table)
       -->
       ... }

which tests that the object with its identifier bound to <t> must have either an attribute type whose value is table or an attribute name whose value is table.

Conjunctive tests for attributes

Section 3.3.5.5 illustrated the use of conjunctions for the values in condi
269. t is cumbersome or impossible, then the user has two options: create a remote client that provides the functionality, or use load-library. load-library creates extensions in the local client, making it orders of magnitude faster than a remote client.

To create a loadable library, the library must contain the following function:

    #ifdef __cplusplus
    extern "C" {
    #endif

    EXPORT const char* sml_InitLibrary(Kernel* pKernel, int argc, char** argv) {
        // Your code here
    }

    #ifdef __cplusplus
    } // extern "C"
    #endif

This function is called when load-library loads your library. It is responsible for any initialization that you want to take place (e.g., registering custom RHS functions, registering for events, etc.).

The argc and argv arguments are intended to mirror the arguments that a standard SML client would get. Thus, the first argument is the name of the library, and the rest are whatever other arguments are provided. This is to make it easy to use the same codebase to create a loadable library or a standard remote SML client (e.g., when run as a standard client, just pass the arguments main gets into sml_InitLibrary).

The return value of sml_InitLibrary is for any error messages you want to return to the load-library call. If no error occurs, return a zero-length string.

An example library is provided in the Tools/TestExternalLibraryLib project. This example can also be compiled as a standard remote SML client. The Tools TestExternalLibraryE
270. t o gt 1 will call the user-defined MakeANote function with the argument x1. The return value of the function, if any, may be placed in working memory or passed to another RHS function. For example, the log of a number <x> could be printed this way:

    sp { ... --> (write |The log of | <x> | is | (exec log <x>)) }

where log is a registered user-defined function.

cmd: Used to call built-in Soar commands. Spaces are inserted between concatenated arguments. For example, the production

    sp { ... --> (write (cmd print --depth 2 <s>)) }

will have the effect of printing the object bound to <s> to depth 2.

3.3.6.12 Controlling chunking

Chunking is described in Chapter 4. The following two functions are provided as RHS actions to assist in the development of Soar programs; they are not intended to correspond to any theory of learning in Soar. This functionality is provided as a development tool, so that learning may be turned off in specific problem spaces, preventing otherwise buggy behavior.

The dont-learn and force-learn RHS actions are to be used with specific settings for the learn command (see page 159). Using the learn command, learning may be set to one of: on, off, except, or only. Learning must be set to except for the dont-learn RHS action to have any effect, and learning must be set to only for the force-learn RHS action to have any effect.

dont-learn: When learning is set to e
271. t output device

Synopsis: echo [--nonewline] [string]

Options:
    string            The string to print.
    -n, --nonewline   Suppress printing of the newline character.

Description: This command echoes the args to the current output stream. This is normally stdout but can be set to a variety of channels. If an arg is --nonewline, then no newline is printed at the end of the printed strings. Otherwise a newline is printed after printing all the given args. Echo is the easiest way to add user comments or identification strings in a log file.

Examples: This example will add these comments to the screen and any open log file.

    echo This is the first run with disks 12

See Also: clog

ls

List the contents of the current working directory.

Synopsis: ls

Default Aliases: dir (for ls)

Description: List the contents of the working directory.

See Also: cd, dirs, home, pushd, popd, source, pwd

8.5. FILE SYSTEM I/O COMMANDS 189

popd

Pop the current working directory off the stack and change to the next directory on the stack. Can be relative pathname or fully specified path.

Synopsis: popd

Description: This command pops a directory off of the directory stack and changes to it. See the dirs command for an explanation of the directory stack.

See Also: cd, dirs, home, ls, pushd, source, pwd

pushd

Push a directory onto the directory stack, cha
272. t selected operator
Set the top level directory containing demos/help, etc.
Controls the phase where agents stop when running by decision.
Control the behavior of semantic memory.
Prints information about the current release.
Load and evaluate the contents of a file.
Define a Soar production.
Seed the random number generator.
Print information on Soar's runtime statistics.
Pause Soar.
Use a default system clock timer to record the wall time required while executing a command.
Toggle on or off the internal timers used to profile Soar.
Undefine an existing alias.
Control detailed information printed as Soar runs.
Returns the version number of the Soar kernel.
Generate a wait state rather than a state-no-change impasse.
Enable or disable the printing of warning messages from the Soar kernel.
Print information about wmes matching a certain pattern as they are added and removed.
Control the run-time tracing of Soar.
Control the behavior of working memory activation.
273. t the second would lead to I-support.

In order for the architecture to determine whether a result receives I-support or O-support, Soar must first determine the function that the working memory element or preference plays (that is, whether the result should be considered an operator application or not). To do this, Soar creates a temporary production, called a justification. The justification summarizes the processing in the substate that led to the result:

The conditions of a justification are those working memory elements that exist in the superstate (and above) that were necessary for producing the result. This is determined by collecting all of the working memory elements tested by the productions that fired in the subgoal that led to the creation of the result, and then removing those conditions that test working memory elements created in the subgoal.

The action of the justification is the result of the subgoal.

Soar determines I-support or O-support for the justification just as it would for any other production, as described in Section 2.3.3. If the justification is an operator application, the result will receive O-support. Otherwise, the result gets I-support from the justification. If such a result loses I-support from the justification, it will be retracted if there is no other support.

Justifications are not added to production memory, but are otherwise treated as instantiated productions that have already fired. Justifications incl
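The rule for building a justification's conditions (collect every working memory element tested in the subgoal, then drop those created in the subgoal) can be sketched in Python. This is only an illustration of the set arithmetic, not Soar's backtracing code: the function name, the tuple representation of WMEs, and the sample data are all assumptions made for the example.

```python
def justification_conditions(tested_wmes, created_in_subgoal):
    """Keep the tested WMEs that were NOT created in the subgoal.

    tested_wmes: WMEs tested by the productions that fired in the subgoal
                 and led to the result (hypothetical list of tuples).
    created_in_subgoal: set of WMEs that were created inside the subgoal.
    Order is preserved and duplicates are dropped.
    """
    seen = set()
    conditions = []
    for wme in tested_wmes:
        if wme in created_in_subgoal or wme in seen:
            continue
        seen.add(wme)
        conditions.append(wme)
    return conditions


# Hypothetical data: S1 is the superstate, S2 the substate.
tested = [
    ("S1", "block", "B1"),
    ("S2", "scratch", "X1"),
    ("S1", "block", "B1"),
    ("S1", "table", "T1"),
]
local = {("S2", "scratch", "X1")}

print(justification_conditions(tested, local))
```

Only the two superstate elements survive, which matches the description above: the local element is removed because it was created in the subgoal.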
274. t with exactly the same augmentations but a different identifier, and the program will still reason about the object appropriately. Identifiers are internal markers for Soar; they can appear in working memory, but they never appear in a production.

There is no predefined relationship between objects in working memory and "real objects" in the outside world. Objects in working memory may refer to real objects, such as block A; features of an object, such as the color red or shape cube; a relation between objects, such as ontop; classes of objects, such as blocks; etc. The actual names of attributes and values have no meaning to the Soar architecture (aside from a few WMEs created by the architecture itself). For example, Soar doesn't care whether the things in the blocks world are called blocks or cubes or chandeliers. It is up to the Soar programmer to pick suitable labels and to use them consistently.

Footnote: In order to allow these links to have some substructure, the attribute name may be an identifier, which means that the attribute may itself have attributes and values, as specified by additional working memory elements.

2.3. PRODUCTION MEMORY: LONG-TERM KNOWLEDGE 15

The elements in working memory arise from one of four sources:

1. The actions of productions create most working memory elements.
2. The decision procedure automatically creates some special state augmentations (type, superstate, impasse, ...) whenever a
275. tate and chosen operator in decision cycle $t$.
• $Q(s_{t+1}, a_{t+1})$ is the Q-value of the state and chosen RL operator in the next decision cycle.
• $r_{t+1}$ is the total reward collected in the next decision cycle.
• $\alpha$ and $\gamma$ are the settings of the learning-rate and discount-rate parameters of the rl command, respectively.

5.3. UPDATING RL RULE VALUES 85

Note that since $\delta_t$ depends on $Q(s_{t+1}, a_{t+1})$, the update for the operator selected in decision cycle $t$ is not applied until the next RL operator is chosen. For Q-Learning, we have

$$\delta_t = \alpha \left[ r_{t+1} + \gamma \max_{a \in A_{t+1}} Q(s_{t+1}, a) - Q(s_t, a_t) \right]$$

where $A_{t+1}$ is the set of RL operators proposed in the next decision cycle.

Finally, $\delta_t$ is divided by the number of RL rules comprising the Q-value for the operator, and the numeric indifferent value for each RL rule is updated by that amount.

An example walkthrough of a Sarsa update with $\alpha = 0.3$ and $\gamma = 0.9$ (the default settings in Soar) follows.

1. In decision cycle $t$, an operator O1 is proposed, and RL rules rl-1 and rl-2 create the following numeric indifferent preferences for it:

    rl-1: (S1 ^operator O1 = 2.3)
    rl-2: (S1 ^operator O1 = -1)

   The Q-value for O1 is $Q(s_t, \mathrm{O1}) = 2.3 - 1 = 1.3$.

2. O1 is selected and executed, so $Q(s_t, a_t) = Q(s_t, \mathrm{O1}) = 1.3$.

3. In decision cycle $t+1$, a total reward of 1.0 is collected on the reward-link, an operator O2 is proposed, and another RL rule rl-3 creates the following numeric indifferent preference for it: rl-3
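The arithmetic of a Sarsa update, including the even split of the update across the contributing RL rules, can be sketched in Python. The learning rate 0.3, the discount rate 0.9, the preference values 2.3 and -1, and the reward of 1.0 come from the walkthrough above. The function name and the next-cycle Q-value of 0.5 are assumptions chosen for illustration, since the walkthrough is cut off before that value appears. The standard Sarsa form, delta = alpha * (r + gamma * Q_next - Q_current), is used.

```python
def sarsa_update(rule_values, reward, next_q, alpha=0.3, gamma=0.9):
    """Return the per-rule values after one Sarsa update.

    rule_values: numeric indifferent value contributed by each RL rule
                 for the operator selected in decision cycle t.
    reward:      total reward collected in decision cycle t+1.
    next_q:      Q-value of the RL operator chosen in decision cycle t+1.
    """
    q_current = sum(rule_values.values())      # Q(s_t, a_t)
    delta = alpha * (reward + gamma * next_q - q_current)
    share = delta / len(rule_values)           # split evenly across rules
    return {name: value + share for name, value in rule_values.items()}


rules = {"rl-1": 2.3, "rl-2": -1.0}
# next_q=0.5 is an assumed value for the sake of the example.
updated = sarsa_update(rules, reward=1.0, next_q=0.5)
for name, value in updated.items():
    print(f"{name}: {value:.4f}")
```

With these inputs the update is 0.3 * (1.0 + 0.9 * 0.5 - 1.3) = 0.045, so each of the two rules moves by 0.0225.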
276. tate does not exist. Unfortunately, it is computationally expensive to determine why a given working memory element does not exist. Chunking only includes negated tests if they test for the absence of superstate working memory elements.

Solution: To avoid using negated conditions for local data, the local data can be made a result by attaching it to the superstate. This increases the number of chunks learned, but a negated condition for the superstate can be used that leads to correct chunks.

Alternatively, Soar's learning mode can be set to reject chunks when the backtrace encounters a local negation by setting learn through local negations disable. There are many cases where local negations are safe to ignore, and hence this mode reduces performance, but it can substantially reduce the number of overgeneral chunks in big agents and aid in debugging.

4.6.3 Testing for the substate

Overgeneral chunks can be created if a result of a subgoal is dependent on the creation of an impasse within the substate. For example, processing in a subgoal may consist of exhaustively applying all the operators in the problem space. If so, then a convenient way to recognize that all operators have applied and processing is complete is to wait for a state no-change impasse to occur. When the impasse occurs, a production can test for the resulting substate and create a result for the original subgoal. This form of state test builds overgeneral chunks because no p
277. tation of working memory activation, consider reading Comprehensive Working Memory Activation in Soar (Nuxoll, A., Laird, J., James, M., ICCM 2004).

102 CHAPTER 7. EPISODIC MEMORY

The cue-based retrieval process can be thought of conceptually as a nearest-neighbor search. First, all candidate episodes, defined as episodes containing at least one leaf WME (a cue WME with no sub-structure) in at least one cue, are identified. Two quantities are calculated for each candidate episode, with respect to the supplied cue(s): the cardinality of the match (defined as the number of matching leaf WMEs) and the activation of the match (defined as the sum of the activation values of each matching leaf WME). Note that each of these values is negated when applied to a negative query. To compute each candidate episode's match score, these quantities are combined with respect to the balance parameter as follows:

$$\text{balance} \times \text{cardinality} + (1 - \text{balance}) \times \text{activation}$$

Performing a graph match on each candidate episode, with respect to the structure of the cue, could be very computationally expensive, so episodic memory implements a two-stage matching process. An episode with perfect cardinality is considered a perfect surface match and, per the graph-match parameter, is subjected to further structural matching. Whereas surface matching efficiently determines if all paths to leaf WMEs exist in a candidate episode, graph matching indicates whether or not the cue
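The match score formula above is a weighted blend of two quantities, which a few lines of Python make concrete. This is a sketch of the formula only: the function name, the negative_query flag, and the restriction of balance to the range 0 to 1 are assumptions made for the example.

```python
def match_score(cardinality, activation, balance, negative_query=False):
    """Blend match cardinality and match activation using `balance`.

    cardinality:    number of matching leaf WMEs.
    activation:     sum of the activation values of the matching leaf WMEs.
    balance:        weight on cardinality (assumed here to lie in [0, 1]).
    negative_query: when True, both quantities are negated, as the text
                    describes for negative queries.
    """
    if not 0.0 <= balance <= 1.0:
        raise ValueError("balance is assumed to lie in [0, 1]")
    sign = -1.0 if negative_query else 1.0
    return balance * (sign * cardinality) + (1.0 - balance) * (sign * activation)


print(match_score(3, 2.0, balance=0.5))   # equal weight on both quantities
print(match_score(3, 2.0, balance=1.0))   # cardinality only
```

With balance at 1 the score reduces to the cardinality alone, and with balance at 0 it reduces to the activation alone.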
278. tats <statistic>
    smem -t|--timers <timer>
    smem -a|--add <concept>
    smem -p|--print <lti> <depth>
    smem -v|--viz <lti> <depth>
    smem -i|--init
    smem -b|--backup <file name>

Options:
    -g, --get      Print current parameter setting
    -s, --set      Set parameter value
    -S, --stats    Print statistic summary or specific statistic
    -t, --timers   Print timer summary or specific statistic
    -a, --add      Add concepts to semantic memory
    -p, --print    Print semantic store in user-readable format
    -v, --viz      Print semantic store in graphviz format
    -i, --init     Reinitialize ALL memories
    -b, --backup   Creates a backup of the semantic database on disk

Description: The smem command changes the behavior of and displays information about semantic memory. The command watch --smem displays additional trace information for semantic memory not controlled by this command.

Parameters: Due to the large number of parameters, the smem command uses the --get|--set <parameter> <value> convention rather than individual switches for each parameter. Running smem without any switches displays a summary of the parameter settings.

8.4. CONFIGURING SOAR'S RUNTIME PARAMETERS 175

    Parameter         Description                                     Possible values                  Default
    activation-mode   Sets the ordering bias for retrievals that      recency, frequency, base-level   recency
                      match more than one memory
279. terface input and output to a file (was log).

    command-to-file        Dump the printed output and results of a command to a file.
    dirs                   List the directory stack.
    echo                   Print a string to the current output device.
    ls                     List the contents of the current working directory.
    popd                   Pop the current working directory off the stack and change to the next directory on the stack.
    pushd                  Push a directory onto the directory stack, changing to it.
    pwd                    Print the current working directory.
    rete-net               Save the current Rete net, or restore a previous one.
    set-library-location   Set the top level directory containing demos/help, etc.
    source                 Load and evaluate the contents of a file.

The source command is used for nearly every Soar program. The directory functions are important to understand so that users can navigate directories/folders to load/save the files of interest. Soar applications that include a graphical interface or other simulation environment will often require the use of echo.

cd

Change directory.

Synopsis: cd [directory]

Default Aliases: chdir (for cd)

Options:
    directory   The directory to change to, can be relative or full path.

Description: Change the current working directory. If run with no arguments, returns to the directory that the command line interface was started in, often referred to as the "home" directory.

Exa
280. terrupt.
    -s, --seconds   Interpret n as seconds (floating point OK).

Description: After the output phase, the elapsed decision cycle time is checked to see if it is greater than the old maximum, and the maximum dc time stat is updated (see stats). At this time, this threshold is also checked. If met or exceeded, Soar stops at the end of the current output phase with an interrupted state.

Examples:

    max-dc-time -s 0.05
    max-dc-time 4000

max-elaborations

Limit the maximum number of elaboration cycles in a given phase. Print a warning message if the limit is reached during a run.

Synopsis: max-elaborations [n]

Options:
    n   Maximum allowed elaboration cycles, must be a positive integer.

Description: This command sets and prints the maximum number of elaboration cycles allowed. If n is given, it must be a positive integer and is used to reset the number of allowed elaboration cycles. The default value is 100. max-elaborations with no arguments prints the current value.

max-elaborations controls the maximum number of elaborations allowed in a single decision cycle. The elaboration phase will end after max-elaboration cycles have completed, even if there are more productions eligible to fire or retract; and Soar will proceed to the next phase after a warning message is printed to notify the user. This limits the total number of cycles of parallel production fi
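The cap that max-elaborations places on an elaboration phase amounts to a bounded loop, sketched here in Python. This is a toy model rather than Soar's decision cycle: the function name is hypothetical, the amount of pending work is reduced to a single count of elaboration cycles, and the sketch assumes the warning is issued only when work remains once the limit is hit.

```python
def run_elaboration_phase(pending_cycles, max_elaborations=100):
    """Run elaboration cycles until quiescence or until the cap is reached.

    pending_cycles:   how many elaboration cycles the agent would need to
                      reach quiescence (a simplification for illustration).
    max_elaborations: the cap; 100 is the default stated in the text.
    Returns (cycles_run, warned), where warned is True if the phase was
    cut short while productions were still eligible to fire or retract.
    """
    cycles_run = 0
    while cycles_run < pending_cycles:
        if cycles_run >= max_elaborations:
            return cycles_run, True    # limit reached with work remaining
        cycles_run += 1                # one wave of parallel firings
    return cycles_run, False           # quiescence reached normally


print(run_elaboration_phase(5))
print(run_elaboration_phase(150))
```

In the second call the phase stops at the cap of 100 cycles and the flag records that a warning would be printed.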
281. test that an object has two values for attribute child, the variables in the following condition can match to the same value:

    (<p1> ^type father ^child <c1> <c2>)

To do tests for multi-valued attributes with variables correctly, conjunctive tests must be used, as in:

    (<p1> ^type father ^child <c1> {<> <c1> <c2>})

The conjunctive test {<> <c1> <c2>} ensures that <c2> will bind to a different value than <c1> binds to.

Negated conditions and multi-valued attributes

A negation can also precede an attribute with multiple values. In this case it tests for the absence of the conjunction of the values. For example,

    (<p1> ^name john -^child oprah uma)

is the same as

    (<p1> ^name john)
    -{(<p1> ^child oprah)
      (<p1> ^child uma)}

and the match is possible if either (<p1> ^child oprah) or (<p1> ^child uma) cannot be found in working memory with the binding for <p1> (but not if both are present).

3.3.5.9 Acceptable preferences for operators

The only preferences that can appear in working memory are acceptable preferences for operators, and therefore, the only preferences that may appear in the conditions of a production are acceptable preferences for operators.

Acceptable preferences for operators can be matched in a condition by testing for a "+" followin
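The difference between the plain two-variable test and the conjunctive test for a multi-valued attribute can be seen by enumerating the possible bindings. The Python sketch below does not model Soar's matcher; the function name and the list of child values are assumptions for illustration, with the values borrowed from the negation example above.

```python
from itertools import product


def child_bindings(children, require_distinct):
    """Enumerate (c1, c2) bindings for two variables on a multi-valued attribute.

    require_distinct=False models a condition with two plain variables,
    where both may bind to the same value.
    require_distinct=True models adding a "not equal to the first variable"
    test on the second variable.
    """
    return [
        (c1, c2)
        for c1, c2 in product(children, repeat=2)
        if not require_distinct or c1 != c2
    ]


children = ["oprah", "uma"]
print(child_bindings(children, require_distinct=False))
print(child_bindings(children, require_distinct=True))
```

Without the inequality test, the pairs in which both variables bind to the same child are included, so a parent with a single child would also match. With it, only pairs of different children remain.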
282. that contain a change in at least one of the cue WMEs. However, a cue that has no match and contains WMEs relevant to all episodes will force inspection of all episodes. Thus, worst-case performance will be linear in the number of episodes.

7.4.1 Performance Tweaking

When using a database stored to disk, several parameters become crucial to performance. The first is commit, which controls the number of episodes that occur between writes to disk. If the total number of episodes (or a range) is known ahead of time, setting this value to a greater number will result in greatest performance (due to decreased I/O).

The next two parameters deal with the SQLite cache, which is a memory store used to speed operations like queries by keeping in memory structures like levels of index B+-trees. The first parameter, page-size, indicates the size, in bytes, of each cache page. The second parameter, cache-size, suggests to SQLite how many pages are available for the cache. Total cache size is the product of these two parameter settings. The cache memory is not pre-allocated, so short/small runs will not necessarily make use of this space. Generally speaking, a greater number of cache pages will benefit query time, as SQLite can keep necessary meta-data in memory. However, some documented situations have shown improved performance from decreasing cache pages to increase memory locality. This is of greater concern when dealing with file-based databases, versus in me
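Since the total cache size is the product of the page size and the number of cache pages, a short calculation shows how the two settings trade off. The sketch below is plain arithmetic in Python; the function name is hypothetical, and the figures of 8192 bytes per page and 10000 pages are example inputs, not recommended settings.

```python
def total_cache_bytes(page_size_bytes, cache_pages):
    """Total cache size: bytes per page times number of cache pages."""
    if page_size_bytes <= 0 or cache_pages <= 0:
        raise ValueError("both settings must be positive")
    return page_size_bytes * cache_pages


# Example inputs only: 8192-byte pages and 10000 cache pages.
size = total_cache_bytes(8192, 10000)
print(f"{size} bytes (about {size / 2**20:.1f} MiB)")
```

Doubling either setting doubles the upper bound on cache memory, although, as noted above, the cache is not pre-allocated.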
283. the pattern argument.
    -R, --reset-filter   Delete all filters of this type. Does not use pattern arg.
    -t, --type           Follow with a type of wme filter, see below.

Pattern

The pattern is an id-attribute-value triplet:

    id attribute value

Note that * can be used in place of the id, attribute or value as a wildcard that matches any string. Note that braces are not used anymore.

Types

When using the -t flag, it must be followed by one of the following:

    adds      Print info when a wme is added
    removes   Print info when a wme is retracted
    both      Print info when a wme is added or retracted

When issuing a -R or -l, the -t flag is optional. Its absence is equivalent to -t both.

Description: This command allows users to improve state tracing by issuing filter options that are applied when watching wmes. Users can selectively define which object-attribute-value triplets are monitored and whether they are monitored for addition, removal or both, as they go in and out of working memory.

Note: The functionality of watch-wmes resided in the watch command prior to Soar 8.6.

Examples

Users can watch an attribute of a particular object (as long as that object already exists):

    soar> watch-wmes --add-filter -t both D1 speed *

or print WMEs that retract in a specific state (provided the state already exists):

    soar> watch-wmes --add-filter -t removes S3 * *

or watch any rel
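The way an id-attribute-value pattern with wildcards selects WMEs can be sketched in Python. This illustrates the matching idea only and is not the filter code in Soar: the function names and the sample WMEs are hypothetical, and the sketch assumes the wildcard is written as an asterisk and matches any whole field.

```python
WILDCARD = "*"  # assumed wildcard symbol


def matches(pattern, wme):
    """True if every field of the (id, attribute, value) pattern is either
    the wildcard or equal to the corresponding field of the WME."""
    return all(p == WILDCARD or p == field for p, field in zip(pattern, wme))


def filter_wmes(pattern, wmes):
    """Return the WMEs selected by the pattern, in their original order."""
    return [wme for wme in wmes if matches(pattern, wme)]


# Hypothetical working memory contents.
wmes = [
    ("D1", "speed", "10"),
    ("D1", "heading", "north"),
    ("S3", "operator", "O5"),
]

print(filter_wmes(("D1", "speed", "*"), wmes))
print(filter_wmes(("S3", "*", "*"), wmes))
```

A pattern made entirely of wildcards selects every WME, while a pattern with no wildcards selects at most one.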
284. the following production will be added to production memory:

    sp {rl*sample*rule*template*1
       (state <s> ^operator <o> + ^value 3.2)
       -->
       (<s> ^operator <o> = 3.2)}

The variable <v> is replaced by 3.2 on both the LHS and the RHS, but <s> and <o> are not replaced because they match against identifiers (S1 and O1). As with other RL rules, the value of 3.2 on the RHS of this rule may be updated later by reinforcement learning, whereas the value of 3.2 on the LHS will remain unchanged. If <v> had matched against a non-numeric constant, it would have been replaced by that constant on the LHS, but the RHS numeric indifferent preference value would have been set to zero to make the new rule valid.

The new production's name adheres to the following pattern: rl*template-name*id, where template-name is the name of the originating rule template and id is a monotonically increasing integer that guarantees the uniqueness of the name.

If an identical production already exists in production memory, then the newly generated production is discarded. It should be noted that the current process of identifying unique template match instances can become quite expensive in long agent runs. Therefore, it is recommended to generate all necessary RL rules using the gp command or via custom scripting when possible.

5.4.3 Chunking

Since RL rules are regular productions, they can be learned by chunking just like any other produ
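The naming pattern for productions generated from a template, a fixed prefix, the template name, and an increasing integer, can be sketched in Python. The class and method names are hypothetical, and keeping a separate counter per template that starts at 1 is an assumption of the sketch; the text only states that the id increases monotonically and makes the name unique.

```python
from itertools import count


class TemplateNamer:
    """Generate names of the form rl*<template-name>*<id>."""

    def __init__(self):
        # One counter per template name (an assumption of this sketch).
        self._counters = {}

    def next_name(self, template_name):
        counter = self._counters.setdefault(template_name, count(1))
        return f"rl*{template_name}*{next(counter)}"


namer = TemplateNamer()
print(namer.next_name("sample*rule*template"))
print(namer.next_name("sample*rule*template"))
```

The first generated name matches the production shown above, and each later instantiation of the same template receives the next integer.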
285. the parameter at 8192 bytes.

The next parameter is optimization. The safety parameter setting will use SQLite default settings. If data integrity is of importance, this setting is ideal. The performance setting will make use of lesser data consistency guarantees for significantly greater performance. First, writes are no longer synchronous with the OS (synchronous pragma), thus episodic memory won't wait for writes to complete before continuing execution. Second, transaction journaling is turned off (journal_mode pragma), thus groups of modifications to the episodic store are not atomic (and thus interruptions due to application/OS/hardware failure could lead to inconsistent database state). Finally, upon initialization, episodic memory maintains a continuous exclusive lock to the database (locking_mode pragma), thus other applications/agents cannot make simultaneous read/write calls to the database (thereby reducing the need for potentially expensive system calls to secure/release file locks).

[Figure 7.1: Example episodic memory cache setting data. Plot title: EpMem Storage Time (10M Decisions); x-axis: Page Size (1k, 2k, 4k, 8k, 16k, 32k); series: Maximum, Average.]

Finally, maintaining accurate operation timers can be relat
286. the timetags are not represented in working memory and cannot be matched by productions. The timetags are used to distinguish between multiple occurrences of the same WME. As preferences change and elements are added and deleted from working memory, it is possible for a WME to be created, removed, and created again. The second creation of the WME, which bears the same identifier, attribute, and value as the first WME, is different, and therefore is assigned a different timetag. This is important because a production will fire only once for a given instantiation, and the instantiation is determined by the timetags that match the production and not by the identifier-attribute-value triples.

To look at the timetags of WMEs, the wmes command can be used:

    soar> wmes s1
    (3: S1 ^io I1)
    (10: S1 ^ontop O2)
    (9: S1 ^ontop O3)
    (11: S1 ^ontop O1)
    (4: S1 ^problem-space blocks)
    (2: S1 ^superstate nil)
    (6: S1 ^thing B3)
    (5: S1 ^thing T1)
    (8: S1 ^thing B1)
    (7: S1 ^thing B2)
    (1: S1 ^type state)

This shows all the individual augmentations of S1; each is preceded by an integer timetag.

3.1.4 Acceptable preferences in working memory

The acceptable preferences for the operator augmentations of states appear in working memory as identifier-attribute-value-preference quadruples. No other preferences appear in working memory. A template for an acceptable preference in working m
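The role of timetags, telling apart two occurrences of a WME with the same identifier, attribute, and value, can be sketched with a small Python class. This is a toy model and not Soar's working memory: the class and method names are hypothetical, and timetags are assumed to be handed out from a simple increasing counter.

```python
class ToyWorkingMemory:
    """Toy working memory that gives every added WME a fresh timetag."""

    def __init__(self):
        self._next_timetag = 1
        self.wmes = {}  # timetag -> (identifier, attribute, value)

    def add(self, identifier, attribute, value):
        timetag = self._next_timetag
        self._next_timetag += 1
        self.wmes[timetag] = (identifier, attribute, value)
        return timetag

    def remove(self, timetag):
        del self.wmes[timetag]


wm = ToyWorkingMemory()
first = wm.add("S1", "ontop", "O1")
wm.remove(first)
second = wm.add("S1", "ontop", "O1")   # same triple, created again

print(first, second, wm.wmes[second])
```

The triple is identical both times, but the timetag differs, so an instantiation keyed on timetags counts as new and the production can fire again.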
287. thout terminating the existing impasse.

item-count: For multi-choice and constraint-failure impasses, this contains the number of values listed under the item augmentation above.

non-numeric: For tie impasses, this contains all operators that do not have numeric indifferent preferences associated with them. If the set of items that tie changes during the impasse, the architecture removes or adds the appropriate non-numeric augmentations without terminating the existing impasse.

non-numeric-count: For tie impasses, this contains the number of operators listed under the non-numeric augmentation above.

quiescence: States are the only objects with quiescence t, which is an explicit statement that quiescence (exhaustion of the elaboration cycle) was reached in the superstate. If problem solving in the subgoal is contingent on quiescence having been reached, the substate should test this flag. The side effect is that no chunk will be built if it depended on that test. See Section 4.1 on page 73 for details. This attribute can be ignored when learning is turned off.

Knowing the names of these architecturally defined attributes and their possible values will help you to write productions that test for the presence of specific types of impasses so that you can attempt to resolve the impasse in a manner appropriate to your program. Many of the default productions in the demos/defaults directory of the Soar distribution provide means for resolving cert
288. tifier mappings; this map represents the first unified mapping (with respect to episodic memory algorithms).
7.4 Performance
There are currently two sources of unbounded computation: graph matching and cue-based queries. Graph matching is combinatorial in the worst case. Thus, if an episode presents a perfect surface match but an imperfect structural match (i.e., there is no way to unify the cue with the candidate episode), there is the potential for exhaustive search. Each identifier in the cue can be assigned one of any historically consistent identifiers (with respect to the sequence of attributes that leads to the identifier from the root), termed a literal. If the identifier is a multi-valued attribute, there will be more than one candidate literal, and this situation can lead to a very expensive search process. Currently there are no heuristics in place to combat the expensive backtracking. Worst-case performance will be combinatorial in the total number of literals for each cue identifier (with respect to cue structure). The cue-based query algorithm begins with the most recent candidate episode and stops searching as soon as a match is found (since this episode must be the most recent). Given this procedure, it is trivial to create a two-WME cue that forces a linear search of the episodic store. Episodic memory combats linear scan by only searching candidate episodes, i.e., only those
289. tions. Conjunctive tests may also be used with attributes, as in:
  sp {blocks*example-production-conditions
     (state ^operator <o> ^table <t>)
     (<t> ^{<ta> <> name} table)
  -->
     ... }
which tests that the object with its identifier bound to <t> must have an attribute whose value is table, that the name of this attribute is not name, and that the name of this attribute (whatever it is) is bound to the variable <ta>. When attribute predicates or attribute disjunctions are used with multi-valued attributes, the production is rewritten internally to use a conjunctive test for the attribute; the conjunctive test includes a variable used to bind to the attribute name. Thus,
  (<p1> ^type father ^ <> name sue sally)
is interpreted to mean:
  (<p1> ^type father ^{<> name <a*1>} sue ^<a*1> sally)
3.3.5.11 Attribute path notation
Often, variables appear in the conditions of productions only to link the value of one attribute with the identifier of another attribute. Attribute path notation provides a shorthand so that these intermediate variables do not need to be included. Syntactically, path notation lists a sequence of attributes separated by dots (.) after the ^ in a condition. For example, using attribute path notation, the production:
  sp {blocks-world*monitor*move-block
     (state <s> ^operator <o>)
     (<o> ^name move-block ^moving-block
290. to a state in working memory, so must the objects referred to in a production's conditions. That is, one condition must test a state object, and all other conditions must test that same state or objects that are linked to that state.
2.3.2 Architectural roles of productions
Soar productions can fulfill four different roles: the three knowledge-retrieval problem-solving functions and the state elaboration function, all described on page 6:
  1. Operator proposal
  2. Operator comparison
     (Operator selection is not an act of knowledge retrieval.)
  3. Operator application
  4. State elaboration
A single production should not fulfill more than one of these roles (except for proposing an operator and creating an absolute preference for it). Although productions are not declared to be of one type or the other, Soar examines the structure of each production and classifies the rules automatically based on whether they propose and compare operators, apply operators, or elaborate the state.
2.3.3 Production Actions and Persistence
Generally, actions of a production either create preferences for operator selection or create/remove working memory elements. For operator proposal and comparison, a production creates preferences for operator selection. These preferences should persist only as long as the production instantiation that created them continues to match. When the production instantiation no long
291. tor with a numeric-indifferent preference will not force a tie impasse. When a set of operators is determined to be indifferent based on all other asserted preference types, and at least one operator has a numeric-indifferent preference, the decision mechanism will choose an operator based on their numeric-indifferent values and the exploration policy. The available exploration policies and how they calculate selection probability are detailed in the documentation for the indifferent-selection command on page 157. When a single operator is given multiple numeric-indifferent preferences, they are either averaged or summed into a single value, based on the setting of the numeric-indifferent-mode command (see page 167). Numeric-indifferent preferences that are created by RL rules can be adjusted by the reinforcement learning mechanism. In this way, it is possible for an agent to begin a task with only arbitrarily initialized numeric-indifferent preferences and, with experience, learn to make the optimal decisions. See Chapter 5 for more information.
Require: A require preference states that the value must be selected if the goal is to be achieved.
Prohibit: A prohibit preference states that the value cannot be selected if the goal is to be achieved. If a value has a prohibit preference, it will not be selected for a value of an augmentation, independent of the other preferences.
If there is an
292. tracing information when a chunk or justification is created
  -e, --epmem                   remove (optional)   Print episodic retrieval traces and IDs of newly encoded episodes
  -i, --indifferent-selection   remove (optional)   Print scores for tied operators in random indifferent selection mode
  -R, --rl                      remove (optional)   Print RL debugging output
  -s, --smem                    remove (optional)   Print log of semantic memory storage events
Description
The watch command controls the amount of information that is printed out as Soar runs. The basic functionality of this command is to trace various levels of information about Soar's internal workings. The higher the level, the more information is printed as Soar runs. At the lowest setting, 0 (none), nothing is printed. The levels are cumulative, so that each successive level prints the information from the previous level as well as some additional information. The default setting for the level is 1 (decisions).
The numerical arguments inclusively turn on all levels up to the number specified. To use numerical arguments to turn off a level, specify a number which is less than the level to be turned off. For instance, to turn off watching of productions, specify level 2 (or 1 or 0). Numerical arguments are provided for shorthand convenience. For more detailed control over the watch settings, the named arguments should be used.
With no arguments, this command prints information about the cur
293. ts of the subgoal. Soar computes a chunk's conditions based on the productions that fire in the subgoal, beginning with the results of the subgoal and then backtracing through the productions that created each result. It recursively backtraces through the working memory elements that matched the conditions of the productions, finding the actions that led to the WME's creation, etc., until conditions are found that test elements that are linked to a superstate.
(Footnote 1: For some tasks, bottom-up chunking facilitates modelling power-law speedups, although its long-term theoretical status is problematic.)
4.2.1 Determining a chunk's actions
A chunk's actions are built from the results of a subgoal. A result is any working memory element created in the substate that is linked to a superstate. A working memory element is linked if its identifier is either the value of a superstate WME or the value of an augmentation for an object that is linked to a superstate.
The results produced by a single production firing are the basis for creating the actions of a chunk. A new result can lead to other results by linking a superstate to a WME in the substate. This WME may in turn link other WMEs in the substate to the superstate, making them results. Therefore, the creation of a single WME that is linked to a superstate can lead to the creation of a large number of results. All of the newly created results become t
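The definition of a result in excerpt 293 is a reachability rule, and can be sketched as a fixed-point computation. This is a minimal illustration under stated assumptions, not Soar's algorithm: WMEs are plain `(identifier, attribute, value)` tuples, the function name `find_results` is hypothetical, and the identifiers of superstate WMEs are themselves treated as linked (an assumption added so that augmentations placed directly on the superstate count as results).

```python
def find_results(superstate_wmes, substate_wmes):
    """Return the substate-created WMEs that are linked to the superstate."""
    # Assumption: both the identifiers and the values of superstate WMEs count as linked.
    linked = set()
    for identifier, _attribute, value in superstate_wmes:
        linked.add(identifier)
        linked.add(value)

    results = []
    remaining = list(substate_wmes)
    changed = True
    while changed:  # repeat until no new WME becomes linked
        changed = False
        for wme in list(remaining):
            identifier, _attribute, value = wme
            if identifier in linked:
                results.append(wme)
                remaining.remove(wme)
                linked.add(value)  # the value may link further substate WMEs
                changed = True
    return results


superstate = [("S1", "block", "B1")]
substate = [
    ("S2", "superstate", "S1"),  # local to the substate
    ("S2", "name", "sub"),       # local to the substate
    ("B1", "clear", "yes"),      # B1 is the value of a superstate WME
    ("B1", "above", "X1"),       # links X1 to the superstate
    ("X1", "color", "red"),      # becomes a result through X1
    ("X2", "color", "blue"),     # X2 is never linked
]
results = find_results(superstate, substate)
```

In this toy run a single link from B1 to X1 pulls the X1 augmentation into the result set, which mirrors the remark that one linked WME can lead to a large number of results.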
294. ts the production, while the other set delimits the conditions to be conjunctively negated. If only the last condition, (<bo> ^type table), were negated, the production would match only if the state had an ontop relation, and the ontop relation had a bottom object, but the bottom object wasn't a table. Using the negated conjunction, the production will also match when the state has no ontop augmentation or when it has an ontop augmentation that doesn't have a bottom object augmentation. The semantics of negated conjunctions can be thought of in terms of mathematical logic, where the negation of $(A \land B \land C)$, written $\neg(A \land B \land C)$, can be rewritten as $(\neg A) \lor (\neg B) \lor (\neg C)$. That is, "not (A and B and C)" becomes "(not A) or (not B) or (not C)".
3.3.5.8 Multi-valued attributes
An object in working memory may have multiple augmentations that specify the same attribute with different values; these are called multi-valued attributes, or multi-attributes for short. To shorten the specification of a condition, tests for multi-valued attributes can be written so that the value tests are together. For example, the condition:
  (<p1> ^type father ^child sally ^child sue)
could also be written as:
  (<p1> ^type father ^child sally sue)
Multi-valued attributes and variables
When variables are used with multi-valued attributes, remember that variable bindings are not unique unless explicitly forced to be so. For example, to
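The logical equivalence quoted for negated conjunctions in excerpt 294 can be checked exhaustively. The sketch below is only a truth-table check in Python; the function names are hypothetical, and A, B, and C stand for the three conditions in the example (the state has an ontop relation, the relation has a bottom object, the bottom object is a table).

```python
from itertools import product


def negated_conjunction(a, b, c):
    """not (A and B and C): how a negated conjunction matches."""
    return not (a and b and c)


def disjunction_of_negations(a, b, c):
    """(not A) or (not B) or (not C): the rewritten form."""
    return (not a) or (not b) or (not c)


# Compare both forms on all eight truth assignments.
all_equal = all(
    negated_conjunction(a, b, c) == disjunction_of_negations(a, b, c)
    for a, b, c in product([False, True], repeat=3)
)

# Negating only the last condition is stricter: it still requires A and B.
only_last_negated = lambda a, b, c: a and b and (not c)
```

With no ontop relation at all (A false), `negated_conjunction` is true while `only_last_negated` is false, which is the difference the excerpt describes.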
295. ttribute 67 justification 27 creation 27 overgeneral 27 learn 73 159 learning 30 73 overgeneral 27 LHS of production 41 link 14 36 linked chunk action 75 Linux 4 load library 201 Is 188 Macintosh 4 make constant symbol 63 matcher 77 matches 124 max chunks 161 max dc time 162 max elaborations 163 max goal depth 164 max memory usage 164 INDEX max nil output cycles 165 memories 126 motor commands see I O multi attribute see multi valued attribute multi attributes 166 multi valued attribute 14 47 77 necessity preference 78 negated conditions 45 79 conjunctions 46 negated conditions 75 no change impasse 24 67 not equal test 43 numeric comparisons 43 numeric indifferent mode 167 O support 18 of result 27 reject 18 o support 215 o support mode 167 object 36 Operating System 4 operator application 11 comparison 10 proposal 10 representation 8 selection 10 support 18 operator no change impasse 24 ordering chunk conditions 77 overgeneral chunk 76 78 path notation 50 persistence 18 19 215 Personal Computer 4 popd 189 port 203 predicates 43 predict 168 preference 19 36 57 acceptable 19 21 36 219 acceptable as condition 48 best 20 219 better 19 219 indifferent 20 numeric indifferent 20 81 persistence see persistence 233 prohibit 21 76 78 219 reject 19 219 require 20 21 76 78 217 semantics
296. tual contents of working memory, and most users needn't use this at all. Note that multi-attributes declarations must be made before productions are loaded into Soar, or this command will have no effect.
Examples
Declare the symbol "thing" to be an attribute likely to take more than 1 but no more than 4 values:
  multi-attributes thing 4

numeric-indifferent-mode
Select method for combining numeric preferences.
Synopsis
  numeric-indifferent-mode [-as]
Options
  -a, --avg, --average   Use average mode
  -s, --sum              Use sum mode (default)
Description
The numeric-indifferent-mode command sets how multiple numeric-indifferent preference values given to an operator are combined into a single value for use in random selection. The default procedure is sum, which sums all numeric-indifferent preference values given to the operator, defaulting to 0 if none exist. The alternative avg mode will average the values, also defaulting to 0 if none exist.
See Also
  rl, indifferent-selection

o-support-mode
Choose experimental variations of o-support.
Synopsis
  o-support-mode [n]
Options
  3   Mode 3 is the same as mode 2, except that operator elaborations (adding attributes to operators) now get i-support, even though you have to test the operator to elaborate an operator. In cases where the rule mixes support types, s
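The two combination modes described for numeric-indifferent-mode in excerpt 296 amount to a sum or a mean with a default of 0. The following Python sketch shows that arithmetic only; the function name `combine_numeric_indifferent` and the mode strings are assumptions made for illustration and are not part of Soar's interface.

```python
def combine_numeric_indifferent(values, mode="sum"):
    """Combine an operator's numeric-indifferent preference values into one number.

    mode "sum" adds the values, mode "avg" averages them;
    both yield 0.0 when the operator has no such preferences.
    """
    if not values:
        return 0.0
    if mode == "sum":
        return float(sum(values))
    if mode == "avg":
        return sum(values) / len(values)
    raise ValueError(f"unknown mode: {mode!r}")


preferences = [0.5, 1.5, 1.0]  # hypothetical values for one operator
summed = combine_numeric_indifferent(preferences, "sum")
averaged = combine_numeric_indifferent(preferences, "avg")
```

The choice matters when operators receive different numbers of preferences: summing favors operators with many contributing rules, while averaging does not.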
297. tured notation 54 variable 214 action side 56 variables 42 variablization 77 verbose 141 version 205 waitsnc 179 warnings 141 watch 142 watch wmes 147 Windows 4 wma 180 WME see working memory element working memory 13 13 acceptable preference 36 object 14 syntax 33 trace 75 working memory element 13 syntax 33 timetag see timetag worse preference 219 worst preference 219 write 60

Summary of Soar Aliases and Functions
Predefined Aliases
There are a number of Soar commands that are shorthand for other Soar commands:
  Alias               Command                  Page
  ?                   help                     114
  a                   alias                    198
  aw                  add-wme                  194
  chdir               cd                       184
  d                   run -d 1                 115
  dir                 ls                       188
  e                   run -e 1                 115
  eb                  explain-backtraces       155
  ex                  excise                   111
  fc                  firing-counts            135
  gds_print           gds-print                122
  h                   help                     114
  inds                indifferent-selection    157
  init                init-soar                115
  interrupt           stop-soar                119
  is                  init-soar                115
  l                   learn                    159
  man                 help                     114
  p                   print                    129
  pc                  print --chunks           129
  pr                  preferences              127
  pw                  pwatch                   137
  rn                  rete-net                 190
  rw                  remove-wme               196
  set-default-depth   default-wme-depth        121
  sn                  soarnews                 204
  ss                  stop-soar                119
  st                  stats                    138
  step                run 1                    115
  stop                stop-soar                119
  topd                pwd                      189
  un                  alias -d                 198
  unalias             alias -d                 198
  w                   watch                    142
  wmes                print -i                 129

Summary of Soar Functions
The following table lists the commands in Soar. See the referenced page number for a complete description of each command.
  Command   Summary   Page
  add-wme   Manually add an ele
298. ude any negated conditions that were in the original productions that participated in producing the results and that test for the absence of superstate working memory elements. Negated conditions that test for the absence of working memory elements that are local to the substate are not included, which can lead to overgeneralization in the justification (see Section 4.6 on page 78 for details).
2.6.4 Removal of Substates: Impasse Resolution
Problem solving in substates is an important part of what Soar does, and an operator impasse does not necessarily indicate a problem in the Soar program. Impasses are a way to decompose a complex problem into smaller parts, and they provide a context for a program to deliberate about which operator to select. Operator impasses are necessary, for example, for Soar to do any learning about problem solving (as will be discussed in Chapter 4). This section describes how impasses may be resolved during the execution of a Soar program, how they may be eliminated during execution without being resolved, and some tips on how to modify a Soar program to prevent a specific impasse from occurring in the first place.
Resolving Impasses
An impasse is resolved when processing in a subgoal creates results that lead to the selection of a new operator for the state where the impasse arose. When an operator impasse is resolved, Soar has an opportunity to learn, and the substate and all its substru
299. ue bound to variable <x>. Second, the value must be greater than the value bound to variable <y>. Third, the value must be equal to 1, 2, 3, or 4. Finally, the value should be bound to variable <z>. In Figure 3.2, a conjunctive test is used for the thing attribute in the first condition.
3.3.5.6 Negated conditions
In addition to the positive tests for elements in working memory, conditions can also test for the absence of patterns. A negated condition will be matched only if there does not exist a working memory element consistent with its tests and variable bindings. Thus, it is a test for the absence of a working memory element. Syntactically, a negated condition is specified by preceding a condition with a dash (i.e., "-"). For example, the following condition tests the absence of a working memory element of the object bound to <p1> ^type father:
  -(<p1> ^type father)
A negation can be used within an object with many attribute-value pairs by having it precede a specific attribute:
  (<p1> ^name john -^type father ^spouse <p2>)
In that example, the condition would match if there is a working memory element that matches (<p1> ^name john) and another that matches (<p1> ^spouse <p2>), but there is no working memory element that matches (<p1> ^type father) (when p1 is bound to the same identifier). On the other hand, the cond
300. umeric prefer ences Choose experimental variations of o support Pop the current working directory off the stack and change to the next directory on the stack Can be relative pathname or fully specified path Returns the port the kernel instance is listening on Predict the next selected operator Examine details about the preferences that sup port the specified identifier and attribute Print items in working memory or production memory Find productions by condition or action pat terns Push a directory onto the directory stack chang ing to it Trace firings and retractions of specific produc tions Print the current working directory Generate a random number Manually remove an element from working memory Load input wmes for each decision cycle from a file Save the current Rete net or restore a previous one 188 124 161 162 163 164 164 165 126 166 167 167 189 203 168 127 129 132 189 137 189 203 196 197 190 237 238 rl run save backtraces select set library location set stop phase smem soarnews source sp srand stats stop soar time timers unalias verbose version waitsnc warnings watch wmes watch wma Control how numeric indifferent preference val ues in RL rules are updated via reinforcement learning Begin Soar s execution cycle Save trace information to explain chunks and justifications Force the nex
301. upport defaults to o-support, and a warning is printed.
  4   Mode 4 is the default. It is the same as mode 3, except where a rule mixes support types, support defaults to i-support, and a warning is still printed.
Description
The o-support-mode command is used to control the way that o-support is determined for preferences. Only o-support modes 3 and 4 are valid; other modes require Soar 7, which is no longer supported. O-support mode 4 should be considered an improved version of mode 3. The default o-support mode is mode 4.
In o-support modes 3 and 4, support is given production by production; that is, all preferences generated by the RHS of a single instantiated production will have the same support. The difference between the two modes is in how they handle productions with both operator and non-operator augmentations on the RHS. For more information on o-support calculations, see the relevant appendix in the Soar manual.
Running o-support-mode with no arguments prints out the current o-support mode.

predict
Predict the next selected operator.
Synopsis
  predict
Description
The predict command determines, based upon current operator proposals, which operator will be chosen during the next decision phase. If predict determines an operator tie will be encountered, "tie" is returned. If predict determines no operator will be selected (state no-change), "none" is returned. If predict determines a conflict wil
302. use for new chunks.
Synopsis
  chunk-name-format [-sl] -p [prefix]
  chunk-name-format [-sl] -c [count]
Options
  -s, --short        Use the short format for naming chunks
  -l, --long         Use the long format for naming chunks (default)
  -p, --prefix [p]   If p is given, use p as the prefix for naming chunks. Otherwise, return the current prefix (defaults to "chunk").
  -c, --count [c]    If c is given, set the chunk counter for naming chunks to c. Otherwise, return the current value of the chunk counter.
Description
The short format for naming newly created chunks is:
  <prefix><chunknum>
The long (default) format for naming chunks is:
  <prefix>-<chunknum>*d<dc>*<impassetype>*<dcChunknum>
where:
  • prefix is a user-definable prefix string; prefix defaults to "chunk" when unspecified by the user. It may not contain the character *.
  • chunknum is a counter set by --count, or starting at 1 for the first chunk created.
  • dc is the number of the decision cycle in which the chunk was formed.
  • impassetype is one of tie, conflict, cfailure, snochange, opnochange.
  • dcChunknum is the number of the chunk within that specific decision cycle.

firing-counts
Print the number of times each production has fired.
Synopsis
  firing-counts [n]
  firing-counts production_name
Default Aliases
  fc   firing-counts
303. ut does not terminate the run. The run may be continued by issuing a run command from the user interface. The interrupt RHS function has the same effect as typing stop-soar at the prompt, except that there is more control because it takes effect exactly at the end of the phase that fires the production.
  sp {
     ...
  -->
     (interrupt) }
Soar execution may also be stopped immediately before a production fires, using the :interrupt directive. This functionality is called a matchtime interrupt and is very useful for debugging. See Section 8.1 on page 117 for more information.
  sp {production*name
     :interrupt
     ...
  -->
     ... }
3.3.6.8 Text input and output
The function write is provided as a production action to do simple output of text in Soar. Soar applications that do extensive input and output of text should use Soar Markup Language (SML). To learn about SML, read the "SML Quick Start Guide", which should be located in the "Documentation" folder of your Soar install.
write: This function writes its arguments to the standard output. It does not automatically insert blanks, linefeeds, or carriage returns. For example, if <o> is bound to 4, then
  sp {
     ...
  -->
     (write <o> <o> <o> | x| <o> | | <o>) }
prints
  444 x4 4
crlf: Short for "carriage return, line feed", this function can be called only within write. It forces a new line at its position in the write a
304. ws
Default Aliases
  sn   soarnews

time
Use a default system clock timer to record the wall time required while executing a command.
Synopsis
  time command [arguments]
Options
  command     The command to execute
  arguments   Optional command arguments

unalias
Undefine an existing alias.
Synopsis
  unalias name
Default Aliases
  un   unalias
Description
This command undefines a previously created alias. This command takes exactly one argument: the name of the alias to remove. Use the alias command by itself to list all defined aliases.
Examples
  unalias varprint
See Also
  alias

version
Returns the version number of the Soar kernel.
Synopsis
  version
Description
This command gives version information about the current Soar kernel. It returns the version number and build date, which can then be stored by the agent or the application.

Appendix A The Blocks-World Program
File: blocks.soar
Original author(s): John E. Laird <laird@eecs.umich.edu>
Organization: University of Michigan AI Lab
Created on: 15 Mar 1995, 13:53:46
Last Modified By: Clare Bates Congdon <congdon@eecs.umich.edu>
Last Modifie
305. xample:
  (S1 ^reward-link R1)
  (R1 ^reward R2)
  (R2 ^value 1.0)
  (R2 ^source environment)
  (R1 ^reward R3)
  (R3 ^value 0.2)
  (R3 ^source intrinsic)
  (R3 ^duration 5)
The (R2 ^source environment), (R3 ^source intrinsic), and (R3 ^duration 5) WMEs are arbitrary and ignored by RL, but were added by the agent to keep track of where the rewards came from and for how long.
Note that the reward-link is not part of the io structure and is not modified directly by the environment. Reward information from the environment should be copied, via rules, from the input-link to the reward-link. Also note that when collecting rewards, Soar simply scans the reward-link and sums the values of all valid reward WMEs. The WMEs are not modified, and no bookkeeping is done to keep track of previously seen WMEs. This means that reward WMEs that exist for multiple decision cycles will be collected multiple times.
5.3 Updating RL Rule Values
Soar's RL mechanism is integrated naturally with the decision cycle and performs online updates of RL rules. Whenever an RL operator is selected, the values of the corresponding RL rules will be updated. The update can be on-policy (Sarsa) or off-policy (Q-Learning), as controlled by the learning-policy parameter of the rl command. Let $\delta_t$ be the amount the Q-value of an RL operator changes in an update. For Sarsa, we have
  $$\delta_t = \alpha \left[ r_{t+1} + \gamma Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t) \right]$$
where
  • $Q(s_t, a_t)$ is the Q-value of the s
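Reward collection and the Sarsa update from excerpt 305 can be sketched numerically. This is a hedged illustration rather than Soar's code: the function names, the dictionary representation of reward WMEs, and every numeric value below are hypothetical, and `alpha` and `gamma` stand for the learning rate and discount rate written as α and γ in the formula.

```python
import math


def collect_reward(reward_wmes):
    """Sum the value of every reward WME on the reward-link; other fields are ignored."""
    return sum(wme["value"] for wme in reward_wmes)


def sarsa_delta(alpha, gamma, reward, q_current, q_next):
    """delta_t = alpha * (r_{t+1} + gamma * Q(s_{t+1}, a_{t+1}) - Q(s_t, a_t))."""
    return alpha * (reward + gamma * q_next - q_current)


# Hypothetical reward-link contents; "source" and "duration" play no role in RL.
reward_link = [
    {"value": 1.0, "source": "environment"},
    {"value": 0.5, "source": "intrinsic", "duration": 5},
]

total = collect_reward(reward_link)  # 1.0 + 0.5
delta = sarsa_delta(alpha=0.5, gamma=0.9, reward=total, q_current=2.0, q_next=3.0)

# No bookkeeping is done: leaving the WMEs in place collects them again next cycle.
two_cycles = collect_reward(reward_link) + collect_reward(reward_link)
```

Because nothing marks a reward WME as already seen, an agent that wants a reward counted once would need to remove the WME after the decision cycle in which it was collected.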
306. xcept (by default, chunks can be formed in all states), the dont-learn RHS action will cause learning to be turned off for the specified state.
  sp {turn-learning-off
     (state <s> ^feature 1 ^feature 2 -^feature 3)
  -->
     (dont-learn <s>) }
The dont-learn RHS action applies when learn is set to --except, and has no effect when other settings for learn are used.
force-learn: When learning is set to --only (by default, chunks are not formed in any state), the force-learn RHS action will cause learning to be turned on for the specified state.
  sp {turn-learning-on
     (state <s> ^feature 1 ^feature 2 -^feature 3)
  -->
     (force-learn <s>) }
The force-learn RHS action applies when learn is set to --only, and has no effect when other settings for learn are used.
3.4 Impasses in Working Memory and in Productions
When the preferences in preference memory cannot be resolved unambiguously, Soar reaches an impasse, as described in Section 2.6:
  • When Soar is unable to select a new operator (in the decision cycle), it is said to reach an operator impasse.
All impasses appear as states in working memory, where they can be tested by productions. This section describes the structure of state objects in working memory.
3.4.1 Impasses in working memory
There are four types of impasses. Below is a short description of the four types of impasses. This was described in more det
307. xe project tests loading the TestExternalLibraryLib library.
Examples
To load TestExternalLibraryLib:
  load-library TestExternalLibraryLib
To load a library that takes arguments (say, a logger):
  load-library my-logger -filename mylog.log

port
Returns the port the kernel instance is listening on.
Synopsis
  port

rand
Generate a random number.
Synopsis
  rand               returns a real number in [0, 1] (calls SoarRand())
  rand n             returns a real number in [0, n] (calls SoarRand(max))
  rand --integer     returns an integer in [0, 2^32 - 1] (calls SoarRandInt())
  rand --integer n   returns an integer in [0, n], n < 2^32 (calls SoarRandInt(max))
Options
  -i, --integer   Return an integer; the optional argument is the upper bound (inclusive)
Description
Generates a random non-negative number, returning the result in a string.
Examples
  rand --integer 10   returns an integer in [0, 10] (for example, 4)

srand
Seed the random number generator.
Synopsis
  srand [seed]
Options
  seed   Random number generator seed
Description
Seeds the random number generator with the passed seed. Calling srand without providing a seed will seed the generator based on the contents of /dev/urandom (if available), or else based on time() and clock() values.
Examples
  srand 0

soarnews
Prints information about the current release.
Synopsis
  soarne