AgentTeamwork User Manual - UW Departments Web Server
Contents
1. [Figure 0: FTP Server selection window]

4) Page 4: Describe your application.
[Figure 0: User program selection window, with fields for the user program name, user program arguments, root path of jar file, input/output directory, GUI port, and Ateam port number]
Specify your application program, the arguments passed to it, the absolute path to its jar file, a directory for input/output files, the GUI port, and the Ateam port. Note that arguments must be delimited with _ rather than a space. For Sample.java, fill out only the user program name: Sample.

5) Page 5: Specify whether a commander agent should be submitted remotely or locally. If you click the remote submission box, specify where the commander agent should be submitted. You must also tell the GUI your computer's IP address in canonical form rather than localhost.

6) Page 6: Describe your bookkeeper agents. If you use an XML resource database, you may specify the number of bookkeeper agents, each of which will be dispatched t
2. Check the following cases: you did not use a resource database and did not specify extra nodes for resumption purposes; all extra nodes you specified were used up; the resource database could not find available computing nodes; and the database itself crashed.
[Figure 14: Job monitoring display of a crashed agent (Agent ID 36, Rank 4, computing agent on public network, Current Status: crashed)]
[Figure 15: Resumed agent information (Agent ID 36, Rank 4, computing agent on public network, Current Status: resuming on uw1-320-31)]

4.3 Job Termination
You may terminate the current job at any time after submitting it. Click the Abort User Job button located at the bottom left corner of the SubmitGUI window (see Figure 16).
3. Options required recommended for a See table 3 commander agent Table 2 Options for UWinject Options Remarks Values required p The IP port through which you would like p 12345 to inject a commander agent should be between 5001 and 65535 m The maximum number of children each m 4 agentoan spawn Manuatory The value must be always 4 u The directory where agents code is u located Mandatory home uwagent agentteamwork AgentTe amwork Agents This option is always fixed Jar files an agent should carry with it Must be delimited with a comma Mandatory jars GUIUtil jar jars Agents jar jars Ateam jar jars MPJ jar jars commons net 1 4 1 jar jars jakarta oro 2 0 8 jar applications applications jar This option is always fixed 23 Table 3 Options for Commander Agent Options Remarks Example values AP An IP port number used for the underling AP_11112 GridTcp communication Mandatory Any number between 5001 and 65535 GP An IP port number with which the GP_11111 commander agent contacts SubmitGUI Any number between 5001 and 65535 or FileThread Mandatory but different from AP Gl An IP address with which the Gl_medusa commander agent contacts SubmitGUI or FileThread If omitted GI is set to A Ename OF agclee gt localhost S A list of ip names to dispatch a sentinel S_priam_uw1 320 00_uw1 320 01_uw1 agent If it is not given a r
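The commander-agent options in Table 3 share one shape: an option name followed by its values, all joined with underscores (for example AP_11112 or B_tarvos_phoebe), with AP and GP both restricted to ports between 5001 and 65535 and required to differ. The following is a minimal sketch of that convention only. The class and method names (CommanderOptionSketch, option, validPorts) are hypothetical helpers written for this illustration and are not part of AgentTeamwork.

```java
/** Illustrative helper (not part of AgentTeamwork): builds underscore-delimited options. */
final class CommanderOptionSketch {
    // Port range that Table 3 gives for the AP and GP options.
    static final int MIN_PORT = 5001;
    static final int MAX_PORT = 65535;

    /** Joins an option name and its values with '_', e.g. option("AP", "11112") gives "AP_11112". */
    static String option(String name, String... values) {
        StringBuilder sb = new StringBuilder(name);
        for (String value : values) {
            // A space would split the option into two command-line arguments.
            if (value.indexOf(' ') >= 0) {
                throw new IllegalArgumentException("values must not contain spaces: " + value);
            }
            sb.append('_').append(value);
        }
        return sb.toString();
    }

    /** Checks the Table 3 constraints: both ports in range, and GP different from AP. */
    static boolean validPorts(int ap, int gp) {
        boolean apOk = ap >= MIN_PORT && ap <= MAX_PORT;
        boolean gpOk = gp >= MIN_PORT && gp <= MAX_PORT;
        return apOk && gpOk && ap != gp;
    }

    public static void main(String[] args) {
        int ap = 11112;
        int gp = 11111;
        if (!validPorts(ap, gp)) {
            throw new IllegalStateException("AP and GP must differ and lie in 5001..65535");
        }
        System.out.println(option("AP", String.valueOf(ap)));
        System.out.println(option("GP", String.valueOf(gp)));
        System.out.println(option("B", "tarvos", "phoebe"));
    }
}
```

Checking the ports before assembling the command line catches the most common mistake (reusing the AP port for GP) before any agent is injected.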
4. display v Monode2 E uw1 320 20 El uw1 320 21 El uw1 320 31 desktop computing nodes and extra desktop computing node on public network agent selection window Standard Input Field field Remote Node Information Current Status not dispatched yet Submit User Job PZL o ha pbesneLE PALyk Select File Help i Commander Agent o W priam rami W medusa E mnode A mnodel E mnode2 E uw1 320 20 E uw1 320 21 E uw1 320 31 PZL v homens Li remote node information Figure 10 Job monitoring display Output Display userprog lt user program starts gt rank4 ack userprog instanciate GFIS rank4 userprog finish instanciate of GFIS rank4 userprog read from gridfile rank4 userprog nRead 47191 rank4 userprog nRead 1 rank4 userprog finish reading reading bytes 47191 rank4 userprog now write the read data rank4 userprog finish writing rank4 userprog output message O rank4 userprog output message 1 rank4 userprog output message 2 rank4 userprog output message 3 rank4 userprog output message 4 rank4 userprog output message 5 rank4 userprog output message 6 rank4 userprog output message 7 rank4 userprog output message 8 rank4 Ji Standard Input Field Remote Node Information Agent ID Be Rank l4 Type of Agen
5. sshTail.sh / sshTail.bat
    sshUWPlace.sh <port> <-f filename | ipName ... ipName>
If a UWPlace daemon is launched remotely from sshUWPlace.sh, all its log messages are saved in the /tmp/yourAccount_uwplace.log file. This script/batch file displays the last 10 lines of that log file. If a UWPlace daemon stays idle, the log shows agentList size 0. If the list size is larger than 0, the daemon holds some agents, regardless of whether they are active or inactive.

sshTrunc.sh / sshTrunc.bat
    sshUWPlace.sh <port> <-f filename | ipName ... ipName>
As explained for sshTail.sh / sshTail.bat, a remote UWPlace daemon keeps saving its log messages in the /tmp/yourAccount_uwplace.log file. To keep the file from growing too large, run the sshTrunc.sh / sshTrunc.bat script from time to time to truncate the log file.

kill.sh
    sshUWPlace.sh <port> <-f filename | ipName ... ipName>
This shell script shuts down the local UWPlace daemon running at a given IP port. There is no corresponding DOS batch file.

sshKill.sh / sshKill.bat
    sshUWPlace.sh <port> <-f filename | ipName ... ipName>
This script/batch file shuts down all the remote UWPlace daemons that are running at a given IP port and are either listed in filename or given as IP names.

As described in Section 2, you can use the following two shell scripts or DOS batch files to restart a UWPlace daemon.

runUWPlace.sh / runUWPlace.bat
    runUWPlace
6. Public. Upload all public node XML resource definitions to the Public directory and all cluster definitions to the Cluster directory.

5.2 Database Set up
Before starting up the database, ensure that port 8000 is free. AgentTeamwork is hard-coded to use this port for the database, so the port must be free before you can use AgentTeamwork properly. To start up the database, run startDB.sh or startDB.bat, reproduced below.

startDB.sh or startDB.bat:
    #!/bin/sh
    java -cp jars/Agents.jar AgentTeamwork.Agents.XDBase 8000

Shut down the database by running shutdownDB.sh or shutdownDB.bat, reproduced below. Before a shutdown, the database saves any resource changes in the Ateam/xmls directory, which will be reused upon the next invocation. You can also kill the database process with CTRL-C; however, no changes will then be recorded in the xmls directory.

shutdownDB.sh or shutdownDB.bat:
    #!/bin/sh
    echo shutdownDB shutdown the database
    java -cp jars/Agents.jar AgentTeamwork.Agents.ShutdownDB

Ensure that these scripts run where the resource agent is dispatched, which is specified with the commander agent's R option.

6 Command Line Operations
6.1 Job Injection from Command line
Log in to a computer on which AgentTeamwork is installed. From the command line, you must run FileThread.sh, which takes care of file transfer between the front end and a user application.

runFile
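Because the database is tied to port 8000, it can help to confirm that the port is free before running startDB.sh. The sketch below does this by briefly binding a server socket to the port. It is an illustration only: the class name DatabasePortCheckSketch and the method isPortFree are hypothetical, and the check is not part of the AgentTeamwork scripts.

```java
import java.io.IOException;
import java.net.ServerSocket;

/** Illustrative pre-flight check (not part of AgentTeamwork): is the database port free? */
final class DatabasePortCheckSketch {
    // The port that the manual says the XML resource database is hard-coded to use.
    static final int DB_PORT = 8000;

    /** Returns true if a server socket can be bound to the port right now. */
    static boolean isPortFree(int port) {
        try (ServerSocket probe = new ServerSocket(port)) {
            // Binding succeeded; the socket is released when this block exits.
            return true;
        } catch (IOException e) {
            // Typically a BindException: another process already listens on the port.
            return false;
        }
    }

    public static void main(String[] args) {
        if (isPortFree(DB_PORT)) {
            System.out.println("Port " + DB_PORT + " is free; the database can be started.");
        } else {
            System.out.println("Port " + DB_PORT + " is in use; stop the other process first.");
        }
    }
}
```

The result is only a snapshot: another process could still claim the port between this check and the database start-up.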
7. -cp $ATeam/jars/UWAgent.jar UWAgent.UWPlace -p port -m 4 localhost HopSkipJump ipAddr0 ipAddr1 ipAddr2 ipAddr3 ... ipAddrN
This agent visits each IP address in the order specified and prints a greeting message at each place. Make sure that your local computer is running UWPlace and that SampleAgent's last argument, i.e., ipAddrN, is your local IP address. Do not use localhost for ipAddrN.

3 Job Injection
3.1 Compilation of Applications
You need to compile your Java application before submitting it to AgentTeamwork. Unless you have recompiled the AgentTeamwork system with Java 1.6, make sure that you compile your programs with Java 1.5. Compile your Java programs against two of AgentTeamwork's jar files, MPJ.jar and Ateam.jar.

Compile your program on bash:
    javac -cp $ATeam/jars/MPJ.jar:$Ateam/jars/Ateam.jar *.java
Compile your program on DOS prompt:
    javac -cp %ATeam%\jars\MPJ.jar;%Ateam%\jars\Ateam.jar *.java

Thereafter, create jar files for the user applications in the working directory by running the following command.

Create jar files (on both bash and DOS prompt):
    jar cvf <name>.jar *.class

When program files are modified, or to add new class files to an existing jar file, run the following command to apply the changes to the existing jar.

Update jar files:
    jar uvf <name>.jar *.class

A sample java application was made available at ATeam applica
8. sh lt port gt This script batch file starts a UWPlace daemon locally at a given IP port sshUWPlace sh sshUWPlace bat sshUWPlace sh lt port gt lt f filename ipName ipName gt This script batch file starts a UWPlace daemon at a given IP port on all computing nodes that have been listed in filename or in form of IP names 19 5 XML Resource Database 5 1 XML resource definition To use AgentTeamwork s resource database an XML Resource definition should be written for every computing node and uploaded to an FTP server accessible by all AgentTeamwork daemons There are two types of XML resource definitions one for public nodes and one for clusters An example of each appears below The name of the resource definition files must be lt cur_ip_name gt xml for public nodes and the name of definition files for clusters must be lt cl cluster_name gt xml Public node XML resource definition uw1 320 10 xml lt xml version 1 0 gt lt resource gt KClSsiGia_ ie linie gt lt domain gt UWB lt domain gt lt ip name gt uwl 320 10 lt ip name gt lt io ackle gt 6S 91 198 16l lt ijs exckelie gt lt human_owner gt uwb lt human_owner gt lt cpu_speed gt 2100 lt cpu_speed gt lt cpu_arch gt intel lt cpu_arch gt lt cpu_count gt 2 lt cpu_count gt lt memory gt 1024 lt memory gt lt os_type gt linux lt os_ type gt lt disk_space gt 40 lt disk_space gt lt cpu_load gt 100 lt cpu_load gt lt availabil
9. ssh -l <accountName> -L <localOutPort>:localhost:<targetInPort> -R <targetOutPort>:localhost:<localInPort> <targetHostIP>

Example SSH Tunnel Command:
    ssh -l mickey -L 1000:localhost:12345 -R 2500:localhost:12345 uw1-320-00.uwb.edu
This allows the local computer to establish a socket to mickey's account at uw1-320-00.uwb.edu, so that a local process can write data to port 1000 and read data from port 12345, whereas a remote process at uw1-320-20 can read data from port 12345 and write data to port 2500.

Once an SSH tunnel is established, run the UWPlace command via the SSH tunnel on the remote computer. Repeat this process as necessary for multiple remote systems.

UWPlace Command for bash:
    java -cp $ATeam/jars/UWAgent.jar UWAgent.UWPlace -p <localInPort> <localOutPort> <targetHostIP>
UWPlace Command for DOS prompt:
    java -cp %ATeam%\jars\UWAgent.jar UWAgent.UWPlace -p <localInPort> <localOutPort> <targetHostIP>
Example:
    java -cp $ATeam/jars/UWAgent.jar UWAgent.UWPlace -p 12345 1000 uw1-320-00.uwb.edu

To shut down daemons, visit each terminal window where UWPlace is running and type control-c to terminate the current runUWPlace.sh or runUWPlace.bat.

2.3 UWPlace Execution Test
To check whether a UWPlace daemon is running at remote nodes as well as at your local site, inject a sample agent as follows.

HopSkipJump.java:
    cd $ATeam/SampleAgents
    java
10. 0-01 uw1-320-02 uw1-320-03, each on a separate line.

Use runUWPlace.sh for computing nodes that do not run bash or do not see ATeam as the same path (e.g., Windows clients, or Linux/Mac machines not connected to the same NFS). Log in to each of those computers and type as follows.

runUWPlace.sh for bash:
    runUWPlace.sh <port>
runUWPlace.bat for DOS prompt:
    runUWPlace.bat <port>
Example:
    runUWPlace.sh 12345

UWPlace daemons can be shut down with one of the following two actions.

If you have invoked the daemons all at once with sshUWPlace.sh, use sshKill.sh.

sshKill.sh:
    sshKill.sh <-f filename | ipName ... ipName>
Here, filename is a file that lists all remote IP names. Alternatively, you can enumerate them as arguments.
Example:
    sshKill.sh 12345 zmac100 zmac101 zmac102 zmac103
    sshKill.sh -f nodes
where the nodes file includes zmac100, zmac101, zmac102, and zmac103, each on a separate line.

If you have started up each daemon with runUWPlace.sh, visit each terminal window where UWPlace is running and type control-c to terminate the current runUWPlace.sh or runUWPlace.bat.

2.2 Inter-Domain UWPlace Invocation and Shutdown
This invocation assumes that only the ssh port (22) is available to establish a socket across IP domains. In that case, your local and remote computers may need to use an SSH tunnel to establish inter-UWPlace communication. Establish an SSH tunnel from your current system to the target remote computer using the following command.

SSH Tunnel Command:
11. 4 argument the resource probing frequency in minutes default 5 5 argument the resource domain Note To keep a resource agent from spawning sensor agents the 4 argument must be 1 RB If the B option is not specified the RB_2 number of bookkeeper agents to spawn must ba given with RE 2 agents dispatched to a different node RQ A list of query option parameter pairs RQ_cpuarch_linux_total_2 ip directly specify where to run an application This corresponds to the S option cpuspeed the CPU speed in MHz cpuarch the CPU architecture such as intel 68K PowerPC and SPARC cpucount cpus in each computing node memory per node memory size in Mbytes disk per node disk size in Gbytes Two computing nodes under the Linux control are required for an application 25 total computing nodes required for a given user application mandatory if RQ is specified time when to run an applicatioin 855pm34se 205534 or now 0 os operating system type such as linux windows and solaris cpuload percentage of cpu idle state in general 100 bandwidth the network bandwidth required in Mbps choice choice of public cluster or a specific Icuster use RN A multiplier to determine how many RN_2 0 Hac eel ais Se evi The twice large number of computing lt gent X cae nodes will be allocated to a user 1 0 1 5 2 0 or 3 0 Otherwise 1 0 will be ae program used as
12. [Figure 12: Remote node information for a computing agent on the public network (Current Status: not dispatched yet)]
[Figure 13: Remote node information for an extra computing agent on the public network (Current Status: not dispatched yet)]

If an agent crashes, its job monitoring display is shown as in Figure 14, and the crashed agent hops to an extra computing node. The user can discover where the agent moved by checking the job monitoring display of the extra computing nodes. The status of a resumed agent is shown in Figure 15.

4.2 Automatic Job Abortion
A job may be aborted automatically when AgentTeamwork can no longer continue to run it, for one of the following reasons.
a) You have specified wrong computing nodes for a job injection. In most cases, the nodes specified for your job have not yet started UWPlace. Make sure that UWPlace is running at each of the nodes specified.
b) Your application caused an exception. Debug your program.
c) When detecting a node crash, AgentTeamwork was not able to find the latest snapshot corresponding to this node. Either your application did not include checkpoints at all, or a node crashed before your application took its very first snapshot.
d) When detecting a node crash, AgentTeamwork was not able to find another node to resume the application
13. AgentTeamwork User Manual
Munehiro Fukuda, Miriam Wallace, Jumpei Miyauchi
Computing and Software Systems, University of Washington Bothell
Department of Computer Science, Ehime University

Table of Contents
1 System Installation
  1.1 Availability
  1.2 Downloading and Extraction
2 System Invocation
  2.1 Intra-Domain UWPlace Invocation and Shutdown
  2.2 Inter-Domain UWPlace Invocation and Shutdown
3 Job Injection
  3.1 Compilation of Applications
  3.2 Job Injection from SubmitGUI
  3.3 File Transfer with SubmitGUI
4 Job Monitoring and Termination
  4.1 Job monitoring and Standard In/Output
    1) Agent selection window
14. 8 Final Comments
Appendix: System Directory

1 System Installation
1.1 Availability
The system currently supports bash on Linux, bash on Mac OS X, and the DOS command prompt on Windows XP. All the class files have been generated with Java version 1.5. They can also be compiled and executed with Java version 1.6.

1.2 Downloading and Extraction
AgentTeamwork does not require root access at all. You can simply install it in your home directory with an ordinary user account. To download the latest version of the AgentTeamwork system, go through the following steps.
1) Create a tripod account. If you belong to the Distributed Systems Laboratory at UW Bothell or the Multimedia Database Laboratory at Keio SFC, use the agentteamwork or the mdbl sfc account and skip to step 3.
2) Email us at dslab@u.washington.edu. Your email should include your name, affiliation, and tripod account. Wait for our response stating that you have been authorized to download the system.
3) Visit http://agentteamwork.tripod.com/release/ateam.tar.gz. You will be asked to type your tripod account and password in order to download this zipped file.

To install the system, follow the instructions given below.

Linux/Mac OS X users (Mac users need to start bash):
1) Move ateam.tar.gz to your home directory if it is not stored there.
    mv ateam.tar.gz ~
2) Unzip the file.
    gzi
15. [Figure 0: Root Sentinel agent and desktop computing node selection window]

3.3 File Transfer with SubmitGUI
SubmitGUI has a file transfer function by which input files are transferred automatically to the Commander agent after job submission. Similarly, output files created by each computing agent are received by the Commander agent and then written to the user-specified input/output directory.

4 Job Monitoring and Termination
4.1 Job monitoring and Standard In/Output
After selecting resources, the user inputs the port number for UWPlace as follows.
[Figure 1: Port input window, prompting "Specify port for UWPlace" with the value 12345]
Then a job monitoring display is shown. As shown in Figure 10, SubmitGUI's job monitoring display consists of 1) the agent selection window, 2) the standard output display, 3) the standard input field, and 4) the remote node information display.

1) Agent selection window
The agent selection window allows the user to see the standard output from an agent, send standard input to an agent, and monitor the status of all agents. All gateway agents, public computing nodes, and extra public computing nodes can be found under the root Sentinel agent.
[Figure residue: agent selection window showing the Commander agent, root sentinel agent, gateway agent, and computing nodes on the cluster network, next to the standard output display]
16. Settings mickey of on Linux file home mickey ise file Volumes Users Users mickey agentteamwork GUI permission java io FilePermission lt lt ALL FILES gt gt read write delete EXECUTE 5 permission java util PropertyPermission user home read permission java util PropertyPermission user dir read permission java util PropertyPermission file encoding read permission java lang RuntimePermission modifyThread permission java lang RuntimePermission modifyThreadGroup permission java net SocketPermission accept grant codeBase file Volumes Users Users mickey agentteamwork jars permission java io FilePermission lt lt ALL FILES gt gt read write delete EXECUTE p permission java util PropertyPermission user home read permission java util PropertyPermission user dir read permission java util PropertyPermission file encoding read permission java lang RuntimePermission modifyThread permission java lang RuntimePermission modifyThreadGroup grant codeBase file Volumes Users Users mickey agentteamwork jars permission java net SocketPermission 127 0 0 1 connect grant codeBase file Volumes Users Users mickey files permission java io FilePermission lt lt ALL FILES gt gt read write delete ELECUTLE grant codeBase file Volumes Users Users mickey permissi
17. Thread.sh or runFileThread.bat:
    runFileThread.sh port directory
Here, port must be the same as the GP option given to a commander agent (see Table 3), and directory is where your input/output files are stored. If you intend to inject a job from the same terminal window where runFileThread.sh was invoked, you must run runFileThread.sh with the & delimiter.

Thereafter, you can inject a commander agent to dispatch your application using UWInject. The command format is:

Inject a job:
    java -Xmx512M -cp linkToUWAgent.jar:linkToAgents.jar:linkToGUIUtil.jar UWAgent.UWInject optionsforUWInject destination AgentTeamwork.Agents.CommanderAgent optionsforCommander

Table 1: Parameters for Command Line Input
linkToUWAgent.jar
    Remarks: a symbolic link or a path to UWAgent.jar.
    Example value: jars/UWAgent.jar, assuming that the current working directory is agentteamwork.
linkToAgents.jar
    Remarks: a symbolic link or a path to Agents.jar.
    Example value: jars/Agents.jar, assuming that the current working directory is agentteamwork.
linkToGUIUtil.jar
    Remarks: a symbolic link or a path to GUIUtil.jar.
    Example value: jars/GUIUtil.jar, assuming that the current working directory is agentteamwork.
optionsforUWInject
    Remarks: options required/recommended for UWInject.
    Example value: see Table 2.
destination
    Remarks: the IP where a commander agent is injected.
    Example value: localhost (injected locally).
optionsforCommander
18.     2) Standard output display
    3) Standard input field
    4) Remote node information display
  4.2 Automatic Job Abortion
  4.3 Job Termination
  4.4 Monitoring and Terminating UWPlace
5 XML Resource Database
  5.1 XML resource definition
  5.2 Database Set up
6 Command Line Operations
  6.1 Job Injection from Command line
  6.2 Job Termination from Command line
  6.3 File Transfer with Command Line
7 Trouble Shooting
  Errors with Serializable
  MPJ Error
19. an XML database Stare DIS SN Ginn Hone oO a script to start up an XML database WUNMIOMInEOe SIN casacaas a serip Co monircor kilil a local agent XDBaS EGUES A E a script to monitor a local XML database pawi eO SADa tehi aies tinct asec olla eOr reS pond OMS riots MAGEN EVE A TE E ota UWAgent mobile agent execution engine Senio ke ANA E A E O sample mobile agent programs 29 30
20. ce as far as all the computers you will use are connected to the same NFS. Otherwise, repeat the above installation at each computer that is not NFS-connected.

2 System Invocation
AgentTeamwork runs on top of a network of UWAgent mobile-agent execution daemons. You need to start this daemon, named UWPlace, at every computing node that you will use for your job execution. Once a computing node has the UWPlace daemon running, it becomes an available location to which agents can migrate and where they can perform tasks.

2.1 Intra-Domain UWPlace Invocation and Shutdown
This invocation assumes that IP ports above 5000 are open to all computing nodes within the same IP domain, such as uw1-320-00 through uw1-320-31.uwb.edu, mnode0 through mnode31.uwb.edu, or zmac000 through zmac159.sfc.keio.ac.jp.

Two shell scripts are available to launch UWPlace: sshUWPlace.sh and runUWPlace.sh.

Use sshUWPlace.sh for all computing nodes that run bash and resolve the ATeam environment variable to the same path (e.g., Linux/Mac machines connected to the same NFS). Log in to one of those computers and type as follows.

sshUWPlace.sh:
    sshUWPlace.sh <port> <-f filename | ipName ... ipName>
Here, filename is a file that lists all remote IP names. Alternatively, you can enumerate them as arguments.
Example:
    sshUWPlace.sh 12345 uw1-320-00 uw1-320-01 uw1-320-02 uw1-320-03
    sshUWPlace.sh 12345 zmac100 zmac101 zmac102 zmac103
    sshUWPlace.sh -f nodes
where the nodes file includes uw1-320-00 uw1-32
21. d show as its arguments. The commander agent spawns a resource agent at dione, requesting two computing nodes for the application. It also spawns a bookkeeper agent at tarvos. The resource agent accesses ftp.tripod.com through the agentteamwork account and its password. No sensors will be generated.

6.2 Job Termination from Command line
After dispatching a job with a commander agent, runFileThread.sh (runFileThread.bat) shows the status of the job execution. If you want to terminate the job, simply type abort. Upon receiving a completion signal from the commander agent, runFileThread.sh (runFileThread.bat) returns control to a new command line.

6.3 File Transfer with Command Line
If the user submits a job from the command line, input files are not transferred automatically. In this case, the user can transfer them with FileThread, which has functions similar to those of SubmitGUI. FileThread allows the user to transfer input files, receive standard output, and receive output files. FileThread options are specified when it is run: -p port, -d directory name, -i, and -w.
-p is the port number used to communicate with the Commander agent.
-d is the name of the input/output directory used to transfer and receive files.
-i is specified if the input files are on the user side. Files are transferred only if this option is specified. If this option is not specified, directory information is sent to the Commander agent, and the Commander agent itself r
22. eads the files.
-w is specified if input files are stored in the remote computing node's /tmp directory.
These options can be used only in the following combinations.

Accepted FileThread option combinations
Specified option: -p port -d directory name -w -i
    Behavior: transfers input files to the commander agent.
Specified option: -p port -d directory name -w
    Behavior: input files are not transferred. The files are read by the commander agent and then passed on by the commander agent.
Specified option: -p port
    Behavior: only receives standard output.

7 Trouble Shooting
Errors with Serializable
When using the snapshot feature of AgentTeamwork to save the execution state of a user program, ensure that all objects in the program to be saved are serializable. If they are not, the snapshot cannot be transmitted to a Bookkeeper and the checkpoint will fail.

Error: java.lang.OutOfMemoryError
This error can occur if the user is running both the Commander and the Bookkeeper agents on the same computer. To avoid or fix this error, give the Commander and Bookkeeper agents different nodes to execute on.

Constructor Error
User applications must have a constructor that accepts an ATeam object. This constructor must exist in addition to any other user-defined constructors.
Sample constructor accepting an ATeam object argument:
    public UserProgram(Ateam o)

MPJ Error
If ATeam complains about MPI initialization, the user may not have passed t
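The serializability requirement described under "Errors with Serializable" can be checked locally before a job is submitted. The sketch below writes an object to an in-memory stream with standard Java serialization and reports whether that succeeds. It assumes that AgentTeamwork snapshots rely on standard Java serialization, which the manual implies but does not spell out. The names SnapshotCheckSketch, GoodState, and BadState are hypothetical and exist only for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Illustrative check (not part of AgentTeamwork): can this object graph be serialized? */
final class SnapshotCheckSketch {

    /** State made only of serializable fields: primitives and arrays of primitives. */
    static class GoodState implements Serializable {
        private static final long serialVersionUID = 1L;
        int iteration = 3;
        double[] partialResult = {1.0, 2.0};
    }

    /** State that declares Serializable but holds a field that is not serializable. */
    static class BadState implements Serializable {
        private static final long serialVersionUID = 1L;
        Thread worker = new Thread(); // java.lang.Thread does not implement Serializable
    }

    /** Returns true if the whole object graph can be written with Java serialization. */
    static boolean isSerializable(Object state) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(state);
            return true;
        } catch (IOException e) {
            // NotSerializableException (an IOException) names the offending class.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("GoodState serializable: " + isSerializable(new GoodState()));
        System.out.println("BadState serializable: " + isSerializable(new BadState()));
    }
}
```

A field such as the Thread above can usually be marked transient and rebuilt after resumption, so that the rest of the state remains serializable.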
23. ecified in the CL option this ECL means a list of additional computing nodes in the same cluster for recovery purposes 4 _mnode5 When the medusa wub edu cluster detects any node crash it will resume computation on mnode4 and thereafter mnode5 24 A list of IP names to dispatch a bookkeeper agent If it is not given a resource agent is responsible to provide the commander with such a list B_tarvos_phoebe Two bookkeeper agents are dispatched to tarvos and phoebe respectively U A user program name and its arguments U_Mandelbrot_ This is a mandatory option 2 0_1 0 0 0 200000 _GRAD BLUE Mandelbrot is a Java user program and the rest are its arguments R An IP name to dispatch a resource agent R_dione to If it is not given a resource agent is A resource agent is dispatched to launched at the same computing node as dione the commander is working No more than one resource agent should be invoked RA A list of arguments passed to a resource RA_ftp tripod com_agentteamwork_ agent It is mandatory if a resource agent _ 1_UWB o due to missing giihe S aid A resource agent will contact opona ftp tripod com with the account 1 argument a shared ftp name agentteamwork and the password nd f and choose resources in the 2 AUMEN Me Epsaccount UWB domain It won t spawn sensor 3 argument the ftp password agents due to 1 as the 4 argument
24. ecution are located at the agentteamwrok scripts directory Directory Structure AGEN REC AMWOLKY s Ml ra sere oneg are sche eNe AA Sae the root of the AgentTeamwork system AgentTeamwork PROVEN SAL A E E A R T ecg RN includes all agents PSS AM Ye E A A E AS includes AgentTeamwork APIs Giaa oitako E A fault tolerant file 1 0 GAEITN A Se peretec tors E interface to Ct applications GirewciNloy E interface to Ruby applications Grralep we Suea Ea a a a esate ERTA rault trolerant WOP applications appiieatironss seis Ao Ouran ne eee coe all applications so far croiomhe slay atone eee a script to compile applications TADIGVNES CIMA og a Sow es a script to inject an application DESE DUES dE REP a distributed grep Java program Mandelbrory a a E a est a Mandelbrot Java program Meiers Mn NE a matrix multiplication Java program Wave2 DAA E a Schroedinger s wave Java simulator Same yet ends a e E A rer a sample Java program benchmark PUDA TSamnAS AEREE a script to inject a benchmark program runAteamClusters sh a script to use multi clusters TEOUMURIGKES SAY eS oS Gat a script to use a resource XML database IO CY ee E E step i he heey een ates etal Le A OR A agen JavaDoc files CURA eer yeas tty By Onna o Dee ree nee AgentTeamwork s GUI including SubmitGUI Ja rS EAE I have aa A Sy E T eee eee a all AgentTeamwork s jar files MEN a E a E a E a a tice E aL women ate aie MPI Java implementation scripts runUWP
25. esource agent 320 02_uw1 320 03 is fespolisibie to provide the commander The root sentinel reins all others at A oe si a marae 2 priam The other nodes including uw1 E A 320 00 uw1 320 01 uw1 320 02 and that does not participate in actual uw1 320 03 will participate in computaHon computation CL A remote cluster to use as well The CL_medusa uwb edu_medusa_mnode0_ cluster gateway name comes first mnode1_mndoe2_mnode3 followed py Mie clgstel gateway node medusa uwb edu is a cluster gateway PAE e oe a whose cluster internal alias is medusa It N giv reins four cluster internal nodes including the S option a resource agent is mnod d modet mn de2 and responsible to provide the commander mnode3 i with such a list NOTE To specify extra compute nodes within this cluster supply an ECL parameter see below using the same cluster name as this CL parameter E A list of extra IP names to resume an E_uwt1 320 04 uw1 320 05 agent when one is crashed If a resource If a sentinel agent is crashed it is agent is invoked due to missing of the S cummed at ee 1320 04 If one more andlor e option s es provide me crash occurs the next choice will be commander with the E option uw1 320 05 ECL An extra remote cluster to use The ECL_medusa uwb edu_medusa_mnode cluster gateway name comes first followed by the cluster gateway node alias followed by a list of machine nodes within that cluster If the same cluster gateway is sp
26. the correct arguments to ATeam's MPI implementation of the Init function. Be sure to invoke MPJ.Init(args, ateam). Any other invocation, e.g., MPJ.init(args), will result in execution errors.

Bookkeeper Bottleneck / Crashes
To avoid performance problems caused by a bottleneck at the Bookkeeper, which accepts and retrieves snapshots, ensure that multiple Bookkeeper agents are running. If one crashes, the others can handle the load, and if many snapshots arrive from other agents, multiple Bookkeepers allow these messages to be processed more quickly.

Hanging while mainThread waiting for all_locations
This message means that not all of the agents specified by the user have been started. Review the application code and ensure that all agents specified by the program are instantiated.

Error: java.net.ConnectException: Connection refused
Check that all locations specified to run the application are also running UWPlace. This error occurs most often when UWPlace is not running on a target machine.

8. Final Comments
AgentTeamwork, ATeam, and associated classes are copyright University of Washington Bothell. No advance notice of changes and revisions to AgentTeamwork, ATeam, or related classes is required. Users may use the classes and associated methods at their own risk.

Appendix: System Directory
The following table summarizes AgentTeamwork's directory structure. All scripts necessary for job set up and ex
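The "Connection refused" check described above can be automated before a job is submitted. The following is a minimal sketch, assuming each UWPlace daemon accepts TCP connections on a port that you already know; PlaceProbe, its method names, and the timeout are illustrative and are not part of AgentTeamwork.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical pre-flight check: is anything listening on host:port? */
final class PlaceProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Includes java.net.ConnectException: Connection refused
            return false;
        }
    }

    /** Self-contained demo against a throwaway local server socket. */
    static boolean demo() {
        try (ServerSocket server = new ServerSocket(0)) {
            return isListening("127.0.0.1", server.getLocalPort(), 500);
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("local demo reachable: " + demo());
    }
}
```

In practice you would call isListening once per target machine with the port your daemons were started on, and launch the daemon on any machine that reports false.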
27. <availability multiple="true">
      <time>0000-1159</time>
      <time>1200-2359</time>
    </availability>
    <time_zone>pacific</time_zone>
    <inter_net_device>ethernet</inter_net_device>
    <intra_net_device>ethernet</intra_net_device>
    <libraries multiple="true">
      <name>cexec</name>
      <name>mpirun</name>
    </libraries>
    <inter_net_band>100</inter_net_band>
    <intra_net_band>100</intra_net_band>
  </design_time>
</resource>

Cluster XML resource definition: cl-medusa-8-15.xml

<?xml version="1.0"?>
<cluster>
  <design_time>
    <domain>UWB</domain>
    <name>cl-medusa-8-15</name>
    <gateway>medusa.uwb.edu</gateway>
    <alias>medusa</alias>
    <group>
      <ip_list>
        <ip_name>mnode8</ip_name>
        <ip_name>mnode9</ip_name>
        <ip_name>mnode10</ip_name>
        <ip_name>mnode11</ip_name>
        <ip_name>mnode12</ip_name>
        <ip_name>mnode13</ip_name>
        <ip_name>mnode14</ip_name>
        <ip_name>mnode15</ip_name>
      </ip_list>
      <human_owner>uwb</human_owner>
      <cpu_speed>3200</cpu_speed>
      <cpu_arch>intel</cpu_arch>
      <cpu_count>1</cpu_count>
      <memory>512</memory>
      <os_type>linux</os_type>
      <disk_space>30</disk_space>
      <cpu_load
28. lace sh a script to launch a UWAgent daemon locally boUWPlaces shy to launch a UWAgent daemon in background SSNPS Sl cesce to start bgUWPlace sh at remote machines ker SA Sb sceeee to kill a local UWAgent daemon i e UWPlace tle rruncr onh no sho a script internally used by sshTrunc sh SSRK INES ea oer set emcee a script to kill remote UWAgent daemons Sis niasial fesihiay Mme aaa hence to view the tail of remote daemons log SHAS SG Rb oe a script to delete remote daemons log SSMUCUC SIN Bos hon a script to truncate remote daemon s log SSMUMMOMEOTINS Sl son geaad a script to monitor remote agents compileAndPackUWAgent sh a script to compile UWAgent compileAndPackAgents sh a script to compile all agents compileAndPackAteam sh a script to compile APIs ComoslleyNncleevelMeU SI so y550 506 a script to compile MPI Java COL levNoChecvelMGwIE SSI a a script to compile GUI compileAndPackBenchmark sh to compile benchmark programs compileAndPackApplications sh to compile all applications Clelii out kevNacheeNolOMbILSSial Ba Soo coo eased demos to compile all files gendJavabDocsshi oesoeososoe a script to generate Java documents cleanclasskitesr sh yar ae a script to delete all class files puneieTthreadr Sel bona ose to launch a file transfer process LAwINCGWGUUSN E a script to launch a SubmitGUI process TAUIMIDIBINCIEMIEV Sl E to inform XDBase of node recovery SINMCCONMMDI Sl no dcaaoea a script to shutdown
29. m cyclett else ateam isResumed System out println isResumed Compute a Sample object in both master and slaves program new Sample ateam registerLocalVar program program System out println takeSnapshot main ateam takeSnapshot 0 start the computation program Computel Terminate the MPI library MPJ Finalize Its compilation should be done with the following scripts Compile Sample java and create its jar on bash cd S ATeam applications Sample javac cp SAl jar cvf Samp Team jars MPJ jar SATeam jars Ateam jar Sample java le jar Sample class Compile Sample java and create its jar on DOS prompt Cd ATeam applications Sample javac cp AT jar cvf Samp Teams jars MPJ jar SATeam jars Ateam jar Sample java le jar Sample class 3 2 Job Injection from submitGUI If it is your very first time to submit a job to AgentTeamwork through its GUI you have to set up your java policy file so as to run the GUI as an applet correctly Cut and past the following policy example and save it as the java policy file under your home directory i e java policy on bash and C Documents and Settings USERNAME java policy on Windows Example java policy on Mac OS X Replace file Volumes Users Users mickey with your home directory For example For example grant codeBase on Windows file C Documents and
30. multiplier RC A list of classes that a resource agent will XCollection_SensorAgent_DatabaseMan agementService_Service_DeletionServic e RetrievalService_QueryService_XPath Util StoreService_SensorAgent 1_Sens orAgent RemoteRscProbeTask_Ttcp_T tcp Connect_Option_StopWatch This option is always fixed For running a job with AgentTeamwork repeatedly it is highly recommended to create and run a shell script that includes the above parameters An example shell script bin sh cd HOME agentteamwork dev java Xmx512M cp jars UWAgent jar jars Agents jar jars GUIUtil jar UWAgent UWInject localhost AgentTeamwork Agents CommanderAgent 12345 m 4 u AgentTeamwork Agents j jars GUIUtil jar jars Agents jar jars Ateam jar jars MPJ jar jars commons mec 4 i jar jars Jjakarta 0ro 2 0 8 jar jars xalan jar xercesImple jar applications applications jar U_Wave2DAteam 448 3000 200 show DERTIEN Ge Lii y R_ dione 26 RQ total 2 B_ tarvos RA TED WieLjOCG COM ACPimicieSeumiyoiehe Kes IL UNIS RC_XCollection SensorAgent DatabaseManagementService Service DeletionServic _RetrievalService QueryService XPathUtil StoreService SensorAgent 1 Senso rAgent SRemoteRscProbeTask Ttcp Ttcp Connect Option StopWatch This example shell script injects a commander agent through IP port 12345 local to where a user currently logs in The application is Wave2Ateam that takes 448 3000 200 an
31. to a different computing node automatically. If you do not use the database (thus skipping pages 2 and 3), specify the computing nodes that will accept a bookkeeper agent. The list of bookkeeper agent nodes must be delimited with "_" rather than a comma or a space. Click Finish if you use the database; otherwise click Next.

[Figure: Bookkeeper agent selection window. The number of bookkeeper agents to dispatch is given as the RB option; the locations where they are dispatched are given in the B option field. The RB option cannot be specified if you do not use a resource agent.]

7. Page 7: specify the number of cluster systems to allocate to your job. Enter the number of clusters used for job execution. Optionally, you may also specify the number of extra clusters for cluster resumption. If you use computing nodes in the same domain as the one you are working in, fill these fields with 0 and skip to page 9.

[Figure: Number of clusters selection window. Fields: number of clusters to which agents are assigned, and optional extra clusters.]

8. Page 8: specify clus
32. permission java.io.FilePermission "<<ALL FILES>>", "read"; };

Now run launchGUI.sh, which internally starts AgentTeamwork's job submission GUI.

Start launchGUI.sh:
    launchGUI.sh

Follow the GUI's menus as shown below.

1. Page 1: enable/disable the resource agent.

[Figure: Resource agent selection window. Fields: a "use a resource agent" check box and the location where the resource agent is dispatched (medusa in the example).]

If you have invoked AgentTeamwork's XML resource database and use a resource agent, check the "use a resource agent" check box and specify the IP address where the XML database is running. If you don't use the resource database, leave the box unchecked to skip to page 4.

2. Page 2: decide resource requirements such as OS, CPU, disk, memory, etc. If you have checked the "use a resource agent" check box on page 1, decide the OS type, CPU architecture, disk space, memory space, CPU cores per node, percentage of CPU idle, a choice of cluster computers or individual public computers, CPU speed, total computing nodes, when to run the user job (e.g., 8:55pm 34sec = 205534, now = 0), and extra nodes for resumption purposes (RN). If necessary, you may specify the IP addresses of the computing nodes you would like to use. The minimum requirement is total computing nodes, denoted a
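The start-time example above (8:55pm 34sec = 205534, now = 0) reads as a 24-hour clock time written as six digits. The sketch below encodes a time that way. StartTimeField is a hypothetical name, and the assumption that the GUI expects leading zeros for early-morning times is not confirmed by the manual.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper that formats the "when to run the user job" value. */
final class StartTimeField {
    // 24-hour clock, two digits each for hour, minute, and second
    private static final DateTimeFormatter HHMMSS = DateTimeFormatter.ofPattern("HHmmss");

    /** Returns "0" for "run now" (null), otherwise the time as HHmmss. */
    static String encode(LocalTime startTime) {
        return startTime == null ? "0" : startTime.format(HHMMSS);
    }

    public static void main(String[] args) {
        System.out.println(encode(LocalTime.of(20, 55, 34))); // the manual's 8:55pm 34sec example
        System.out.println(encode(null));                     // run immediately
    }
}
```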
33. p -d ateam.tar.gz
3. Extract all files:
    tar xvf ateam.tar
4. Set up your .bashrc file:
    ATeam=$HOME/agentteamwork
    export ATeam
    PATH=$PATH:$ATeam/scripts
5. Reinitialize .bashrc:
    source .bashrc
6. Recompile the source if you want to revise AgentTeamwork to Java 1.6:
    $ATeam/scripts/compileAndPackAll.sh

Windows XP users:
1. Unzip and extract all files with 7-Zip, Explzh, or any available Windows-based file archiving tool.
2. Move the ateam folder to C:\Documents and Settings\USERNAME.
3. Set up the ATeam and PATH environment variables by clicking Environment Variables in My Computer's Properties menu or by using Rapid Environment Editor:
    set ATeam=C:\Documents and Settings\USERNAME
    set path=%path%;%ATeam%\bat
4. Recompile the source if you want to revise AgentTeamwork to Java 1.6:
    %ATeam%\bat\compileAndPackAll.bat

Linux users at UW1-320 (UW Bothell): No need to download or recompile the system.
1. Simply set up your .bashrc as follows:
    ATeam=/home/uwagent/agentteamwork
    export ATeam
    PATH=$PATH:$ATeam/scripts
2. Reinitialize .bashrc:
    source .bashrc

Mac24 users at Keio SFC: No need to download or recompile the system. Note that Mac users need to run bash.
1. Simply set up your .bashrc as follows:
    ATeam=/home/mfukuda/CNSiMac/agentteamwork
    export ATeam
    PATH=$PATH:$ATeam/scripts
2. Reinitialize .bashrc:
    source .bashrc

Note that this installation should be done only on
34. s total.

Recommendations: Fill in only this total field if you would like to maximize the chance of finding the best computing resources in an XML resource database. Choose "public" in the choice field if you use computing nodes within the same IP domain as the one you are working in.

[Figure: Specification selection window. Fields: operating system, CPU speed (GHz), CPU architecture, total, disk (GB), time, memory (MB), RN, ip, CPU count, CPU load, bandwidth (Mbps).]

3. Page 3: specify an FTP server. If you have filled out resource requirements on page 2, you will come to this page to specify the FTP server that your resource agent will contact.

Users at UW Bothell: FTP server ftp.tripod.com, account agentteamwork
Users at Keio SFC: FTP server ftp.tripod.com, account mdbl sfc

Ask us for the corresponding password at dslab@u.washington.edu. For the time being, enter 1 in the Monitoring Period field.

[Figure: FTP server selection window. Fields: FTP server, account, password.]
35. [Figure 11: Standard output display. The remote node information shows a computing agent on the public network with current status "running on uw1-320-21 (69.91.198.172)".]

2. Standard output display: Since standard output is sent only from computing nodes, the output displays of the Commander agent and gateway agents show no output.

3. Standard input field: Standard input is sent to the corresponding agent immediately when the user types data into a standard input field.

4. Remote node information display: This display shows the dispatched agent's id, rank, type, and current status. The agent type shown is one of Commander agent, Root Sentinel agent, Gateway agent, Computing agent on cluster network, and Computing agent on public network. If an agent is an extra agent intended for resumption, "extra" is added to the type information as a prefix. The current status varies with the agent's state. When an agent has not yet been dispatched to a remote node, the status is "not dispatched yet", as shown in Figure. Agent id and rank are calculated prior to dispatch unless the agent type includes the prefix "extra", in which case the agent id, rank, and status information is shown as in Figure. When an agent does crash, the extra agent takes over the id, rank, and ultimately the status of the crashed agent.

Remote Node Information
36. <cpu_load>100</cpu_load>
      <availability multiple="true">
        <time>0000-1159</time>
        <time>1200-2359</time>
      </availability>
      <time_zone>pacific</time_zone>
      <inter_net_device>ethernet</inter_net_device>
      <intra_net_device>ethernet</intra_net_device>
      <libraries multiple="true">
        <lib_name>java</lib_name>
        <lib_name>prunjava</lib_name>
      </libraries>
      <inter_net_band>100</inter_net_band>
      <intra_net_band>1000</intra_net_band>
    </group>
  </design_time>
</cluster>

For running AgentTeamwork in the UWB domain, connect to ftp.tripod.com (username: agentTeamwork) and email dslab@u.washington.edu for the password. At present, the FTP server registers the following XML files for use within UW Bothell, so you have no need to upload or modify XML files.

Public directory: mnode0.xml, mnode1.xml, mnode2.xml, mnode3.xml, mnode4.xml, mnode5.xml, mnode6.xml, mnode7.xml, uw1-320-00.xml, uw1-320-01.xml, uw1-320-02.xml, uw1-320-03.xml, uw1-320-04.xml, uw1-320-05.xml, uw1-320-06.xml, uw1-320-07.xml, perseus.xml, tarvos.xml

Cluster directory: cl-medusa-8-15.xml, cl-uw1-320-08-15.xml

For outside domains, establish a public FTP server that AgentTeamwork can access so as to maintain its XML database there. The FTP account must have two directories named Cluster and
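A cluster definition like the one above can be inspected with the standard Java DOM API before it is uploaded. The sketch below lists the node names in a definition. ClusterXmlReader is a hypothetical class, and the element names (ip_name, group, cluster) are assumptions reconstructed from the sample file rather than a confirmed schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical reader that extracts node names from a cluster XML definition. */
final class ClusterXmlReader {
    /** Returns the text of every ip_name element (assumed element name), in document order. */
    static List<String> nodeNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList names = doc.getElementsByTagName("ip_name");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < names.getLength(); i++) {
                result.add(names.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            // Malformed XML or parser configuration problems end up here
            throw new IllegalArgumentException("cannot read cluster definition", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<cluster><design_time><alias>medusa</alias><group>"
            + "<ip_name>mnode8</ip_name><ip_name>mnode9</ip_name>"
            + "</group></design_time></cluster>";
        System.out.println(nodeNames(xml));
    }
}
```

A malformed file raises an exception here, which is a cheap way to catch an unclosed tag before the resource agent tries to read the file from the FTP server.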
37. [Figure 16: Aborting the current user job. The window shows the status "not dispatched yet" and the Abort User Job and Submit User Job buttons.]

4.4 Monitoring and Terminating UWPlace

Although you should use SubmitGUI to control the agents engaged in your current job, you may also want to manage remote UWPlace daemons (i.e., the underlying agent execution platforms) so as to monitor their status, terminate them if they get stuck, and restart them. The following shell scripts and batch files are available for these purposes.

UWMonitor.sh / UWMonitor.bat
    sshUWPlace.sh <port> <as | kill agentId | suspend agentId | resume agentId | help>
This script/batch file shows the status of agents or controls a given agent on the local UWPlace daemon.

    Option             Remarks
    as                 Shows the status of all local agents.
    kill agentId       Terminates a given agent. An agent id is obtained with as.
    suspend agentId    Suspends the thread that is executing a given agent.
    resume agentId     Resumes the thread that is executing a given agent.
    help               Lists all the available options of the UWMonitor.sh script.

sshUWMonitorAs.sh / sshUWMonitorAs.bat
    sshUWPlace.sh <port> <f filename | ipName ... ipName>
This script/batch file shows the status of all agents running on each of the remote computers that are listed in filename or given as IP names. More specifically, it runs UWMonitor port as at each of these remote computers
38. clusters to allocate to your job. The cluster options include the cluster name, the gateway IP name, and each computing node's IP name. If you would like to provide additional nodes of a given cluster to assist with job resumption in case of node crashes, the cluster name and gateway IP given with the additional nodes must match the cluster name and gateway IP of that cluster.

[Figure: Cluster selection window. Fields: cluster name (medusa), gateway ip (medusa), computing node alias (mnode0, mnode1, mnode2), with remove and add node buttons.]

9. Page 9: specify all individual computing nodes to allocate to sentinels, and thus to your job. The sentinel agent options include the root sentinel agent's IP name, the desktop computing nodes' IP names, and optionally extra desktop computing nodes' IP names for resumption purposes. At least two sentinel agent IP names are required: the first is for the root sentinel and the second is for the rank 0 sentinel. Use "_" to delimit IP names. After filling out those fields, click the Finish button.

[Figure: Sentinel agent selection window. SentinelAgent option: priam_uw1-320-20_uw1-320-21. Extra option: uw1-320-31. Usage: when using two or more IPs, join them as ip_other ip_...]
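The page 9 rules (at least two IP names, root sentinel first, "_" as the delimiter) are easy to check before typing the value into the GUI. The following is a sketch only: SentinelOption is a hypothetical helper, not an AgentTeamwork class, and rejecting names that contain "_" or a space is an assumption that follows from the delimiter rule.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds the underscore-delimited sentinel option value. */
final class SentinelOption {
    /** Joins IP names with '_'; the first name is the root sentinel, the second is rank 0. */
    static String build(List<String> ipNames) {
        if (ipNames.size() < 2) {
            throw new IllegalArgumentException(
                "need at least a root sentinel and a rank 0 sentinel");
        }
        for (String name : ipNames) {
            // A name containing the delimiter or a space could not be split back correctly
            if (name.isEmpty() || name.contains("_") || name.contains(" ")) {
                throw new IllegalArgumentException("invalid IP name: '" + name + "'");
            }
        }
        return String.join("_", ipNames);
    }

    public static void main(String[] args) {
        System.out.println(build(Arrays.asList("priam", "uw1-320-20", "uw1-320-21")));
    }
}
```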
39. tions Sample for your convenience This program repeats writing a greeting message 10 times to each remote node s terminal or tmp log file as well as to AgentTeamwork s GUI window Sample java mport AgentTeamwork Ateam AgentTeamwork WNmeOiete MOP F mpijava import java net for InetAddress public class Sample extends AteamProg private int cycle 0 blank const iow Aree public Sample Ateam o pubike anp eC public void compute ioe alc a Of a lt dOe ala ee A write to each node local terminal or tmp s log file System out println Rank MPJ COMM WORLD Rank W e Beilio va InetAddress getLocalHost getHostName write to SubmitGUI AteamProg ateam gridfile writeStdout Rank MPJ COMM WORLD Rank He Hieiilet w r InetAddress getLocalHost ge cios Nane ap WY ia p Thread currentThread sleep 1000 aream eeih lt SrSiaeyorsiavoyc Cycle 5 catch Exception e e printStackTrace Sample java is a simple program in that each rank prints out its own greeting message 10 times a public static void main String args throws Exception Start the MPI library MPJ Init args ateam program instantiation or retrieval Sample program mwili if ateam isResumed System out println isResumed program Sample ateam retrieveLocalVar program progra