Athapascan: an API for Asynchronous Parallel Programming, User's Guide

Contents

1. Figure 6: Execution graph corresponding to a 0 b 1 and h Z

   9.3 Scalar Product

   This example shows the use of a cumulative shared and of an array of parameters. Parameter arrays are recursively split until their sizes are 1. Then the result is accumulated in a shared data.

   Makefile:

       include ${A1_MAKEFILE}
       all: pscal
       pscal: pscal.o
       clean:
               rm *.o pscal

   Source:

       #include "athapascan-1.h"
       #include <stdio.h>
       #include <stdlib.h>

       // Cumulative function: adds b into a.
       class add {
       public:
         void operator()( int& a, const int b ) const { a += b; }
       };

       struct pscal {
       public:
         const char* graph_name() const { return "pscal"; }

         void operator()( Param_array< Shared_r< int > > x,
                          Param_array< Shared_r< int > > y,
                          Shared_cw< add, int > res ) {
           cout << "debut pscal" << endl;
           cout << "on " << a1_system::self_node() << " " << x.size() << endl;
           int n = x.size();
           if( n == 1 ) {
             // Arrays of size 1: accumulate the product into the result.
             res.cumul( x[0].read() * y[0].read() );
           } else {
             // Otherwise split both arrays into two halves.
             Param_array< Shared_r< int > > x1( n/2 );
             Param_array< Shared_r< int > > x2( n/2 + n%2 );
             Param_array< Shared_r< int >
2. cel1 state c 1 ifndef FORCE IO define _FORCE_I0 ai 0Stream amp operator lt lt ai 0Stream amp ostr const force amp A ostr lt lt A intensity return ostr ai IStream amp operator gt gt al IStream amp istr force amp A istr gt gt A intensity return istr endif ifndef CELL IO define CELL IO al 0Stream amp operator lt lt ai 0Stream amp ostr const cell amp A ostr lt lt A info state lt lt A info time lt lt A info my_x lt lt A info my_y return ostr ai IStream amp operator gt gt al IStream amp istr cell amp A istr gt gt A info state gt gt A info time gt gt A info my_x gt gt A info my_y return istr H endif output gt Recevoir c sizeof cell_state std cout lt lt OuptutPrintCell lt lt std endl T Main of the program int doit ai Community com int argc char argv int ny atoi argv 1 int nx atoi argv 2 std cout lt lt GameLife with ny lt lt ny lt lt nx lt lt nx lt lt std endl int display frequency atoi argv 3 if display frequency lt 1 display_frequency 1 int global synchronization frequency atoi argv 4 if global_synchronization_frequency lt 1 global_synchronization_frequency 1 Initial state of the cells al Shared lt cell gt board new al Shared lt cell gt ny al Shared lt force gt forces new
3. Pay attention to the type of the effective parameters when performing a sequential call of a struct. The type of the effective parameters must exactly match the type of the corresponding formal parameters. For example, a Shared parameter cannot be used where a Shared_r parameter is required. However, when Forking a task, this behavior is allowed (see the page on passing Shared parameters in ATHAPASCAN to learn more).

   A task is an object of a user-defined type that is instantiated with Fork:

       a1::Fork< user_task >()( [effective parameters] );  // user_task is executed asynchronously

   The synchronizations are defined by the accesses on the shared objects; the semantics respect the ...

   Example: the task hello_world displays a message on the standard output.

       struct hello_world {
         void operator()() {  // No parameters here
           cout << "Hello world" << endl;
         }
       };

       int doit() {
         hello_world()();              // immediate call, not asynchronous
         a1::Fork< hello_world >()();  // Creation of a task executed asynchronously
         a1::Fork< hello_world >()();  // Creation of another task executed asynchronously
         return 0;
       }

       int main( int argc, char** argv ) {
         // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
         return 0;
       }

   Remark (encapsulating a C procedure): Obviously, a C procedure (i.e. a C function with void as return type) can be directly encapsulated in a C++ function class and
4. Computation of the force from each neighbour of the cell board i j Loop on the neighbours of each cell if i 0 al Fork lt force integration_task gt OCR board i j forces i j board i 1 j if i 0 amp amp j 0 ai Fork lt force integration_task gt OCR board i j forces i j board i 1 j 1 if j 0 al Fork lt force integration_task gt OCR board i j forces i j board i j 1 if i ny 1 amp amp j 0 ai Fork lt force integration_task gt OCR board i j forces i j board it 1 j 1 if i sny 1 ai Fork lt force integration_task gt OCR board i j forces i j board L D if Cif ny 1 amp amp j nx 1 al Fork lt force integration_task gt OCR board i j forces i j board i 1 j 1 if j nx 1 ai Fork lt force integration_task gt OCR board i j forces i j board i j 1 if i 0 amp amp j nx 1 al Fork lt force integration_task gt OCR board i j forces i j board i 1 j 1 T 2 Application of the force to each cell Application of the force for int i 0 i lt ny i for int j 0 j lt nx j v9 try ail Community com al System create_initial_community argc argv std cout lt lt count argc lt lt argc lt lt std endl if argc 5 for int i 0 i lt argc i std cout lt lt argv i lt lt std endl std cerr lt lt Usage lt lt argv 0
5. 40 ATHAPASCAN 39 TL swap two elements of a shared array template lt class T gt struct swap shared array void operator al Shared r w lt myArray lt T gt gt tab int il int i2 T temp temp tab access elts il 50 tab access elts il tab access elts i2 tab access elts i2 temp F print the data of a shared array to standard output template lt class T gt struct ostream_shared_array void operator al Shared_r lt myArray lt T gt gt tab unsigned int size tab read size 60 for int i 0 i lt size i cout lt lt tab read elts i lt lt n H template lt class T gt class shared_array public al Shared lt myArray lt T gt gt public constructors shared_array al Shared lt myArray lt T gt gt new myArray lt T gt shared_array unsigned int size al Shared lt myArray lt T gt gt new myArray lt T gt size 70 void resize unsigned int newSize al Fork lt resize_shared_array lt T gt gt al Shared lt myArray lt T gt gt amp this newSize void operator const myArray lt T gt amp a al Fork lt equal_shared_array lt T gt gt al Shared lt myArray lt T gt gt amp this a void append shared_array amp t2 80 al Fork lt append_shared_array lt T gt gt al Shared lt myArray lt T gt gt amp this al Shared lt myArray lt T gt gt amp t2 void swap int il int i2 al Fork
6. Example:

       struct print_A_Shared {
         void operator()( a1::Shared_r< int > T ) {
           printf( "The Shared data parameter has the value %d", T.read() );
         }
       };

       int doit( int argc, char** argv ) {
         a1::Shared< int > myShared( new int( 10 ) );
         a1::Fork< print_A_Shared >()( myShared );
         return 0;
       }

       int main( int argc, char** argv ) {
         // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
         return 0;
       }

   • communicable type: Only serialized classes can be communicated. The standard types (short, unsigned short, int, unsigned int, long, unsigned long, char, unsigned char, float, double, long double, char*, void*) and the standard container classes of the STL are already communicable by default. User-defined classes may be complex. It is therefore necessary to explain to the library how these classes should be serialized. These classes and types have to be explicitly made communicable by implementing specific class methods (cf. Chapter).

   4.1.3 Shared Object

   A task can access only the objects it has declared or received as effective parameters. The access to a shared object can be:
   • read only: a1::Shared_r< class T >
   • write only: a1::Shared_w< class T >
   • cumulative write: a1::Shared_cw< class F, class T >
   • read-write: a1::Shared_r_w< class T >

   NB1: A shared object can implement only communicable classes.
   NB2: A pointer given as parameter has to be cons
7. include lt sys wait h gt include lt sys uio h gt include lt netinet in h gt include lt netdb h gt include lt strings h gt 10 d YOON include lt math h gt include lt stdlib h gt include Message h using namespace std Fonction permettant d eliminer les processus de service quand ils se terminent il suffit que le serveur ignore le signal SIGCLD void Message sigchld wait3 p status options p_usage while wait3 NULL WNOHANG NULL gt 0 T Fonction de creation d une socket le parametre est le numero du port souhaite le numero de port sera envoye en resultat int Message CreerSock int sport int type Adresse de la socket struct sockaddr_in nom Longueur de la socket unsigned int longueur Creation de la socket socket domaine type protocole if _desc socket AF_INET type 0 1 perror Creation de la socket impossible exit 2 H Initialisation a 0 de la zone memoire d adresse et de taille donnees bzero char amp nom sizeof nom bzero char amp nom sizeof nom Numero du port non sin port htons port Adresse Internet nom sin_addr s_addr INADDR_ANY Famille de l adresse nom sin family AF INET AF INET Nommage de la socket bind sock p adresse lg if bind _desc struct sockaddr amp nom sizeof nom 0 perror
8. As you reach this point, you should now be able to write, compile and run a simple ATHAPASCAN-based program. If you desire to write a real-life, more complex application, you must further study the ATHAPASCAN library. There are many useful concepts that have not yet

   5 Tasks

   The granularity of an algorithm is explicitly given by the programmer through the creation of tasks that will be asynchronously executed. A task is an object representing the association of a procedure and some effective parameters. Tasks are dynamically created at run time. A task creation in ATHAPASCAN can be seen as a standard procedure call. The only difference is that the created task's execution is fully asynchronous: the creator does not wait for the created task to finish before continuing its own execution. An ATHAPASCAN program can be seen as a set of tasks scheduled by the system and distributed among nodes for execution.

   5.1 Task's Procedure Definition

   A task corresponds to the execution of a C++ function object, i.e. an object from a class having the void operator() defined:

       struct user_task {
         void operator()( [...parameters...] ) {
           [...body...]
         }
       };

   A sequential (hence not asynchronous) call to such a function class is written in C++:

       user_task()( [...effective parameters...] );  // user_task is executed according to depth-first ordering
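To make the difference between a sequential call and a task creation concrete, here is a minimal sketch in standard C++. It does not use the ATHAPASCAN library: `task_queue`, `fork`, `run_all`, `greet_task` and `demo` are hypothetical names introduced for illustration, and the queue is only a single-threaded stand-in for a runtime that would schedule tasks across nodes.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Execution trace filled in by the tasks of this sketch.
static std::vector<std::string> trace;

// A task body is a function object: a class defining void operator().
struct greet_task {
    void operator()(const std::string& who) const { trace.push_back("hello " + who); }
};

// Hypothetical stand-in for a runtime: created tasks are queued, not run.
class task_queue {
    std::deque<std::function<void()>> pending_;
public:
    // Associates a procedure with its effective parameters and returns at once.
    template <class Task, class... Args>
    void fork(Args... args) {
        pending_.push_back([=] { Task()(args...); });
    }
    // Executes the queued tasks in creation order.
    void run_all() {
        while (!pending_.empty()) {
            std::function<void()> task = std::move(pending_.front());
            pending_.pop_front();
            task();
        }
    }
};

std::vector<std::string> demo() {
    trace.clear();
    task_queue rt;
    greet_task()("sequential");                  // sequential call: runs immediately
    rt.fork<greet_task>(std::string("forked"));  // task creation: nothing runs yet
    trace.push_back("creator continues");        // the creator goes on without waiting
    rt.run_all();                                // the forked task runs only now
    return trace;
}
```

Calling `demo()` records the sequential call first, then the creator's own step, and the forked task last, which is the ordering freedom that asynchronous task creation gives to the scheduler.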
9. Envoi des caracteristiques des blocs au programme Glut void Message EnvoiMsgInit int nx int ny Message envoye au programme Glut taille_bloc ad cell state MsgGlut Preparation du message pour le programme de visualisation MsgGlut my_x nx MsgGlut my_y ny MsgGlut time 0 MsgGlut state 0 Envoi du message au programme de visualisation if Envoyer cell_state amp MsgGlut sizeof cell_state 0 ORIGINAL if Envoyer char amp MsgGlut sizeof MsgGlut 0 fprintf stderr Envoi des parametres incorrect n exit 2 Message cpp eessen c projet SAPPE Author F Zara ORIGINAL char buf iy Sgen e file Message h Definition des messages envoyes entre le programme lut et le programme Athapascan 1 ifndef _MESSAGE_H define _MESSAGE_H Librairies de base include lt strings h gt include lt stdio h gt Pour 1 emploi des sockets include lt unistd h gt include lt sys types h gt include lt sys ipc h gt include lt sys sem h gt include lt sys socket h gt include lt netinet in h gt include lt fcntl h gt 10 d YOON include lt sys ioctl h gt include lt sys types h gt include lt pthread h gt include cell_state cpp Message envoye par le programme Ai vers le programme de visualisation typedef struct int nx ny msg_init Structure de donnees
10. Athapascan: an API for Asynchronous Parallel Programming, User's Guide
    Jean-Louis Roch, Rémi Revire, Thierry Gautier

    To cite this version: Jean-Louis Roch, Rémi Revire, Thierry Gautier. Athapascan: an API for Asynchronous Parallel Programming, User's Guide. Research Report RT-0276, 2003, pp. 77. <inria-00069901>

    HAL Id: inria-00069901
    https://hal.inria.fr/inria-00069901
    Submitted on 19 May 2006

    HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

    INRIA (Institut National de Recherche en Informatique et en Automatique), INRIA Rhône-Alpes
    Technical report N° 0276, February 2003, Theme 1
    Authors as listed on the report cover: Jean-Louis Roch, Thierry Gautier, Rémi Revire
11.       double y;
          // empty constructor
          complex() { x = 0; y = 0; }
          // copy constructor
          complex( const complex& z ) { x = z.x; y = z.y; }
          // destructor
          ~complex() {}
        };

        // packing operator
        a1::Ostream& operator<<( a1::Ostream& out, const complex& z ) {
          out << z.x << z.y;
          return out;
        }

        // unpacking operator
        a1::Istream& operator>>( a1::Istream& in, complex& z ) {
          in >> z.x >> z.y;
          return in;
        }

    Figure 1: Making the user-defined class complex communicable

    6.3.2 Example 2: Basic List with Pointers

    Let's go a bit deeper into serialization and find out how to write a communicable class implementing a dynamic data structure based upon a list of pointers.

    NB: Even though the container classes from the STL are optimized and have been made communicable, there is little use for these classes in a real-life ATHAPASCAN application. Therefore the class in this example is not optimized; it is just an example.

    The following class implements a chain structure using pointers. When running a parallel application on a cluster of machines, it is meaningless to communicate addresses. Therefore only values are communicated:
    • a1::Ostream: we send first the number of values, then the values themselves, but
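The "count first, then values" packing scheme described above can be sketched with standard C++ streams. This is an illustration under assumptions, not the library's code: `std::ostream` and `std::istream` stand in for the ATHAPASCAN stream types, and `node`, `int_list`, `pack` and `copy_via_stream` are hypothetical names.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Minimal singly linked list; node addresses are meaningless on another machine.
struct node { int value; node* next; };

struct int_list {
    node* head = nullptr;
    node* tail = nullptr;
    std::size_t count = 0;
    int_list() = default;
    int_list(const int_list&) = delete;             // keep ownership simple in this sketch
    int_list& operator=(const int_list&) = delete;
    ~int_list() { while (head) { node* n = head; head = head->next; delete n; } }
    void push_back(int v) {
        node* n = new node{v, nullptr};
        if (tail) tail->next = n; else head = n;
        tail = n;
        ++count;
    }
};

// Packing: the number of values first, then the values; no pointer is written.
std::ostream& operator<<(std::ostream& out, const int_list& l) {
    out << l.count;
    for (const node* n = l.head; n; n = n->next) out << ' ' << n->value;
    return out;
}

// Unpacking: read the count, then rebuild the chain locally with fresh nodes.
std::istream& operator>>(std::istream& in, int_list& l) {
    std::size_t n = 0;
    in >> n;
    for (std::size_t i = 0; i < n; ++i) {
        int v = 0;
        in >> v;
        l.push_back(v);
    }
    return in;
}

// Returns the packed form of a list as text.
std::string pack(const int_list& l) {
    std::ostringstream out;
    out << l;
    return out.str();
}

// Round trip through an in-memory stream, standing in for a network transfer.
void copy_via_stream(const int_list& src, int_list& dst) {
    std::stringstream wire;
    wire << src;
    wire >> dst;
}
```

Sending the count first lets the receiver know how many values to read before it allocates its own nodes, so the rebuilt list holds the same values at different addresses.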
12. if sock NULL Nom du serveur if sock gt connect algonquin 60125 lt 0 if sock gt connect oglala 60125 lt 0 if sock gt connect koguis 60125 lt 0 if sock gt connect node2 cluster 4245 lt 0 if sock gt connect node3 cluster 4245 lt 0 if sock gt connect node4 cluster 4245 lt 0 std exit 1 cell state msgInit ORIGINAL if sock gt recv char amp msgInit sizeof msgInit lt 0 std exit 1 if sock gt recv cell_state amp msgInit sizeof msgInit lt 0 std exit 1 nx msglnit my_x ny msglnit my_y cells new cell_state next_cells new cell_state ORIGINAL if sock gt recv char cells nx ny sizeof cell_state lt 0 std exit 1 if sock gt recv cell_state cells sizeof cell_state lt 0 std exit 1 d cout lt lt INITIAL BLOCK OF CELLS HAVE BEEN RECEIVED lt lt std endl cell_state ok ok state 1 ok time 1 ok my_x 1 ok my_y 1 ORIGINAL sock gt send amp ok sizeof char sock gt send amp ok sizeof char sock gt send amp ok sizeof char sock gt send amp ok sizeof char on autorise 3 images d avance sock gt send amp ok sizeof cell_state sock gt send amp ok sizeof cell_state sock gt send amp ok sizeof cell_state sock gt send amp ok sizeof cell_state on autorise 3 images d avance else nx ny 16 cells new cell_state nx ny for int i 0 i lt nx i for int j 0
13. 0 c x y r _h 2 j r _h 2 if rij w rij h gt 0 for int i 0 i lt it i L Fork lt mandel gt rij z nb_col z c z pow n T H if z abs2 gt 4 H color 1 int floor double i it nb col break int doit int argc char argv return color int i_size atoi argv 1 int thr atoi argv 2 int it atoi argv 3 void compute_region win region r zone z Param_array lt Shared lt int gt gt amp col int nb_col int nb_col atoi argv 4 int i j k zone zO 2 1 2 1 2 1 2 1 i size i size it thr 2 win_gest init for j r _y k 0 j lt r y r _h j for i 4r _x i lt r _x r _w 3 itt ktt int ok w_mand init z0 20 nb_col col k Shared lt int gt new int xy2color z xi z scale_x i z _yi z scale_y j okl w_proc init z0 20 al system node count Z _pow z _it nb_col al system thread_count if ok 0 struct display_region cerr lt lt not enough colors lt lt endl const char graph_name const return display else zone z z0 void operator int node int thread win region r Param_array lt Shared r lt int gt gt col w_proc clean w_mand clean 1D 9 YOON w nand help while z empty cerr precision 15 cerr lt lt endl lt lt Mandelbrot z c z lt lt z _pow lt lt endl lt lt zone lt lt z lt lt endl lt
14. 27 10 20 30 40 50 60 70 28 al Ostream amp operator lt lt al Ostream amp out const myList lt T gt amp z myList lt T gt x z out lt lt x size while z next 0 out lt lt z value x x next out lt lt x value return out unpacking operator template lt class T gt al Istream amp operator gt gt al Istream amp in myList lt T gt amp z int size int temp 0 in gt gt size for int i 0 i lt size i in gt gt temp z push_back temp return in test tasks to see if the class is communicable struct myTaskW void operator al Shared_r_w lt myList lt int gt gt x for int i 0 i lt 100 i x access push_back i H struct myTaskR void operator al Shared_r_w lt myList lt int gt gt x myList lt int gt z x access while z next 0 cout lt lt z pop front lt lt H int doit int argc char argv al Shared lt myList lt int gt gt x new myList lt int gt al Fork lt myTaskW gt x al Fork lt myTaskR gt x return 0 Roch amp al 80 90 100 110 120 SCC Ae ATHAPASCAN 29 6 3 3 Example 3 Resizable Array A simple example of a dynamic structure is a mono dimensional array with two fields a size size and a pointer to an array with size number of elements tinclude lt iostream h gt include lt stdlib h gt include lt stri
15. Nomage de socket impossible exit 3 H Longueur de la socket longueur sizeof nom Recuperation de l adresse getsockname sock p_adr p_lg if getsockname _desc struct sockaddr amp nom amp longueur 0 perror Qbtention du nom de la socket impossible exit 4 NVOSVdVHLYV Passage de la representation reseau d un entier a sa representation locale sport ntohs nom sin_port Renvoi du descripteur return _desc Creation de la socket d ecoute void Message SocketEcoute int port Adresse de la socket struct sockaddr_in adr Taille de 1 adresse de la socket unsigned int lgadr sizeof adr Creation de la socket d ecoute creersock int port type et attachement au port du service d une socket if _sock_ecoute CreerSock amp port SOCK_STREAM 1 fprintf stderr Creation liaison de socket impossible n exit 2 H Creation de la file de connexions pendantes Signale au systeme que le serveur accepte les demandes de connexion listen sock nb avec nb le nombre max de connexion pendantes listen _sock_ecoute 10 Taille de 1 adresse de la socket lgadr sizeof adr Extraction d une connexion pendante dans la file associee a la socket accept sock p_adr p_lgadr _sock_service accept _sock_ecoute struct sockaddr amp
16. buff _gc 0 0 old reg vg old reg h 0 0 setup_buffers region 0 0 width height _caption clean XCopyArea _dpy buff _buff ge 0 0 min old reg w _reg _w min old reg h _reg _h 0 0 XCopyArea _dpy _buff _xwin _gc 0 0 _reg _w reg ht caption 0 0 XFreePixmap _dpy buff _new_zone old zone win_proc_mand C _new_zone _xf old zone xi _old_zone scale_x reg w 1 _new_zone _yf old zone yi _old_zone scale_y _reg _h 1 new zone w reg Hi new zone reg hi Ce H return 0 include win proc mand h int win_proc_mand init zone z0 int caption int nb proc int nb threads Constructors return win_proc init region 0 0 z0 _w z0 _h caption nb proc nb threads Other virtual protected int win proc mand x resize int width int height 1 resize reg return 1 9 10 9 YOON ATHAPASCAN 57 Figure 8 Mandelbrot Set visualization main and mapping windows The execution was on two nodes each having three virtual processors 9 5 Matrix Multiplication This example shows the use of ATHAPASCAN for implementing a parallel application on matrix operations Matrix product and addition are implemented by classical bi dimensional block parallel algorithms Mes le ee pe ER RE EC 8S Makefile include A1_MAKEFILE Sequential Computation of C A B str
17.         > y1( n/2 );
            Param_array< Shared_r< int > > y2( n/2 + n%2 );
            for( int i = 0; i < n/2; i++ ) {
              x1[i] = x[i];  x2[i] = x[n/2 + i];
              y1[i] = y[i];  y2[i] = y[n/2 + i];
            }
            if( n%2 == 1 ) {
              // Odd size: the last element goes to the second half.
              x2[n/2] = x[n-1];
              y2[n/2] = y[n-1];
            }
            Fork< pscal >()( x1, y1, res );
            Fork< pscal >()( x2, y2, res );
          }
        }
      };

      // Checks the accumulated result against the expected value.
      struct verif {
      public:
        const char* graph_name() const { return "verif"; }

        void operator()( int val, Shared_r< int > x ) {
          cout << "debut verif" << endl;
          cout << "on " << a1_system::self_node() << " Task verif execution "
               << val << " " << x.read() << endl;
          if( val != x.read() )
            cout << "A1 TEST ERROR bad result " << val << " " << x.read() << endl;
        }
      };

      int doit( int argc, char** argv ) {
        cout << "in main" << endl;
        Shared< int > res( new int( 0 ) );
        res.graph_name( "res" );
        Param_array< Shared< int > > x( atoi( argv[1] ) );
        Param_array< Shared< int > > y( atoi( argv[1] ) );
        int val = 0;
        for( int i = 0; i < x.size(); i++ ) {
          x[i] = Shared< int >( new int( i ) );
          y[i] = Shared< int >( new int( 2*i ) );
          char name[10];
          sprintf( name, "x%d", i );  x[i].graph_name( name );
          sprintf( name, "y%d", i );  y[i].graph_name( name );
          val += 2*i*i;  // expected scalar product
        }
        cerr << "avant fork" << endl;
        Fork< pscal >()( x, y, res );
        cerr << "apres fork" <<
18. ... Asynchronous Parallel Programming: User's Manual

    Abstract: ATHAPASCAN is a macro data-flow interface for asynchronous parallel programming. This interface allows the description of parallelism between computation tasks that synchronize on accesses to objects through a distributed global memory. The parallelism is explicit and of functional type; the detection of synchronizations is implicit. The semantics are sequential, and a program written in ATHAPASCAN is independent of the parallel architecture (cluster or grid). The execution is based on an interpretation of the program that builds a macro data-flow graph. This directed acyclic graph describes the computations and the data dependencies (reads and writes); it is used by the runtime support to control the scheduling of computation tasks and the placement of data on the resources of the architecture. The implementation relies on threads and one-sided communications (active messages). This report presents the use of the ATHAPASCAN API as a C++ library.

    Keywords: parallel programming, macro data-flow, scheduling, cluster and grid computing

    1 Information and contacts

    The ATHAPASCAN project is still under development. We do our best to produce as good documentation and software as possible. Please inform us of any bug, malf
19. lt max iteration H lt lt z _it lt lt endl lt lt image size lt lt z _wm lt lt x lt lt z _h lt lt endl lt lt threshold M lt lt z _thr lt lt endl al work steal group my_grp al set default group my grp win region r 0 0 z _w z _h Fork lt mandel gt SchedAttributes 1 1 1 1 ai_work_steal basic r win_gest treatXEvents z w_mand new_zone w_proc win resize w mand reg w_proc clean H H win_gest terminate fflush stdout return 0 int main int argc char argv MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER return 0 z nb_col Hi Q ZO Q nclude types h ne zone _xi 0 yil 0 _xf O _yf 0 HC 0 hl 0 it 0 _thr O _pou 0 ne zone double xi double yi double xf double yf int w int h int it int thr int pow _xiC xi _yi yi _xf xf _yf yf w w hihi _it it _thr thr _pow pow int zone empty const T return _xf _xi _yf _yi 0 double zone scale x const 1 return _xf _xi _w 1 H double zone scale_y const return _yf _yi _h 1 H ostream amp operator lt lt ostream amp out const zone amp z 1 out lt lt lt lt Zz xi lt lt lt lt Zz yi lt lt x lt lt z xf lt lt M lt lt z _yf lt lt Ms return out al_ostream amp operator lt lt al_ostream amp out const zone amp z
20. The difference here is that we cannot directly access Shared data. Thus we cannot add the two results directly, and we therefore need temporary Shared variables and a specific task to add them.

    Another parallel implementation using concurrent write

    The parallel implementation we have just studied runs correctly but is not very efficient. This program will run faster if we use one of ATHAPASCAN's features called concurrent write. This kind of Shared data is designed so that every task can access the data and perform a concurrent cumul operation. This means less synchronization needs to take place, and thus less time is wasted waiting for other tasks to complete.

    In addition, the tasks we wrote are very fine-grained, and forking a task can sometimes consume more CPU time than executing the same work sequentially. The best way to increase the speed of the computation is then to Fork enough tasks for each processor to be busy, and then let them carry on sequentially to avoid excessive communication. For that purpose we introduce a threshold variable. As indicated in the text, the threshold is user-defined. Now examine the second parallel implementation of a Fibonacci number algorithm (Section, page).

    NB: The parallel programs we present here are just for educational purposes. The granularity of the tasks executed on remote nodes is small, while the number of tasks is high. If you try to run these programs on different archi
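The effect of the threshold can be illustrated without the library. The following sketch is an assumption-laden model, not the guide's Fibonacci program: plain recursion stands in for Fork, the hypothetical `cumul_sum` struct plays the role of a cumulative-write shared value, and the `tasks` counter only exists to show how many task creations a given threshold causes.

```cpp
#include <cassert>

// Plain recursive Fibonacci, used below the threshold.
long fib_seq(int n) { return n < 2 ? n : fib_seq(n - 1) + fib_seq(n - 2); }

// Accumulator playing the role of a cumulative-write shared value.
struct cumul_sum {
    long value = 0;
    int tasks = 0;                      // number of task creations, for illustration
    void cumul(long x) { value += x; }  // addition is associative and commutative
};

// At or above the threshold, split into two sub-tasks; below it, compute
// sequentially and accumulate the partial result. Recursion stands in for
// task creation in this single-threaded sketch.
void fib_task(int n, int threshold, cumul_sum& res) {
    ++res.tasks;
    if (n < threshold || n < 2) {
        res.cumul(fib_seq(n));
    } else {
        fib_task(n - 1, threshold, res);
        fib_task(n - 2, threshold, res);
    }
}

// Returns F(n); optionally reports how many tasks were created.
long fib(int n, int threshold, int* tasks = nullptr) {
    assert(n >= 0);
    cumul_sum res;
    fib_task(n, threshold, res);
    if (tasks) *tasks = res.tasks;
    return res.value;
}
```

Because the partial results are only ever added, the order in which sub-tasks contribute does not matter, which is why a cumulative write needs less synchronization than a read-write access. Raising the threshold keeps the result unchanged while sharply reducing the number of task creations.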
21. Ajout d un constructeur ifndef _PARTICULES_APP_ GOLApp vjKernel kern vjGlApp kern T define _PARTICULES_APP_ Ajout d un destructeur Librairies de base virtual GOLApp include lt stdio h gt include lt iostream gt Execute les initialisations necessaires avant le lancement de 1 API include lt math h gt Initialisation des services include lt algorithm gt virtual void init include lt strings h gt Execute les initialisations necessaires apres le lancement de 1 API GL std vector mais avant que le drawManager ne commence la boucle d affichage include lt vector gt virtual void apilnit GLUT vjDEBUG vjDBG_ALL O lt lt ParticulesApp apiInit n lt lt vjDEBUG_FLUSH include lt GL gl h gt H include lt GL glu h gt Appelee immediatement lors de 1 ouverture d un contexte OpenGL VR Juggler Appel fait une fois pour chaque fenetre d affichage ouverte include lt vjConfig h gt Ressource OpenGL allouee dans cette fonction include lt Kernel GL vjG1App bs virtual void contextInit include lt Kernel GL vjG1lContextData h gt include lt Math vjMatrix h gt Appelee immediatement lors de la fermeture d un contexte OpenGL include lt Math vjVec3 h gt appele lors de la fermeture d une fenetre d affichage include lt Kernel vjDebug h gt Ressource OpenGL desallouee dans cette fonction include lt Input Inp
22. [Figure 3: Compatibility rules to pass a reference on some shared data as a parameter to a task.]

    7.3.2 Shared Type Summary

    Figure ?? (page ??) summarizes the basic properties of references on shared data.

    [Figure 4: Available types and possible usages for references on shared data. A "•" stands for a direct property and a "o" for a postponed one. "formal P" denotes formal parameters (type given at task declaration) and "effective P" denotes effective ones (type of object given at the task creation). "concurrent" means that more than one task may refer to the same shared data version.]

    7.4 Example: A Class Embedding ATHAPASCAN Code

    A good way to write ATHAPASCAN applications is to hide ATHAPASCAN code in the data structures. Proceeding that way will allow you to keep your main
23. al Shared lt force gt ny for int i 0 i lt ny i board i new al Shared lt cell gt nx forces i new ai Shared lt force gt nx for int j 0 j lt nx j poard i j al Shared lt cell gt cell i j ORIGINAL board i j al Shared lt cell gt rand gt RAND_MAX 2 alive forces il j ai Shared lt force gt force JJI IT TI III TITI IV 111 1111777 777 77 END IO STREAMS FOR ATHAPASCAAN 4 For visualization only include Message h Message output struct output_to_buffer al Fork lt OutputInit gt ai SetSite 0 nx ny std cout lt lt Avant sync lt lt std endl com sync std cout lt lt Apres sync lt lt std endl ORIGINAL void operator ai Shared_r_w lt vector lt char gt gt b al Shared_r lt cel1 gt x int k void operator ai Shared_r_w lt vector lt int gt gt b ai Shared_r lt cell gt x int k std cout lt lt etat indice lt lt x read state lt lt lt lt k lt lt std endl for int t 0 true t Time loop NVOSVdVHLV 9 std cout lt lt Avant sync lt lt std endl com sync std cout lt lt Apres sync lt lt std endl 1 Computation of all the forces for the board std cout lt lt Time lt lt t lt lt std endl for int i 0 i lt ny i for int j 0 j lt nx j
24. are fully controlled by the software.

    ATHAPASCAN is an explicit parallelism language: the programmer indicates the parallelism of the algorithm through ATHAPASCAN's two easy-to-learn template functions, Fork and Shared. The programming semantics are similar to those of a sequential execution, in that each read executed in parallel returns the value it would have returned had the read been executed sequentially.

    ATHAPASCAN is implemented by an easy-to-use C++ interface. Therefore any code written in either the C or C++ languages can be directly recycled in ATHAPASCAN.

    The ATHAPASCAN interface provides a data-flow language. The program execution is data-driven and determined by the availability of the shared data, according to the access made. In other words, a task requesting a read access on shared data will wait until the previous task processing a write operation to this data has ended.

    ATHAPASCAN is portable and efficient. The portability is inherited from the Athapascan-0 communication layer of the environment, which may be installed on all platforms where a POSIX threads kernel and an MPI communication library have been configured. The efficiency with which ATHAPASCAN runs has been both tested and theoretically proven. The ATHAPASCAN programming interface is related to a cost model that enables an easy evaluation of the cost of a program in terms of work (number of operations performed), depth (minimal parallel time) and communication
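The data-driven rule stated above (a read waits for the previous write) can be modeled in a few lines of standard C++. This is a toy, single-threaded sketch under assumptions, not the library's scheduler: `access`, `task`, `ready`, `run_dataflow` and `demo_reads` are hypothetical names, and there is a single shared integer.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

enum class access { read, write };

struct task {
    access mode;                     // access requested on the shared value
    std::function<void(int&)> body;  // work performed on the shared value
};

// A task is ready when every earlier task that conflicts with it has finished.
// Two accesses conflict unless both are reads.
bool ready(const std::vector<task>& tasks, const std::vector<bool>& done, std::size_t i) {
    for (std::size_t j = 0; j < i; ++j) {
        bool conflict = tasks[i].mode == access::write || tasks[j].mode == access::write;
        if (conflict && !done[j]) return false;
    }
    return true;
}

// Repeatedly scans from the last created task to the first, running whatever is ready.
// The deliberately adversarial scan order shows that data dependencies alone fix the result.
void run_dataflow(std::vector<task>& tasks, int& shared) {
    std::vector<bool> done(tasks.size(), false);
    std::size_t remaining = tasks.size();
    while (remaining > 0) {
        for (std::size_t k = tasks.size(); k-- > 0;) {
            if (!done[k] && ready(tasks, done, k)) {
                tasks[k].body(shared);
                done[k] = true;
                --remaining;
            }
        }
    }
}

// Program order: write 1, read, write 2, read. Returns the values seen by the reads.
std::vector<int> demo_reads() {
    int shared = 0;
    std::vector<int> seen;
    std::vector<task> tasks;
    tasks.push_back({access::write, [](int& v) { v = 1; }});
    tasks.push_back({access::read, [&seen](int& v) { seen.push_back(v); }});
    tasks.push_back({access::write, [](int& v) { v = 2; }});
    tasks.push_back({access::read, [&seen](int& v) { seen.push_back(v); }});
    run_dataflow(tasks, shared);
    return seen;
}
```

Even though the scheduler here always tries the most recently created task first, each read returns the value it would have seen in a sequential execution, because a task only runs once the earlier conflicting accesses have completed.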
25. double p const double d const double f double x 7 x x floor x p f f if x lt p II x gt ptd return 0 else x x p d 2 d 2 x x00 x x double res 1 exp 2 log x return res L as a e Other virtual protected JI ege AE TE XColor win_mand col2XColor int c int max_col 65535 double x double c _nb_col XColor col col red int ceil pic 1 0 2 2 0 3 2 0 x max_col col green int ceil pic 1 0 6 2 0 3 2 0 x max_col col blue int ceil pic 1 0 6 2 0 3 2 0 x max_col return col int win_mand key_pressed const char key 1D 9 YOON int cont 1 quit 0 switch key case hi help break H case q quit 1 cont 0 break H case ri _new_zone old zone cont 0 break H case ei resize region 0 0 _z0 _w _z0 _h clean _new_zone _z0 cont 0 break H case u case Dir case d case D switch key case u old zone it 10 break T case U old zone it 100 break T case dir old zone it 10 break case D old zone it 100 break T H if old zone it lt 0 old zone it 0 char title 100 sprintf title Next draw iterations d old zone it set title title break H case mir case Ni if key m old zone Dong 1 else old zone pow 1 if old zone pow lt 1 old z
26.       for( int i = 0; i < 10; i++ )
            a1::Fork< getInfoTask >()( i );
          return 0;
        }

        int main( int argc, char** argv ) {
          // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
          return 0;
        }

    This program is very simple and shows you how to write a task. Fork is instantiated with the class getInfoTask and will execute the code of the method overloading operator().

    Compiling

    Recall that compiling an ATHAPASCAN program is done by running make from the command line. For the getInfoTask example, execute:

        sh> make getInfoTask

    To execute this newly created program, enter the following on the command line:

        sh> getInfoTask

    NB: To run a program built upon LAM-MPI, like the ATHAPASCAN library or Inuktitut, you have to configure your cluster of machines so that they can run rsh to each other.

    4.2.2 Simple Example 2: Fibonacci.cpp (multiple Fork and Shared algorithm)

    The Fibonacci series is defined as: $F(0) = 0$, $F(1) = 1$, $F(n) = F(n-1) + F(n-2)\ \forall n \geq 2$.

    There are different algorithms to compute Fibonacci numbers, some having a linear time complexity $O(n)$. The algorithm we present here is a recursive method. It is a very bad implementation, as it has an exponential complexity $O(2^n)$, as opposed to the linear time complexity of other algorithms. However, this approach is easy to understand and to parallelize.

    Sequential implementation

    First, let's have a glance at the regular recursive sequenti
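As a point of comparison for the complexity claim above, here is a small sketch in standard C++ (not taken from the guide) that places a direct transcription of the recurrence next to a linear-time loop. The function names `fib_recursive` and `fib_linear` and the `calls` counter are introduced here for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Direct transcription of the recurrence; the call counter exposes the exponential cost.
std::uint64_t fib_recursive(int n, std::uint64_t& calls) {
    assert(n >= 0);
    ++calls;
    if (n < 2) return static_cast<std::uint64_t>(n);
    return fib_recursive(n - 1, calls) + fib_recursive(n - 2, calls);
}

// Linear-time alternative: keep only the last two terms.
std::uint64_t fib_linear(int n) {
    assert(n >= 0);
    std::uint64_t a = 0, b = 1;  // F(0), F(1)
    for (int i = 0; i < n; ++i) {
        std::uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;  // after n steps, a holds F(n)
}
```

The recursive version makes two independent sub-calls at each level, which is costly but is also what makes it a natural candidate for creating one task per call.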
27. is clearly displayed by the color variance from cell to cell G9 cell_state cpp force const cell contributor include lt iostream gt void reset intensity 0 gt Annulation of the force include lt math h gt include lt stdlib h gt struct integration_law Hinclude lt stdio h gt To provide a cumulative function class that performs forces integration ifndef CELL STATE void operator force amp res force a res intensity a intensity define CELL STATE const bool alive true const bool dead false Encapsulation of integration into a task struct integration task void operator al Shared cw lt integration law force gt f al Shared_r lt cell gt c class cell state T public F 1 D E HULL IL LE LL LL LL END FORCE STRUCT PIII ALLL int myx ue ZAIT 77277 BEGIN CELL STRUCT 1111111111111111111111111111 1 cell_state if rand gt RAND_MAX 2 state alive else state dead time 0 my_x 0 my lt 08Ft bool alive truer N 77const bool dead false cell_state T class cell 7 blic endif pu Se ORIGINAL char state x cell_state info lifegame cpp ORIGINAL cell char init_state dead state init_state include lt athapascan 1 gt inline cell include lt iostream gt inline cell int x int y info my_x x info my_y y include lt math h gt include cell_state cpp bool is_alive const return info
28. j lt ny jt cells i nx j state N cells i nx j time 0 cells i nx j my_x i 11s i nx j my_y j printf INIT Dani T Appelee immediatement lors de 1 ouverture d un contexte OpenGL Appel fait une fois pour chaque fenetre d affichage ouverte Ressource OpenGL allouee dans cette fonction void GOLApp contextInit glDisable GL_CULL_FACE Definition de la couleur de la fenetre glClearColor 0 1 0 2 0 5 1 Bleu glClearColor 1 1 1 1 Blanc glClear GL_COLOR_BUFFER_BIT GL DEPTH BUFFER BIT glIndexi 0 glDisable GL LIGHTING Depth Buffer glClearDepth 1 0 g1DepthFunc GL_LESS glEnable GL_DEPTH_TEST Choix de la technique du calcul de 1 ombre glShadeModel GL SMOOTH T Fonction appelee apres la mise a jour du tracker mais avant le debut du dessin Calculs et modifications des etats faits ici void GOLApp preFrame Appelee immediatement lors de la fermeture d un contexte OpenGL appele lors de la fermeture d une fenetre d affichage Ressource OpenGL desallouee dans cette fonction void GOLApp contextClose T unsigned char palette 256 4 Dessin de la scene void GOLApp draw glClear GL_DEPTH_BUFFER_BIT print n glPushMatrix glTranslatef 0 25f 4 f 1 05f OL 10 d YOON glTranslatef nx ny 1 glScalef float 1 float nx flo
29. key init title reg caption nb_col res win_zoom init title reg caption nb_col return res Tea aboae Other public RE void win_mand help cerr lt lt endl lt lt Key binding in Mandelbrot window lt lt endl lt lt u iterations 10 lt lt endl lt lt U iterations 100 lt lt endl lt lt d iterations 10 lt lt endl lt lt D iterations 100 lt lt endl lt lt m power of Mandelbrot 1 lt lt endl lt lt M power of Mandelbrot 1 lt lt endl lt lt t threshold 10 lt lt endl lt lt T threshold 10 lt lt endl vg lt lt r redraw picture lt lt endl lt lt s restart on default zone lt lt endl lt lt redraw on five times smaller zone lt lt endl lt lt redraw on five times bigger zone lt lt endl lt lt h print this help lt lt endl lt lt q quit lt lt endl lt lt Mouse binding in Mandelbrot window lt lt endl lt lt Button 1 define a zoomed rectangular zone lt lt endl lt lt Button 2 define a zoomed square zone lt lt endl lt lt Button 3 define a zoomed square zone centered lt lt endl lt lt endl H zone win Hand neng zone if _quit 1 return zone _old_zone _new_zone _new_zone zone char title 100 sprintf title Mandelbrot set z ctz d old zone Dog set title title return _old_zone H static double pic const
30. lt double gt a Shared_r lt double gt b Shared_w lt double gt c c write new double a read b read H T struct compute void operator double a double b Shared_r lt double gt h Shared_w lt double gt res if b a lt h read res write new double g a b else Shared lt double gt resi new double Shared lt double gt res2 new double resi graph_name resi res2 graph_name res2 Fork lt compute gt a atb 2 h resi Fork lt compute gt atb 2 b h res2 Fork lt sum gt resi res2 res struct print res void operator Shared r w lt double gt res cout lt lt res lt lt res access lt lt endl H int doit int argc char argv double a b tmp cout lt lt Give me a b and the steph cin gt gt a gt gt b gt gt tmp Shared lt double gt res new double Shared lt double gt h new double tmp res graph_name res h graph_name h cout lt lt OK start the computation lt lt endl Fork lt compute gt a b h res Fork lt print_res gt res cout lt lt 0K that s done lt lt endl return 0 H int maint int argc char argv MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER return 0 T Makefile include A1_MAKEFILE all NC NC NC o clean rm o NC NVOSVdVHLY LV 48 Roch amp al
31. my_write>()(i, 3); // line E
    a1::Fork<my_read>()(i); // line F
    } a1_system::terminate(); return 0; } int main(int argc, char** argv) { /* MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER */ return 0; }

It is possible that the operations in line E and then in line F will execute before the preceding lines, because the rule described above is not broken. So do not be surprised to see the following result on the screen:

    x=3
    x=2
    x=1

Keep this in mind, especially when measuring the time of computations. In that case, consider adding some extra synchronization variables to the code. But be careful, because this can decrease the efficiency with which the program runs.

7.2.3 Update Right: Shared_r_w

a1::Shared_r_w<T> is the type of a parameter whose value can be updated in place: the related value can be read and/or written. Such an object represents a critical section for the task. This mode is the only one where the address of the physical object related to the shared object is available. It enables the user to call sequential code working directly with this pointer. In the prototype of a task, the related type is a1::Shared_r_w<T> x. Such an object provides the method T& access(), which returns a reference to the data contained in the shared object referenced by x. Note that &x.access() is constant during the whole execution of the task and cannot be changed by the user. Example: class
32. out lt lt z _xi lt lt z _yi lt lt z _xf lt lt z _yf lt lt z _wW lt lt z _h lt lt z _it lt lt z _thr lt lt z _pow return out al_istream amp operator gt gt al istream amp in zone amp z in gt gt z _xi gt gt z _yi gt gt z _xf gt gt zi y gt gt z _w gt gt z _h gt gt z _it gt gt z _thr gt gt Z _pow return in fesses complex fesse complex complex r 0 i 0 complex complex double re double im _r re _i im double complex re const return _r H double complex im const return _i H double complex abs2 const return rx r _ix_i H complex complex operator const complex amp c const Ate 217 return complex Srteu ns NVOSVdVHLY ES bk ga EC T complex complex operator const complex amp c const return complex _r c _r _i c _i complex complex operator const complex amp c const return complex _r c _r Age i _r c _i _i c _r complex complex pow int n const complex c 1 0 for int i 0 i lt n i c c this return c win_mand C include lt math h gt include win mand h dee Constructors Ai int win mand init zone z0 int caption int nb col _old_zone z0 _z0 z0 char title 100 sprintf title Mandelbrot set z c z d old zone pow region reg 0 0 z0 _w z0 _h int res res win
33. temp gt elts i from elts i start t write temp 40 T copy part of a my_shared_array to a my_shared_array template lt class T gt struct copy2 my shared_array void operator al Shared_r_w lt myArray lt T gt gt t al Shared_r lt myArray lt T gt gt from int start 10 template lt class T gt class my shared_array public shared_array lt T gt public int size 50 int sizeMyArray from read size error control is size of shared big enough to receive data if size gt sizeMyArray cerr lt lt ERROR copy size of dest lt size of source lt lt endl exit 1 for int i 0 i lt size i t access elts i from read elts i start 60 constructor my shared_array shared_array lt T gt my shared_array unsigned int size shared_array lt T gt size sort the array void qsort al Fork lt qsort my shared array lt T gt gt al Shared lt myArray lt T gt gt amp this 0 find a pivot and split void findPivot al Shared lt T gt p al Fork lt fPiv_my shared_array lt T gt gt al Shared lt myArray lt T gt gt amp this p void operator const myArray lt T gt amp a shared_array lt T gt operator a 80 copy void copy const myArray lt T gt amp a int start int size al Fork lt copy my shared_array lt T gt gt al Shared lt myArray lt T gt gt amp this a start size void copy cons
34. thus parallelized with ATHAPASCAN. Here is an example:

    void f(int i) { printf("what could I compute with %d\n", i); }
    struct f_encapsulated {
      void operator()(int i) { /* i is some formal parameter */ f(i); }
    };
    int doit() {
      int a = 3;
      f(a);                           // Sequential call to the function f
      f_encapsulated()(a);            // Sequential call to the function class f_encapsulated
      a1::Fork<f_encapsulated>()(a);  // Asynchronous call to the function class f_encapsulated
      return 0;
    }
    int main(int argc, char** argv) { /* MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER */ return 0; }

5.2 Allowed Parameter Types

The system has to be able to perform the following with a task in order to Fork it:
1. detect its interactions with the shared memory, in order to determine any synchronizations required by the semantics;
2. move it to the node where its execution will take place.

Here are the different kinds of parameters allowed for a task:
• T, to designate a classical C++ type that does not affect shared memory. However, this type must be communicable (see the chapter entitled Shared Memory for a definition of communicable types).
• Shared_...<T>, to designate a parameter that is a reference to a shared object located in the shared memory. T is the type of this object; it must be communicable. In the case of a classical T class, the type T should not refer to the shared memory. For example, when initia
35. 1 fprintf stderr Erreur dans la connexion code 4 H H else by default the sock is valid sock 1 H printf c connect 4s d BCAST n isMaster M S serveur port ifndef NOMPI if MPI_Bcast amp code 1 MPI INT masterNode MPI_COMM_WORLD MPI_SUCCESS return 1 endif if code lt 0 sock 0 invalid socket printf c connect 4s 4d END d n isMaster M S serveur port code return code int NJSocket recv cell_state buf int size char buf int size if sock 0 return 1 invalid socket if isMaster cell_state buf inter int NbRead 0 int NbToRead size buf inter buf Boucle de reception while NbToRead 0 buf inter NbRead NbRead recv sock buf_inter NbToRead 0 NbToRead NbRead if NbRead lt 0 break while Verification que tout a ete recu if NbRead lt 0 amp amp NbToRead 0 La connexion est rompue return 2 H ifndef NOMPI MPI_Bcast buf size MPI BYTE masterNode MPI COMM WORLD endif return 0 Fonction envoyer envoi sur la socket fd size octets et retourne 0 si coupure de la connexion ou size int NJSocket send cell_state buf int size char buf int size if sock 0 return 1 invalid socket if isMaster cell_state buf_inter int NbSent 0 int NbToSend size but inter buf Boucle d envoi while NbToSen
36. 2 t4 gt copy t2 size halfl half12 half12 t1 gt resize halfl half12 2 gt resize size halfl half12 we merge arrays lt gt pivot together 40 t1 gt merge t2 t3 gt merge t4 we append the second array to the first one and that s it t1 gt append t3 print the result to stdout cout lt lt endl lt lt qsort gt lt lt t1 lt lt endl 50 int doit int argc char argv 10 set the scheduling to work_stealing al base group sch sch new al work steal basic al set default group sch first for testing purpose we fill an array with randomized data int size atoi argv 1 myArray lt int gt t new myArray lt int gt size 60 for int i 0 i lt size i t gt elts i rand 10 20 then we sort the array myQsort lt int gt t return 0 int maint int argc char argv 70 30 MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER return 0 NVOSVdVHLY Dr 46 Roch amp al 9 2 Adaptative Quadrature Integration by Newton Cotes The aim of this very simple divide and conquer strategy is to compute the integration of the function f on the interval a b according ot the Newton Cotes method a b b tef fast L jia 2 b Vib a lt h g a b P fdx SCC Ae NC C include lt athapascan 1 h gt double g double a double b 1 return a atb b b a 2 0 H struct sum void operator Shared_r
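The divide-and-conquer integration described above can be sketched sequentially as follows. This is a plain C++ illustration under stated assumptions, not the ATHAPASCAN program itself: the integrand `f(x) = x*x` is assumed for the example, `g` is taken to be a trapezoid estimate on one interval, and `integrate` is a hypothetical name. In the parallel version, each recursive call would be a forked task and the sum a third task.

```cpp
#include <cassert>

// Integrand; f(x) = x*x is an assumption chosen for this illustration.
double f(double x) { return x * x; }

// Trapezoid estimate of the integral of f on [a, b].
double g(double a, double b) { return (f(a) + f(b)) * (b - a) / 2.0; }

// Splits [a, b] in two halves until the interval is narrower than h,
// then applies g on each small interval and sums the partial results.
double integrate(double a, double b, double h) {
    assert(h > 0.0 && a <= b);
    if (b - a < h) return g(a, b);
    double mid = (a + b) / 2.0;
    return integrate(a, mid, h) + integrate(mid, b, h);
}
```

The two recursive calls work on disjoint intervals and share no data, which is why the scheme maps naturally onto independent tasks with a final summation.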
37. E de Se 30 if newSize lt size T newtab new T newSize memmove newtab elts newSize sizeof T delete elts elts newtab size newSize return then new myArray is bigger T newtab new T newSize memmove newtab elts size sizeof T delete elts elts newtab size newSize return test tasks to see if the class is communicable struct myTaskW void operator al Shared_r_w lt myArray lt int gt gt x int z x access size for unsigned int i 0 i lt z i x access elts i 2 i H struct myTaskR void operator al Shared_r_w lt myArray lt int gt gt x cout lt lt size of the array lt lt x access size lt lt endl x struct resize Tab void operator al Shared_r_w lt myArray lt int gt gt x unsigned int n x access resize n J int doit int argc char argv al Shared lt myArray lt int gt gt x new myArray lt int gt 100 al Fork lt myTaskW gt x al Fork lt myTaskR gt x al Fork lt resizeTab gt x 10 al Fork lt myTaskR gt x return 0 int main int argc char argv MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER return 0 Roch amp al 70 80 90 100 110 120 SCC Ae ATHAPASCAN 31 7 Shared Memory Access Rights and Access Modes Shared memory is accessed through typed references One possible type is Shared The consistency asso
38. Theme 1: Networks and systems. APACHE project. Technical report no. 0276, February 2003.

Abstract: ATHAPASCAN is a macro data-flow application programming interface (API) for asynchronous parallel programming. The API makes it possible to define the concurrency between computational tasks, which synchronize through their accesses to objects in a global distributed memory. The parallelism is explicit and functional; the detection of synchronizations is implicit. The semantics is sequential, and an ATHAPASCAN program is independent of the target parallel architecture (cluster or grid). The execution of a program relies on an interpretation step that builds a macro data-flow graph. The graph is directed and acyclic (DAG), and it encodes the computation and the data dependencies (read and write). It is used by the runtime support and the scheduling algorithm to compute a schedule of tasks and a mapping of data onto the architecture. The implementation is based on lightweight processes (threads) and one-sided communication (active messages). This report presents the C++ library of the ATHAPASCAN API.

Key words: parallel programming, macro data-flow, scheduling, cluster and grid computing.

MdC INPG, leader of the ATHAPASCAN team, jean-louis.roch@imag.fr. CR INRIA, thierry.gautier@inrialpes.fr. Doctoral student MENSR, remi.revire@imag.fr. INRIA Rhône-Alpes research unit.

ATHAPASCAN : une Interface pour la Programmation Parall
39. _x int my_y T class NJSocket public NJSocket bool isMaster connecte au serveur int connect char serveur int port envois des donnees au serveur Note ceci n a d effet que sur le maitre int send char buffer int size int send cell_state buffer int size recois des donnees int recv char buffer int size int recv cell_state buffer int size int close protected int masterNode int sock endif GOLApp cpp include lt unistd h gt include lt signal h gt include lt pthread h gt include lt strings h gt include lt fcntl h gt include lt netdb h gt include lt sys types h gt include lt sys ipc h gt include lt sys sem h gt include lt sys ioctl h gt include lt sys socket h gt include lt sys wait h gt include lt sys uio h gt include lt netinet in h gt include GOLApp h Namespace de la std using namespace std struct msg init int nx ny Execute les initialisations necessaires avant le lancement de 1 API Initialisation des services void GOLApp init vjDEBUG vjDBG_ALL 0 lt lt Particules App init lt lt endl lt lt vjDEBUG_FLUSH NVOSVdVHLYV 69 st ce Preparation de la connexion vers le programme A1 Message recu sock NULL sock new NJSocket
40. adr amp lgadr _desc _sock_service Fermeture de la socket d ecoute close _sock_ecoute K i G9 Fonction recevoir lit sur la socket fd size octets et retourne 0 si coupure de la connexion ou size int Message Recevoir cell_state buf int size BAK int buf int fd _desc cell_state buf_inter int NbRead 0 int NbToRead size buf inter buf Boucle de reception while NbToRead 0 buf inter NbRead NbRead recv fd buf_inter NbToRead 0 NbToRead NbRead if NbRead lt 0 break while Verification que tout a ete recu if NbRead lt 0 amp amp NbToRead 0 La connexion est rompue return 0 else Envoi de la taille des donnees return size Za Fonction envoyer envoi sur la socket fd size octets et retourne 0 si coupure de la connexion ou size int Message Envoyer cell_state buf int size BAK int buf int fd _desc cell_state buf_inter int NbSent 0 int NbToSend size buf inter buf Boucle d envoi while NbToSend 0 buf inter NbSent NbSent send fd buf_inter NbToSend 0 NbToSend NbSent if NbSent lt 0 break while Verification que tout a ete envoye if NbSent lt 0 amp amp NbToSend 0 99 La connexion est rompue return 0 ORIGINAL char buf int gize else Envoi de la taille des donnees return size
41. al program:

    #include <iostream>
    #include <stdlib.h>
    int fibonacci(int n) {
      if (n < 2) return n;
      else return fibonacci(n - 1) + fibonacci(n - 2);
    }
    int main(int argc, char** argv) {
      // make sure there is one parameter
      if (argc < 2) { std::cout << "usage: fibonacci <N>" << std::endl; exit(1); }
      std::cout << "result = " << fibonacci(atoi(argv[1])) << std::endl;
      return 0;
    }

Parallel implementation. We assume that you wish to make this program parallel. An easy way to do it would be to use the same recursive algorithm. It is not quite that easy, for two reasons:
1. in the sequential program we used a function, while ATHAPASCAN only supports procedures (void functions);
2. shared data can be accessed only from a task that has this data as a parameter (for example, you cannot display the value of a Shared from the main).

To write this parallel program, we therefore need:
• a task doing the same job as the sequential function (recursive);
• a task to add the results of the two recursive calls to fibonacci;
• a task to print the result to stdout.

    #include <athapascan-1>
    #include <iostream>
    #include <stdlib.h>
    struct add {
      // This procedure sums two integers in shared memory
      // and writes the result in shared memory.
      void operator()(a1::Shared_r<int> a, a1::Shared_r<int> b, a1::Shared_w<int> c) {
        c.write(new int(a.read() + b.read
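The first constraint listed above, procedures instead of functions, can be tried out without the library. The sketch below is plain sequential C++ and is only an assumption-laden illustration: `add_proc`, `fib_proc` and `fib_via_procedures` are hypothetical names, and ordinary references stand in for the shared parameters a real task would receive. Nothing here runs in parallel.

```cpp
#include <cassert>

// Procedure-style (void) version of the recursive algorithm.
// Results travel through reference parameters instead of return values.

// Sums two integers and stores the result in c.
struct add_proc {
    void operator()(const int& a, const int& b, int& c) const { c = a + b; }
};

// Computes F(n) into res by calling itself on n-1 and n-2, then add_proc.
struct fib_proc {
    void operator()(int n, int& res) const {
        assert(n >= 0);
        if (n < 2) { res = n; return; }
        int r1 = 0;
        int r2 = 0;
        fib_proc()(n - 1, r1);
        fib_proc()(n - 2, r2);
        add_proc()(r1, r2, res);
    }
};

// Convenience wrapper, used only to check the sketch.
int fib_via_procedures(int n) {
    int r = 0;
    fib_proc()(n, r);
    return r;
}
```

Once every step is a procedure whose inputs and outputs are explicit parameters, each call site is a candidate for being replaced by a forked task, with the references replaced by shared objects in the appropriate access mode.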
42. at 1 float ny 1 float 1 float ny Deo ooo dog dh e OR TG TINA k k d edie dh dh hh dh hh dh ial bh doi glTranslatef 0 0 6 void GOLApp draw glTranslatef nx ny 1 glClear GL_COLOR_BUFFER_BIT GL_DEPTH_BUFFER_BIT printf n NVOSVdVHLV cell_state cur cells glPushMatrix unsigned char roygbiv 4 glTranslatef 0 25f 4 f 1 05f int time cur gt time 1536 glTranslatef 0 3 1 AAA II AAA III A NII I NII TI ALTA 111771 NISUALIZATION 1 Y NY YANN Y YA 44444 glScalef float 1 float nx float 1 float ny float 1 float ny For visualization of cell state in 2D A cell that is alive state true appears in red a cell that is dead state false appgdfsais lated nx ny 0 if cur gt state roygbiv 0 255 roygbiv 1 0 roygbiv 2 0 roygbiv 3 0 else roygbiv 0 0 roygbiv 1 0 roygbiv 2 255 roygbiv 3 0 palette R 0 255 palette G 11 255 palette B 21 255 AVI II IT TI II ATV IT 1111111 1777 17777 NISUALIZATION 2 1 11 1111 11 11 11 11 41 f l NI 01 03 palette N 1 0 define NbrColoursForTime 5 palette N 2 0 For visualization of task execution time 2D Discrete color changes over time red orange yellow green blue int modu time NbrColoursForTime glBegin GL_QUADS switch modu cell_state cur cells case 0 froygbiv 0 255 roygbiv 1 0 roygbiv 2 0 break for int y 0 y lt ny yt case 1
43. at would enable us to enlarge this section.

Q: On which systems does ATHAPASCAN run?
A: Currently ATHAPASCAN has been tested on:
• IBM SPx running AIX 4.2, using LAM-MPI or a dedicated switch, with the xlC 4.2 C++ compiler;
• Sparc or Intel multiprocessors and networks of workstations, using LAM-MPI, with the CC 4.2 C++ compiler.
Athapascan-0 is currently supported on:
• AIX 3.2.5 and IBM MPI-F;
• IBM SP with AIX 4.2 and IBM MPI;
• DEC Alpha with OSF/1 4.0 and LAM MPI 6.1;
• HP 9000 with HP-UX 10.20 and LAM MPI 6.1;
• SGI MIPS with IRIX 6.3 and LAM MPI 6.1;
• SGI MIPS with IRIX 6.4 and SGI MPI 3.1;
• Sparc or Intel with Solaris 2.5 and LAM MPI 6.3;
• Intel with Linux 2.0.25, MIT threads 1.60.6 and LAM MPI 6.3.

Q: How do I get a copy of ATHAPASCAN?
Q: Where can I comment about ATHAPASCAN?
Q: How do I get up-to-date information?
A: There is a web page dealing with ATHAPASCAN at http://www-apache.imag.fr. The ATHAPASCAN distribution, the manual (the document you are reading) and some other related papers are also available from this web page.

Q: The compilation failed: A1_MAKEFILE not known. What do I do?
A: Check that you properly set up your environment by sourcing the appropriate setup files (the ones for Athapascan-0 and ATHAPASCAN).

Q: The option a1_trace_file has no effect at execution. Why could this be?
A: Make sure you are using a program compiled with an appropriate ATHAPASCAN library (one compiled to generate dynamic graph visualiza
44. ated type is a1::Shared_w<T> x. Such an object provides the method void write(T* address), which assigns the value pointed to by address to the shared object. No copy is made: the data pointed to by address must be considered lost by the programmer. Further accesses via this address are no longer valid; in particular, the deallocation of the pointer will be performed by the system itself.

Example:

    class read_complex {
    public:
      void operator()(a1::Shared_w<complex> z) {
        complex* a = new complex;
        cin >> a->x >> a->y;
        z.write(a);
      }
    };

Note: To clarify the rule that each read sees the last write according to the lexicographic order, consider the example below.

    #include "athapascan-1.h"
    #include <stdio.h>
    struct my_read {
      void operator()(a1::Shared_r<int> x) { printf("x=%i\n", x.read()); }
    };
    struct my_write {
      void operator()(a1::Shared_w<int> x, int val) { x.write(new int(val)); }
    };
    int doit(int argc, char** argv) {
      a1_system::init(argc, argv);
      a1_system::init_commit();
      if (a1_system::self_node() == 0) {
        a1::Shared<int> i(new int(1));
        a1::Fork<my_write>()(i, 1); // line A
        a1::Fork<my_read>()(i);     // line B
        a1::Fork<my_write>()(i, 2); // line C
        a1::Fork<my_read>()(i);     // line D
        a1::Fork<
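The ownership rule of write (no copy is made, and the caller must treat the pointer as lost) resembles what standard C++ expresses with `std::unique_ptr`. The sketch below is an analogy only, assuming nothing about how ATHAPASCAN implements it: `owned_slot` is a hypothetical class, and its `write` takes a `std::unique_ptr<T>` rather than the raw `T*` of the real method.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a write-only shared reference.
// It is not part of ATHAPASCAN; it only illustrates ownership transfer.
template <class T>
class owned_slot {
public:
    // Takes ownership of the object: after the call, the slot is
    // responsible for deallocation and the caller no longer holds it.
    void write(std::unique_ptr<T> value) { data_ = std::move(value); }

    // Read access, valid only once something has been written.
    const T& read() const {
        assert(data_ && "nothing has been written yet");
        return *data_;
    }

    bool has_value() const { return static_cast<bool>(data_); }

private:
    std::unique_ptr<T> data_;
};
```

With a raw pointer, as in the documented method, the same contract holds but is enforced only by convention: after calling write, the program must neither dereference nor delete the address it passed in.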
45. ble usually contains the value of a reached minimum or maximum. No information with respect to another task's activity is associated with this variable. We are currently finishing the implementation of global variables for Athapascan-1: variables that can be both read and written on the collection of nodes in a community, without the data dependency constraints that exist for Shareds. Please bear with us, as this project is still in development.

8.1 Memory Consistencies

The system offers three different consistencies on this memory:
• Causal consistency, where data consistency is maintained along the paths of the precedence graph. That is, if the task $T_1$ precedes the task $T_2$ in the precedence graph, then the modifications of the memory made by $T_1$ will be seen by $T_2$.
• Processor consistency, where data consistency is maintained among the virtual processors of the system. That is, the order of the modifications of the memory made on a virtual processor $P_1$ is the same as the order of modifications seen on another virtual processor $P_2$.
• Asynchronous consistency, where data consistency is maintained on the system as a whole. That is, each modification made on one virtual processor will eventually be seen on the other virtual processors.

8.2 Declaration

The declaration of a global datum has the following prototype:
• a1_mem_cc<T> x(T* pval); for causal consistency
• a1_mem
46. cedures have a precedence relation. A procedure requiring a shared parameter with a direct read access (r) has a precedence relation with the last procedure to take this same shared parameter with a write access. However, a procedure taking some shared parameter with a postponed read access (rp) has no precedence relation: the access mode guarantees that no access to the data will be made during the execution of this task. The precedence is delayed to a sub-task created with type r. In essence, the type Shared can be seen as a synonym for the type a1::Shared_rp_wp<T>: it denotes a shared datum with a read-write access right, but on which no access can be performed locally. An object of such a type can thus only be passed as an argument to another procedure.

7.3.1 Conversion Rules

When forking a task t with a shared object x as an effective parameter, the access right required by the task t has to be owned by the caller. More precisely, the figure enumerates the compatibility, at task creation, between the type of a reference to a shared object and the formal type required by the task procedure declaration. Note that this applies only to task creation, not to standard function calls, where the standard C++ rules apply.

    type of formal parameter | required type for the effective parameter
    a1::Shared_r[p]<T>       | Shared_r[p], Shared_rp_wp, Shared<T>
    a1::Shared_w[p]          | Shared_w[p
47. ciated with the shared memory access is that each read sees the last write according to the lexicographic order. Tasks do not make any side effects on the shared memory of type Shared. Therefore they can only access the shared data to which they possess a reference. This reference comes either from the declaration of some shared data or from some effective parameter. A reference to some shared data is an object with the following type: a1::Shared_RM<T>. The type T of the shared data must be communicable (see the section on communicable types). The suffix RM indicates the access right on the shared object (read r, write w, or cumul c) and the access mode (local, or postponed p). RM can be one of the following suffixes: r, rp, w, wp, cw, cwp, r_w and rp_wp. Access rights and access modes are each described in their own section.

7.1 Declaration of Shared Objects

If T is a communicable type, the declaration of an object of type a1::Shared<T> creates a new shared datum in the shared memory managed by the system and returns a reference to it. Depending on whether the shared object is initialized or not, three kinds of declarations are allowed:
• a1::Shared<T> x(new T(...)); The reference x is initialized with the value pointed to by the constructor parameter. Note that the memory being pointed to will be deallocated by the system and should not be accessed anymore by the program
48. cker for use in lifgame file Message C Definition des messages envoyes entre le programme lut et le programme Athapascan 1 define NOMPI include lt stdio h gt include lt unistd h gt include lt iostream gt include lt sys socket h gt include lt netinet in h gt include lt netdb h gt include lt strings h gt include lt string h gt include lt math h gt ifndef NOMPI include mpi h endif include NJSocket h NJSocket NJSocket masterNode 0 sock 0 int NJSocket connect char serveur int port print c connect 4s 4d n isMaster M S serveur port int code 0 if isMaster Adresse de la socket struct sockaddr_in nom Adresse internet du serveur struct hostent hp L9 Creation de la socket socket domaine type protocole if sock socket AF INET SOCK STREAM 0 1 fprintf stderr Creation de la socket impossible code 2 H Recherche de l adresse internet du serveur else if hp gethostbyname serveur NULL fprintf stderr s site inconnu n serveur code 3 H else Preparation de l adresse de la socket destinataire bcopy hp gt h_addr amp nom sin_addr hp gt h length nom sin_family AF_INET nom sin_port htons port Demande de connexion connect sock p_adr lgadr if connect sock struct sockaddr amp nom sizeof nom
49. ction F v in the shared data referenced by x. For the first accumulation, a copy of v may be taken if the shared data version does not yet contain valid data.

Example:

    // generic function class that performs the multiplication of two values
    template<class T>
    class multiply {
    public:
      void operator()(T& x, const T& val) { x = x * val; }
    };
    // A task that multiplies a shared object by 2
    class mult_by_2 {
    public:
      void operator()(a1::Shared_cw<multiply<int>, int> x) { x.cumul(new int(2)); }
    };

Note: Keep in mind that a program written in ATHAPASCAN can benefit at run time from the associative and commutative properties of the accumulation function F. It is therefore possible that the execution of the code in Figure 2 will result in:

    x=3, val=2
    x=5, val=1

    #include "athapascan-1.h"
    #include <stdio.h>
    struct F {
      void operator()(int& x, const int& val) { printf("x=%i, val=%i\n", x, val); x += val; }
    };
    struct add {
      void operator()(a1::Shared_cw<F, int> x, int val) { x.cumul(val); }
    };
    int doit(int argc, char** argv) {
      a1::Shared<int> i(new int(1));
      a1::Fork<add>()(i, 2);
      a1::Fork<add>()(i, 3);
      return 0;
    }
    int main(int argc, char** argv) { /* MAIN METHOD AS PREVIOUSLY DEFINED IN THE API CHAPTER */ return 0; }

Figure 2: Demonstration of associativity and commutativity of cumulative mode. It may
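Why the runtime may reorder accumulations can be checked with a small sequential experiment. The sketch below is plain C++ and makes no ATHAPASCAN call; `plus_acc` and `accumulate_in_order` are hypothetical names, and integer addition is assumed as the accumulation function, as in the figure's example.

```cpp
#include <cassert>
#include <vector>

// Accumulation function in the style of F: x <- x + val.
struct plus_acc {
    void operator()(int& x, const int& val) const { x += val; }
};

// Folds the contributions into `initial`, strictly in the order given.
int accumulate_in_order(int initial, const std::vector<int>& contributions) {
    int x = initial;
    plus_acc f;
    for (int v : contributions) f(x, v);
    return x;
}
```

Starting from 1 and adding 2 then 3 gives 6; starting from 3 and adding 2 then 1, which is the order suggested by the trace shown in the text, also gives 6. The equality holds only because addition is associative and commutative. A function such as subtraction would make the result depend on the schedule.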
50. d 0 buf inter NbSent NbSent send sock buf_inter NbToSend 0 NbToSend NbSent if NbSent lt 0 break while H return 0 int NJSocket close if sock 0 return 1 invalid socket if isMaster close sock sock 0 return 0 pool NJSocket isMaster ifndef NOMPI int node 89 10 d YOON MPI Comm rank MPI COMM WORLD amp node return node masterNode Helse return 0 0 endif H NJSocket h ifndef NJSOCKET_H define NJSOCKET_H include lt stdio h gt include lt iostream gt include lt strings h gt include lt math h gt include lt stdlib h gt include GOLApp h include cell state cpp Classe permettant de recevoir des donnees d un serveur dans un programme netjuggler c projet SAPPE Author F Zara modified by Joe Hecker for use in lifgame file GOLApp C Application permettant la visualisation des particules Librairies de base include lt iostream gt include lt math h gt include lt Math vjQuat h gt include lt stdio h gt std vector include lt vector gt S utilise comme une socket TCP Sur le noeud maitre c est effectivement une socket mais led oPadx dsehpied decsobvekethes donnees du maitre par broadcast Note les fonctions renvoyent 0 si succes code d erreur negatif sinon typedef struct cell state char state int time int my
51. define a class or type as being communicable. The three examples provided in this section are: Complex Numbers, which creates a simple communicable class; Basic List with Pointers, which builds a singly linked list that behaves as a queue; and Resizable Array, which defines a class for lists of dynamic size.

6.3.1 Example 1: Complex Numbers

Let us consider the complex numbers type. This type can be made communicable by simply implementing the four communication operators. NB: The C++-provided defaults can often be used to implement the empty constructor, the copy constructor and the destructor. In the case of the complex type, the default operators are used (refer to a C++ guide to learn more about these default constructors).

Footnote 3: You must be careful when communicating pointers. If your program is executed on several nodes (option a0n 2, for example), the communication can be performed between the nodes, and the pointer is often meaningless on other nodes.

Footnote 4: By default the system considers that all types possess iterators that run over all the data; this is the case for STL types. For all others, the necessary functions have to be provided to override this default definition.

    // class complex is an Athapascan-1 communicable class
    // complex z = x + i*y
    // NB: This class implements only the methods needed by Athapascan-1.
    class complex {
    public:
      double x
52. design parallel programs through complex examples (Chapter), how to select the proper scheduling algorithm for specific programs (Appendix), and how to debug programs using special debugging and visualizing tools (Appendix).

3 Installation of Athapascan-1 version 2.0 Beta

ATHAPASCAN is easy to install. This chapter only covers the installation of Athapascan-1 version 2.0 Beta on a UNIX system. The latest releases of ATHAPASCAN software are available for download on APACHE's web site: http://www-apache.imag.fr/software/ath1/archives

3.1 Installation of Inuktitut and Athapascan

The entire installation is based upon the configure/makefile pair. There is a makefile at the top level of both the Inuktitut (the execution support) and the Athapascan JIT 1.3 folders that provides sound settings for the installation. Modify the settings at the beginning of the file in order to direct the installation to the desired folder. Next, execute make config. To define certain variables at compile time, run make with the CXXFLAGS variable set, in each of the folders Inuktitut and Athapascan JIT 1.3.

4 Getting Started: API

This chapter presents an overview of ATHAPASCAN's API and demonstrates how to build ATHAPASCAN programs through simple examples. NB: The source codes of the examples presented in this tutorial are available
53. endl Fork lt verif gt val res cout lt lt out main lt lt endl return 0 int main int argc char argv MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER return 0 NVOSVdVHLY 67 50 Roch amp al Figure 7 Execution graph corresponding to pscal 3 execution 9 4 Mandelbrot Set This example intends to show the possible interaction between an ATHAPASCAN program and a X server The following code results in a visualization of the Mandelbrot set on a X window The algorithm is standard the size of the image is split in four until a given threshold has been reached The visualization is made during the computation so that visualization threads have to execute on the X server site a special scheduling policy is used for these threads Rte tel Makefile include A1_MAKEFILE X11_INCLUDES usr openwin include X11_LIB usr openwin lib CXXFLAGS g I X11_INCLUDES LDFLAGS L X11_LIB 1X11 lm 0BJ SRC wildcard C OBJ patsubst C 0 SRC all mandel mandel 0BJ clean rm rf Templates DB mandel mandel o 0BJ tempinc core o ptclean cleanDB rm rf Templates DB mandel mandel o auto_test ifndef TYPES_H define TYPES_H include lt stream h gt include athapascan 1 h class zone xi yi top left _xf _yf bottom rig
54. es: short, unsigned short, int, unsigned int, long, unsigned long, char, unsigned char, float, double, long double, char*, void*
• all types from the sr

Note that two generic functions for packing and unpacking an array of contiguous elements are provided:

    a1::Ostream& a1::pack(a1::Ostream& out, const T* x, int size);
    a1::Istream& a1::unpack(a1::Istream& in, T* x, int size);

Both functions require the number of elements. They are specialized for basic C types.

6.2 Serialization Interface for Classes Stored in Shared Memory

A communicable type T must have the following functions:
• The empty constructor: T()
• The copy constructor: T(const T&). NB: the copy shall not have any overlap with the source.
• The destructor: ~T(). NB: only one task executes the deallocation at a time.
• The sending operator: a1::Ostream& operator<<(a1::Ostream& out, const T& x) puts into the output stream out the information needed to reconstruct an image of x using the operator >>.
• The receiving operator: a1::Istream& operator>>(a1::Istream& in, T& x) takes from the input stream in the information needed to construct the object x; it allocates and initializes x with the value related to that information. Note that the system always calls this function with an object x that has been constructed with the empty constructor.

6.3 Examples

This section teaches through examples how to
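The serialization contract of Section 6.2 can be illustrated without the library. The following is a minimal sketch in plain C++ that assumes std::ostream and std::istream as stand-ins for the ATHAPASCAN streams (the real a1 stream types are different). The class IntBuffer and the helper round_trip are hypothetical names chosen for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <utility>

// Hypothetical communicable type: a heap-allocated buffer of ints.
struct IntBuffer {
    std::size_t size;
    int* elts;

    // Empty constructor, as required of a communicable type.
    IntBuffer() : size(0), elts(0) {}
    explicit IntBuffer(std::size_t n) : size(n), elts(n ? new int[n]() : 0) {}
    // Copy constructor: deep copy, no overlap with the source.
    IntBuffer(const IntBuffer& o) : size(o.size), elts(o.size ? new int[o.size] : 0) {
        for (std::size_t i = 0; i < size; ++i) elts[i] = o.elts[i];
    }
    // Copy-and-swap assignment keeps ownership of elts unambiguous.
    IntBuffer& operator=(IntBuffer o) {
        std::swap(size, o.size);
        std::swap(elts, o.elts);
        return *this;
    }
    ~IntBuffer() { delete[] elts; }
};

// Sending operator: writes the size, then each element.
std::ostream& operator<<(std::ostream& out, const IntBuffer& x) {
    out << x.size;
    for (std::size_t i = 0; i < x.size; ++i) out << ' ' << x.elts[i];
    return out;
}

// Receiving operator: allocates and initializes x from the stream.
std::istream& operator>>(std::istream& in, IntBuffer& x) {
    std::size_t n = 0;
    in >> n;
    delete[] x.elts;  // no-op when x comes from the empty constructor
    x.size = n;
    x.elts = n ? new int[n] : 0;
    for (std::size_t i = 0; i < n; ++i) in >> x.elts[i];
    return in;
}

// Packs src into an in-memory channel and rebuilds an image of it.
IntBuffer round_trip(const IntBuffer& src) {
    std::stringstream channel;
    channel << src;
    IntBuffer dst;  // built with the empty constructor, as the system would do
    channel >> dst;
    assert(!channel.fail());
    return dst;
}
```

Calling round_trip on a filled buffer yields an independent copy with the same size and contents, which is the property the sending and receiving operators are meant to guarantee together.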
55. f roygbiv 0 255 roygbiv 1 175 roygbiv 2 0 break case 2 f roygbiv 0 255 roygbiv 1 255 roygbiv 2 0 break for int x 0 x lt nx x case 3 froygbiv 0 0 roygbiv 1 255 roygbiv 2 0 break case 4 roygbiv 0 0 roygbiv 1 0 roygbiv 2 255 break glColor4ubv palette cur gt state default should be treated differently glVertex2i 2 x 2 y roygbiv 0 255 roygbiv 1 0 roygbiv 2 0 break glVertex2i 2 x 1 2 y glVertex2i 2 x 1 2 y 1 glVertex2i 2 x 2 y 1 cur ZAIT IT TTL AIT ATTA TAA ATT TAA NISUALIZATION 3 1 11 Y 11 11 1111 1 1 ITT AAT EDT TT For visualization of task execution time in 2D glEnd Gradual increment of color through the spectrum over time black red orange yellow green blue black if time lt 256 roygbiv 0 time roygbiv 1 0 roygbiv 2 0 roygbiv 3 0 glPopMatrix else if time gt 256 amp amp time lt 514 roygbiv 0 255 roygbiv 1 time 256 roygbiv 2 0 roygbiv 3 0 else if time gt 512 amp amp time lt 768 roygbiv 0 255 time 512 roygbiv 1 255 roygbs 24 0 jitg bik ES 530 3 ARR ot ta a RER OK else if time gt 768 amp amp time lt 1024 roygbiv 0 0 roygbiv 1 255 roygbiv 2 time 768 roygbiv 3 0 else if time gt 1024 EE time lt 1280 roygbiv 0 0 roygbiv 1 255 time 1024 roygbrul 60bSppoyghiraBra er else if time gt 1280 amp amp time lt 1536 roygbiv 0 0 roygbiv 1 0
56. g this way makes the main application code much simpler to understand and to write.

The idea of the algorithm is:
1. to split the array in two parts,
2. to sort each array in parallel using qsort,
3. to split those arrays in two parts: elements < pivot and elements > pivot,
4. to merge the arrays containing data < pivot and data > pivot,
5. to append the second array to the first one.

    // mySharedArray.h
    #include "sharedArray.h"

    // qsort the array
    template<class T>
    struct qsort_my_shared_array {
        void operator()(a1::Shared_r_w<myArray<T> > t) {
            t.access().qsort();
        }
    };

    // find a pivot
    template<class T>
    struct fPiv_my_shared_array {
        void operator()(a1::Shared_r<myArray<T> > t, a1::Shared_w<T> pivot) {
            int middle = t.read().size / 2;
            T* result = new T((t.read().elts[middle] + t.read().elts[middle - 1]) / 2);
            pivot.write(result);
        }
    };

    // copy part of a myArray to a my_shared_array
    template<class T>
    struct copy_my_shared_array {
        void operator()(a1::Shared_w<myArray<T> > t, myArray<T> from, int start, int size) {
            int sizeMyArray = from.size;
            // error control: is the size of the shared array big enough
            // to receive data from the array?
            if (size > sizeMyArray) {
                cerr << "ERROR copy: size of shared array < size of myArray";
                cerr << endl;
                exit(1);
            }
            myArray<T>* temp = new myArray<T>(size);
            for (int i = 0; i < size; i++)
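The five steps listed at the start of this item can be sketched sequentially in plain C++. This is only an illustration under stated assumptions: it uses std::vector<int> instead of the manual's myArray, takes the middle element of the first sorted half as pivot (the manual computes its pivot differently), and runs the two sorts one after the other where ATHAPASCAN would fork them as tasks. The name two_way_sort is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Sorts data with the "split, sort halves, partition by pivot, merge, append" scheme.
std::vector<int> two_way_sort(const std::vector<int>& data) {
    if (data.size() < 2) return data;

    // 1. Split the array in two parts.
    const std::size_t half = data.size() / 2;
    std::vector<int> left(data.begin(), data.begin() + half);
    std::vector<int> right(data.begin() + half, data.end());

    // 2. Sort each part (independent work: one task per half in a parallel version).
    std::sort(left.begin(), left.end());
    std::sort(right.begin(), right.end());

    // 3. Split each sorted part around the pivot: elements <= pivot, elements > pivot.
    const int pivot = left[left.size() / 2];
    std::vector<int>::iterator lcut = std::upper_bound(left.begin(), left.end(), pivot);
    std::vector<int>::iterator rcut = std::upper_bound(right.begin(), right.end(), pivot);

    // 4. Merge the two "low" pieces together and the two "high" pieces together.
    std::vector<int> low, high;
    std::merge(left.begin(), lcut, right.begin(), rcut, std::back_inserter(low));
    std::merge(lcut, left.end(), rcut, right.end(), std::back_inserter(high));

    // 5. Append the second array to the first one.
    low.insert(low.end(), high.begin(), high.end());
    return low;
}
```

Because every element of the low array is at most the pivot and every element of the high array is greater, the final append needs no further comparison; the two merges in step 4 are again independent of each other.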
57. ht _xi lt _xf _yf lt _yi public zone zone double xi double yi double xf double yf int w int h int it int thr int pow int empty const double scale_x const double scale_y const double _xi _yi _xf _yf int _w _h _it _thr _pow friend ostream amp operator lt lt ostream amp out const zone amp z friend al ostream amp operator lt lt al_ostreamk out const zonek z friend al istream amp operator gt gt al_istream amp in zone amp z T class complex public complex complex double re double im double re const double im const double abs2 const complex operator const complex amp const complex operator const complex amp const complex operator const complex amp const complex pow int n const NVOSVdVHLY private double _r double _i endif win_mand h ifndef WIN_MAND_H define WIN_MAND_H include win_zoom h include win key h class win mand public win zoom public win key public int init zone z0 int caption int nb col zone new_zone void help protected virtual int key pressed const char key virtual int zoom const regionk r virtual int x resize int width int height virtual XColor col2XColor int c private int quit zone _z0 zone old zone zone new zone T endif win_proc_mand h ifndef WIN_PROC_MAND_H define WIN_PROC_MAND_H include win_proc h include
58. idered as lost by the programmer: all further access through this pointer is invalid.

Declaration
• a1::Shared<T> x; the reference x must be initialized before use.
• a1::Shared<T> x(0); the reference x can be used, but no initial data is associated with it.
• a1::Shared<T> x(new T); the reference x can be used and possesses an initial value.
NB: Be aware that non-initialized shared data is a common programming error that gives no compile-time warning.

Access Rights and Methods
Each task specifies as a parameter the shared data objects that will be accessed during its execution and which type of access will be performed on them. According to the requested access right, tasks should use these methods:
• read-only access: a1::Shared_r<T> x; use x.read(). Prototype: const T& read() const;
• write-only access: a1::Shared_w<T> x; use x.write(p). Prototype: void write(T*); NB: Deallocation is made by ATHAPASCAN.
• read-write access: a1::Shared_r_w<T> x; use x.write(p) or x.access(). Prototypes: void write(T*); T& access();
• accumulation write access: a1::Shared_cw<F, T> x; use x.cumul(v). Prototype: void cumul(const T&); NB: The call x.cumul(v) accumulates v into the shared data according to the accumulation function F. A copy of v may be made during the first accumulation if the data present
59. incr {
        void operator()(a1::Shared_r_w<int> n) {
            n.access() = n.access() + 1;
        }
    };

7.2.4 Accumulation Right: Shared_cw

a1::Shared_cw<F, T> is the type of a parameter whose value can only be accumulated with respect to the user-defined function class F. F is required to have the prototype:

    struct cumul_fn {
        void operator()(T& x, const T& y) {
            // body to perform x <- accu(x, y)
        }
    };

Example:

    struct add {
        void operator()(int& x, const int& y) { x += y; }
    };

The resulting value of the concurrent write is an accumulation of all other values written by a call to this function. After the first accumulation operation is executed, the initial value of x either becomes the previous value or remains the current value, depending on the lexicographic access order. If the shared object has not been initialized, then no initial value is considered.

Since an accumulation enables a concurrent update of some shared object, the accumulation function F is assumed to be both associative and commutative. Note that only the accumulations performed with respect to the same law F can be concurrent. If different accumulation functions are used on a single shared datum, the sequence of resulting values obeys the lexicographic order semantics.

In the prototype of a task, the related type is a1::Shared_cw<F, T> x. Such an object gets the method void cumul(const T& v) that accumulates according to the accumulation fun
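The requirement that F be associative and commutative can be checked on small inputs with a sketch in plain C++. The code below does not use the library: accumulate_in_order and order_independent are hypothetical helpers, and overwrite is a deliberately non-commutative law added here as a counterexample.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Accumulation law in the form required of F: x <- accu(x, y).
struct add {
    void operator()(int& x, const int& y) const { x += y; }
};

// Counterexample law: the last contribution wins, so the order matters.
struct overwrite {
    void operator()(int& x, const int& y) const { x = y; }
};

// Applies the law F to each contribution, in the order given.
template <class F>
int accumulate_in_order(int initial, const std::vector<int>& contributions) {
    F law;
    int value = initial;
    for (std::size_t i = 0; i < contributions.size(); ++i) law(value, contributions[i]);
    return value;
}

// Returns true if every ordering of the contributions gives the same result.
template <class F>
bool order_independent(int initial, std::vector<int> contributions) {
    std::sort(contributions.begin(), contributions.end());
    const int reference = accumulate_in_order<F>(initial, contributions);
    while (std::next_permutation(contributions.begin(), contributions.end())) {
        if (accumulate_in_order<F>(initial, contributions) != reference) return false;
    }
    return true;
}
```

With add, all orderings of the contributions agree, which is why concurrent cumul calls under one law can be applied in any order; with overwrite they do not, so such a law would not be a valid accumulation function.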
60. ing user-defined structures, the communicability of user-defined classes, and the internal scheduling of tasks by ATHAPASCAN.

The program as a whole can be divided into two parts: the simulation (lifegame.cpp, Message.cpp, Message.h) and the visualization (SappeJuggler.cpp, SappeJuggler.h, NJSocket.cpp, NJSocket.h, GOLApp.cpp, GOLApp.h). The simulation, in particular lifegame.cpp, uses ATHAPASCAN to parallelize the code. The Message class, defined by the other two files, sends the information needed for the visualization through the sockets it creates. The visualization portion of this project contains no parallel code; it is only used for receiving the messages sent by Message.cpp and generating a graphic output with OpenGL from the information received.

lifegame.cpp creates a matrix of cells, each characterized by a boolean state, an integer time, and two integer coordinates (x, y), as defined in the cell_state class. Given the state of the current cell and the current state of the cells surrounding it, the program calculates a new state for the current cell, updates the time variable, and sends this information as a message through a socket to the visualization. The visualization receives this message and displays the matrix of cells. Each cell is displayed with a color corresponding to the time at which the cell was updated and the information sent.

When running this program in ATHAPASCAN's different modes (except sequential), the asynchronous task execution
61. irtual processors.

(5) The result of registration is to associate a unique identifier with the object. This identifier results from the incrementation of a variable local to the virtual processor. So two objects are considered identical if their identifiers are equal, that is to say, if they have been registered at the same rank.

8.4 Usage

Three operations are allowed on an a1::mem object x representing a data of type T:
• The call x.read() returns a constant reference on the data located in x.
• The call x.write(pval) writes the value pointed to by pval into x. The pointer pval, of type T*, must be considered as lost by the programmer.
• The call x.fetch_and_op(int (*f)(T& a, const T& b), val), where the object val is of type const T& and f designates a pointer to the function to be performed. The first parameter of this function will be the data stored in x and the second will be val. The result of this function should be 1 if the data has been modified; otherwise it should be 0.

8.5 Destruction

The destruction of a1::mem objects is managed by the system and occurs:
• when no task possesses it, for objects used as parameters;
• at the end of the program execution, for objects that have been registered.

8.6 Example

The following example (Figure) shows the basic usage of a1::mem objects. The complex type is communicable and has been previously defined in Chapter page

    a1::mem_cc<int> x(0);
    a1::mem_pc<co
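The contract of the function passed to fetch_and_op (modify the stored data in place, return 1 if it changed and 0 otherwise) can be sketched in plain C++. LocalCell below is a hypothetical single-process stand-in, not the library's a1::mem type; in particular, having fetch_and_op return the flag is an assumption made here so the behaviour can be observed.

```cpp
#include <cassert>

// Operation in the form expected by fetch_and_op:
// a is the stored data, b the proposed value; returns 1 if a was modified.
int keep_min(int& a, const int& b) {
    if (a <= b) return 0;
    a = b;
    return 1;
}

// Hypothetical local stand-in for a global data cell holding a T.
template <class T>
class LocalCell {
public:
    explicit LocalCell(const T& initial) : value_(initial) {}

    // Constant reference to the stored data.
    const T& read() const { return value_; }

    // Applies f to (stored data, val) and reports whether the data changed.
    int fetch_and_op(int (*f)(T&, const T&), const T& val) { return f(value_, val); }

private:
    T value_;
};
```

In a Branch & Bound setting, each task would propose its own bound through keep_min; only proposals that improve the stored value modify it, and the 0/1 result tells the caller which case occurred.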
62. is not yet valid.

Example:

    struct add {
        void operator()(int& a, const int& b) const { a += b; }
    };

    struct addToShared {
        void operator()(a1::Shared_cw<add, int> T, int i) {
            T.cumul(i);
        }
    };

    int doit(int argc, char** argv) {
        a1::Shared<int> myShared(new int(10));
        a1::Fork<addToShared>()(myShared, 5);
        return 0;
    }

    int main(int argc, char** argv) {
        // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
        return 0;
    }

Conversion Rules
Because there are several types of Shared data with specific reading and writing capabilities, there are restrictions on how Shared objects may be passed with respect to their access right. Since this material is rather extensive, a separate chapter is devoted to the study of the Shared object, and these rules are not covered in this chapter.

Adding thread information
Some scheduling policies benefit from thread information. ATHAPASCAN uses four variables to determine the best method of execution. Each datum has a default value, but a better scheduling may be obtained if the programmer assigns significant values. The information retained is:
• The cost of the thread: a C++ double.
• The locality of the thread: a C++ int.
• The priority of the thread: a C++ int. Low values denote higher priorities.
• An extra attribute: a C++ double that represents whatever the scheduler wants it to represent.

This information is given at the thread creation:

    a1::F
63. lexicographic semantic, any shared parameter of a task t is tagged in the prototype of t according to the access performed by t on it. This tag indicates what kind of manipulation the task (and, due to the lexicographic order semantics, all the sub-tasks it will fork) is allowed to perform on the shared object. This tag is called the access right; it appears in the prototype of the task as a suffix of the type of any shared parameter. Four rights can be distinguished and are presented below: read, write, update, and accumulation.

7.2.1 Read Right: Shared_r

a1::Shared_r<T> is the type of a parameter whose value can only be read. This reading can be concurrent with other tasks referencing this shared object in the same mode.

In the prototype of a task, the related type is a1::Shared_r<T> x. Such an object gets the method const T& read() const, which returns a constant reference to the value related to the shared object x. For example, using the class complex defined elsewhere in this manual:

    struct print {
        void operator()(a1::Shared_r<complex> z) {
            cout << z.read().x << "i" << z.read().y;
        }
    };

7.2.2 Write Right: Shared_w

a1::Shared_w<T> is the type of a parameter whose value can only be written. This writing can be concurrent with other tasks referencing this shared data in the same mode. The final value is the last one according to the reference order.

In the prototype of a task, the rel
64. lizing a shared object a1::Shared<T> s(d), where d points to an object of type T, this pointer should no longer be used in the program.

5.3 Task Creation

To create an ATHAPASCAN task, the keyword a1::Fork must be added to the usual procedure call. Here, user_task is a function class as described in the previous section.

    C++ function object call      ATHAPASCAN task creation
    user_task()(args);            a1::Fork<user_task>()(args);

The newly created task is managed by the system, which will execute it on a virtual processor of its choice (cf. Appendix).

5.4 Task Execution

The task execution is ensured by the ATHAPASCAN system. The following properties are respected:
• the task execution will respect the synchronization constraints due to the shared memory access;
• all the created tasks will be executed once and once only;
• no synchronization can occur in the task during its execution.

Hence, for most (but not all) implementations of ATHAPASCAN programs, the system guarantees that every shared data accessed for either reading or updating is available in the local memory before the execution of the task begins.

5.5 Kinds of Memory a Task Can Access

Each ATHAPASCAN task can access three levels of memory:
• The stack: a local memory private to the task. This local memory contains the parameters and local variables; it is the classical C or C++ stack. This stack is automatically deallocated at the end of the task.
• The heap: the local memor
65. lt cell gt this cell al Shared r w lt force gt public 1 S int intensity this_cell access evolution f access amp f access reset force intensity 0 T a void cell evolution cell operator ai Shared_r_w lt cell gt this cell 1 this_cell access info time current time this_cell access info my_x x this_cell access info my_y y this_cell access evolution f access f access reset force force const cell contributor intensity contributor is_alive 1 0 void force integration_task operator al Shared_cw lt integration_law force gt f al Shared r lt cell gt c f cumul force c read 3 ai Shared_r_w lt force gt bf actetsdin kdnt_tkimeadijt statimt y T struct QutputInit void operator int nx int ny ai Shared_r_w lt vector lt char gt gt buffer void operator int nx int ny std cout lt lt QutputInit lt lt std endl output new Message output gt SocketEcoute NUM_SOCKET_VRJUGGLER output gt EnvoiMsgInit nx ny std cout lt lt QutputInit lt lt std endl struct QuptutPrintCell void operator al Shared_r_w lt cell gt this_cell std cout lt lt QuptutPrintCell lt lt std end1 output gt Envoyer amp this_cell access info sizeof cell_state J 77777777 10 STREAMS FOR ATHAPASCAN
66. lt lt nb_lines nb_cols display_frequency global_synchronizatfion_frquency lt lt std endl return 0 H doit com argc argv com leave H catch const al lInvalidArgument amp E std cout lt lt Catch invalid arg lt lt std endl H catch const al BadAlloc amp E std cout lt lt Catch bad alloc lt lt std endl catch const ai Exception amp E std cout lt lt Catch E print std cout std cout lt lt std endl H catch std cout lt lt Catch unknown exception lt lt std endl H return 0 H Message cpp fe c projet SAPPE al Fork lt cell evolution_cell gt OCR board i j C board i j forces i j t i yy author F Zara if t display frequency 0 ai Fork lt OuptutPrintCell gt board i j H H if t global synchronization frequency 0 std cout lt lt Before sync t lt lt t lt lt std endl com sync std cout lt lt After sync t lt lt t lt lt std endl H return 0 main entry point Athapascan initialization int main int argc char argv file Message C Definition des messages envoyes entre le programme lut et le programme Athapascan 1 include lt stdio h gt include lt iostream gt include lt sys types h gt include lt sys ipc h gt include lt sys sem h gt include lt sys ioctl h gt include lt sys socket h gt
67. <swap_shared_array<T> >()((a1::Shared<myArray<T> >&) *this, i1, i2);
        }
    };

    // ostream operator
    template<class T>
    ostream& operator<<(ostream& out, const shared_array<T>& z) {
        a1::Fork<ostream_shared_array<T> >()((a1::Shared<myArray<T> >) z);
        return out;
    }

The following main file tests the shared class. As you can see, there is no longer any reference to specific parallel code.

    #include "athapascan-1.h"
    #include "sharedArray.h"

    #define SIZE 100

    int doit(int argc, char** argv) {
        shared_array<int> t1(10), t2(20);
        myArray<int> tab(SIZE);

        // fill the array
        for (int i = 0; i < SIZE; i++)
            tab.elts[i] = i;

        // resize the shared array to test the methods
        t1.resize(SIZE);

        // move the data to the shared array
        t1 = tab;

        // try to swap a data
        t1.swap(2, 27);

        // append another shared array
        t2 = tab;
        t1.append(t2);

        cout << t1 << endl;
        return 0;
    }

    int main(int argc, char** argv) {
        // MAIN METHOD AS PREVIOUSLY DEFINED
        return 0;
    }

8 Other Global Memory Paradigm

Access to shared data involves task synchronization: tasks are unable to perform side effects. In some applications, like Branch & Bound, it is convenient to share a variable with all other tasks, that is, data that can be read and written by anybody. This varia
68. luceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex (France)
INRIA Sophia Antipolis research unit: 2004 route des Lucioles, BP 93, 06902 Sophia Antipolis Cedex (France)
Publisher: INRIA, Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex (France)
http://www.inria.fr
ISSN 0249-0803
69. mplex> z(0);

    int min(int& a, const int& b) {
        if (a < b) return 0;
        a = b;
        return 1;
    }

    task T1(a1::mem_ac<double> f) {
        f.fetch_and_op(&min, 0.01);
    }

    task T2() {
        if (x.read() > 5)
            z.write(new complex(1.2, 2.5));
    }

    int a1_main(int argc, char** argv) {
        a1_system::init();
        x.register(new int(1));
        z.register(new complex());
        a1_system::init_commit();
        a1_system::terminate();
        return 0;
    }

Figure 5: Basic usage of a1::mem objects

8.7 Consideration on Global Data

Global data permits side effects to occur; therefore, the guarantee of a sequential execution is not maintained if these data are used.

9 Examples

In this chapter we present several complete examples of ATHAPASCAN programs. These examples are simple enough to be presented in full within the confines of this chapter, and complex enough to demonstrate the use of ATHAPASCAN in the context of real-world applications. All these examples come with the library distribution.

9.1 Simple Parallel Quicksort

The aim of this implementation is to sort an array of data on two processors using an algorithm based upon a pivot. This implementation uses the class myArray (a resizable array) as well as the shared_array class, both defined earlier in this manual. What we wish to show here is how to embed parallel instructions in the classes representing a user's data structures. Programmin
70. n this document will define a method doit. It is to be assumed that the program is executed with a main as defined above. That is to say, the main method will not be shown in the examples.

4.1.2 Fork

Fork is the keyword used by ATHAPASCAN to define tasks that are to be parallelized. To Fork a task, one must:
• write the code to be parallelized: overload the operator() of the class to be Forked. Syntax:

    struct my_task {
        void operator()(/* formal parameters */) {
            /* task body */
        }
    };

• invoke the task: call the method Fork. Syntax:

    a1::Fork<my_task>()(/* effective parameters */);

Example:

    struct PrintHello {
        void operator()(int id) {
            printf("Hello world from task number %d\n", id);
        }
    };

    int doit(int argc, char** argv) {
        int i = 0;
        a1::Fork<PrintHello>()(i);
        return 0;
    }

    int main(int argc, char** argv) {
        // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
        return 0;
    }

NB: All the formal parameters must be made communicable (cf. Chapter).

• parameters: A task parameter can be:
1. a regular object or variable. Ex: a1::Fork<myTask>()(class T), with T communicable.
2. a Shared data that can be used by different tasks. Ex: a1::Fork<myTask>()(a1::Shared_r<myClass> T), with myClass communicable. A Shared data can have different access rights (cf. Chapter). NB: Shared data must be initialized. A runtime error occurs if this is not done.
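The two-step pattern above (a class overloading operator(), then a call of the form Fork<task>()(parameters)) can be tried without the library. The sketch below is plain C++ under a loud assumption: SeqFork is a hypothetical stand-in that runs the task immediately in the calling thread, so it only reproduces the call shape and the sequential meaning of the program, not any parallel execution. RecordId and run_tasks are likewise illustrative names.

```cpp
#include <cassert>
#include <cstdio>
#include <utility>
#include <vector>

// Hypothetical sequential stand-in for Fork: same call shape, immediate execution.
template <class Task>
struct SeqFork {
    template <class... Args>
    void operator()(Args&&... args) const {
        Task()(std::forward<Args>(args)...);
    }
};

// Task written as a function class, as in the example above.
struct PrintHello {
    void operator()(int id) const {
        std::printf("Hello world from task number %d\n", id);
    }
};

// Second task used to observe which ids were processed.
// (A raw pointer parameter is acceptable only in this local sketch.)
struct RecordId {
    void operator()(int id, std::vector<int>* log) const { log->push_back(id); }
};

// Creates `count` pairs of tasks and returns the ids seen, in creation order.
std::vector<int> run_tasks(int count) {
    std::vector<int> log;
    for (int i = 0; i < count; ++i) {
        SeqFork<PrintHello>()(i);
        SeqFork<RecordId>()(i, &log);
    }
    return log;
}
```

Keeping the task body in a function class is what lets the same code be invoked either directly, as here, or through a runtime that decides where and when to run it.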
71. ng.h>
    #include "athapascan-1.h"

    /* class myArray is an Athapascan-1 communicable class
       implementing a resizable myArray.
       NB: T has to be communicable too. */
    template<class T>
    class myArray {
    public:
        unsigned int size;  // The size of the myArray
        T* elts;            // ith entry is accessed by elts[i]

        // empty constructor
        myArray() : size(0), elts(0) {}

        // constructor
        myArray(unsigned int k) : size(k) {
            if (size == 0) elts = 0;
            else elts = new T[size];
        }

        // copy constructor
        myArray(const myArray<T>& a) {
            size = a.size;
            elts = new T[size];
            for (int i = 0; i < size; i++) elts[i] = a.elts[i];
        }

        // destructor
        ~myArray() { delete[] elts; }

        // resize the myArray
        void resize(unsigned int newSize);
    };

    // Packing operator
    template<class T>
    a1_ostream& operator<<(a1_ostream& out, const myArray<T>& z) {
        out << z.size;
        for (int i = z.size - 1; i >= 0; i--) out << z.elts[i];
        return out;
    }

    // Unpacking operator
    template<class T>
    a1_istream& operator>>(a1_istream& in, myArray<T>& z) {
        in >> z.size;
        z.elts = new T[z.size];
        for (int i = z.size - 1; i >= 0; i--) in >> z.elts[i];
        return in;
    }

    template<class T>
    void myArray<T>::resize(unsigned int newSize) {
        // erasing the data
        if (newSize <= 0) {
            size = 0;
            if (elts != NULL) delete[] elts;
            elts = 0;
            return;
        }
        // new myArray is smaller
72. nitial_community(argc, argv);
            std::cout << "count argc " << argc << std::endl;
            if (argc != N) {
                for (int i = 0; i < argc; ++i)
                    std::cout << argv[i] << std::endl;
                std::cout << "usage: " << argv[0] << " PROPER USAGE" << std::endl;
                return 0;
            }
            doit(argc, argv);
            com.leave();
        }
        catch (const a1::InvalidArgument& E) {
            std::cout << "Catch invalid arg" << std::endl;
        }
        catch (const a1::BadAlloc& E) {
            std::cout << "Catch bad alloc" << std::endl;
        }
        catch (const a1::Exception& E) {
            std::cout << "Catch: ";
            E.print(std::cout);
            std::cout << std::endl;
        }
        catch (...) {
            std::cout << "Catch unknown exception" << std::endl;
        }
        return 0;
    }

The function doit should contain the code to be executed in parallel. The function main, in this case, simply creates the community, executes doit, and catches any exceptions thrown. The variable N represents the constant number of inputs needed to run the program; when coding this line, replace N with the desired number. If the wrong number of inputs is specified at run time, the program should terminate and output a message describing the proper usage of the program. It is necessary to execute the com.leave() function to allow the ATHAPASCAN application to terminate. Calling doit from main simplifies the code, making it easier to read.

Note: From this point on, all examples contained i
73. ntaining the include and library files.

An ATHAPASCAN program is executed in the same manner as any other program launched from the command line. For example, a common execution may resemble:

    sh program_name <inputs>

4.2 Basic Examples

This section is a brief tutorial on how to build simple ATHAPASCAN programs; it proceeds by teaching through examples. The two examples presented in this section are getInfoTask.cpp, a program that demonstrates how to retrieve system information, and Fibonacci.cpp, a program which computes the Fibonacci number of a given input. More thorough examples are offered in the Examples chapter, but they use concepts that have not been discussed thus far.

4.2.1 Simple Example 1: getInfoTask.cpp (1 Fork, 0 Shared)

Let's start with an example using the ATHAPASCAN keyword Fork. We will see later how to use Shared. Assume we want to print to the console data about the task execution, for example which processor is involved, which node number, etc. Here is a basic example code:

    #include "athapascan-1.h"

    struct getInfoTask {
        void operator()(int i) {
            cout << "Task number: " << i << endl;
            cout << "Node number: " << a1_system::self_node()
                 << " out of " << a1_system::node_count() << endl;
            cout << "Thread number: " << a1_system::self_thread()
                 << " out of " << a1_system::thread_count() << endl;
        }
    };

    int doit(int argc, char** argv)
74. one pow 1 char title 100 sprintf title Next draw Mandelbrot d old zone Dog set_title title break case t case T if key t _old_zone _thr 10 else _old_zone _thr 10 int nb if old zone thr lt 1 _old_zone _thr 10 nb int ceil log double _reg _w _old_zone _thr log 2 0 nb int ceil exp nb log 2 0 char title 100 sprintf title Next draw threshold jd gt d blocks _old_zone _thr nb nb set_title title break case region r 2 _reg _w 5 2 reg h 5 _reg _w 5 _reg _h 5 _new_zone old zone new zone xi old zone xi _old_zone scale_x r new zone xf new zone xi _old_zone scale_x r new zone yi old zone git _old_zone scale_y r _y neg zone yf new zone yit _old_zone scale_y r cont 0 Pixmap buff XCreatePixmap _dpy groot reg w reg h depth copy _buff r buff reg NVOSVdVHLY clean XCopyArea _dpy buff _buff _gc 0 0 _reg _w _reg _h 0 0 XCopyArea _dpy _buff _xwin _gc 0 0 _reg _w _reg _h _caption 0 0 XFreePixmap _dpy buff break case 2 2 double dx 2 _old_zone _xf _old_zone _xi double dy 2 _old_zone _yf old zone yi _new_zone old zone new zone xi old zone xi dx new zone xf old zone xf t dx new zone yi old zone yi dy new zone yf old zone yft dy region r 0 0 reg w 5 reg h 5 Pixmap buff XCreatePi
75. online on our web site.

4.1 Overview of ATHAPASCAN

4.1.1 Starting an ATHAPASCAN Program

The execution of an ATHAPASCAN program is handled by a community. A community restructures a group of nodes (Inuktitut processes) so that they can be distributed to the different parallel machines. Therefore, prior to the declaration of any ATHAPASCAN object or task, a community must be created. Currently, this community only contains the group of nodes defined at the start of the application. A community is created by executing the instruction:

    a1::Community com = a1::System::create_initial_community(argc, argv);

Once the community has been created, the following methods are available:
• com.is_leader() returns true on one and only one of the nodes (processes) of that community.
• com.sync() waits for all of the created tasks to finish executing on the collection of nodes before execution on this node resumes.
• com.leave() tells the current process to leave that community.

Usually a community is defined in the main method of the program. To ensure the proper creation of a community, it is also necessary to catch any exceptions that might be thrown by the initialization procedures. The skeleton of an ATHAPASCAN program should resemble the following block of code:

    int doit(int argc, char** argv) {
        return 0;
    }

    int main(int argc, char** argv) {
        try {
            a1::Community com = a1::System::create_i
76. ork<user_task>(SchedAttributes(infos))(<parameters>);

where infos represents the list of the four possible thread attributes. Here is an example of how to use the scheduling attributes:

    a1::Fork<user_task>(SchedAttributes(prio, cost, loc, extra))(<parameters>);

Note that if both a specific scheduler and some information are given, the order in which they are passed is not important:

    a1::Fork<user_task>(SchedAttributes(prio, cost, loc, extra), sched_group)(<parameters>);
    a1::Fork<user_task>(sched_group, SchedAttributes(prio, cost, loc, extra))(<parameters>);

4.1.4 System Information

It is possible to get the following runtime information:
• a1_system::node_count() returns the number of nodes.
• a1_system::self_node() returns the identification number of the node on which the function is invoked (an integer from 0 to a1_system::node_count() - 1).
• a1_system::thread_count() returns the number of a0 threads dedicated to thread execution on the virtual processor.
• a1_system::self_thread() returns the identification number of the a0 thread that hosts the execution of the thread (an integer from 0 to a1_system::thread_count() - 1).

4.1.5 Compilation and Execution

The compilation of an ATHAPASCAN program is performed by the make command, using the Makefile created upon installation. Be sure to modify the Makefile as needed to personalize the folders co
77. pc<T> x(T* pval); for processor consistency
• a1::mem_ac<T> x(T* pval); for asynchronous consistency

The type T must be communicable, and pval designates a pointer to the initial value assumed by the object. This pointer can be null and is entirely managed by the system; that is to say, the pointer must be considered as lost by the programmer. This declaration is permitted anywhere in the code. An object of type a1::mem can be used as a parameter of a task or declared globally. If received as a parameter, the scope of the variable is limited to the procedure's body, whereas it has the scope of the entire code if it is declared globally.

8.3 Registration

Due to some implementation characteristics, the global data have to be registered, effectively linking all the representatives (one per processor) to the same global data. If the object is used in the task parameters, this registration is made automatically. Otherwise, if the object is declared globally, the registration must be performed manually.

Manually registering global data is done by invoking the register(pval) method on each object during the initialization phase, between the a1_system::init() and a1_system::init_commit() invocations. The pval parameter designates a pointer to the initial value taken by the object. This pointer can be null and is entirely managed by the system; that is to say, it must be considered as lost by the programmer. The order of invocation must be the same on all v
78. program free from parallel instructions, making it easier to write and understand.

In Chapter 5 3 3 we wrote a communicable class implementing a resizable communicable array called myArray. We are now going to write a shared data structure on top of this class.

    #include "athapascan-1.h"
    #include "resizeArray.h"

    /* class shared_array is a class hiding Athapascan code so that
       the main code of the application could be written as if it was
       sequential. It is based upon the resizable array class called myArray. */

    // resize the shared array
    template<class T>
    struct resize_shared_array {
        void operator()(a1::Shared_r_w<myArray<T> > tab, unsigned int size) {
            tab.access().resize(size);
        }
    };

    // affect a local myArray to a shared_array
    template<class T>
    struct equal_shared_array {
        // NB: we use a read-write access because we need the size
        void operator()(a1::Shared_r_w<myArray<T> > shtab, myArray<T> tab) {
            myArray<T>* t = new myArray<T>(tab);
            t->resize(shtab.access().size);
            shtab.access() = *t;
        }
    };

    // append a shared array to a shared array
    template<class T>
    struct append_shared_array {
        void operator()(a1::Shared_r_w<myArray<T> > t1, a1::Shared_r<myArray<T> > t2) {
            int i = t1.access().size;
            int k = i + t2.read().size;
            t1.access().resize(k);
            for (int j = i; j < k; j++)
                t1.access().elts[j] = t2.read().elts[j - i];
        }
    };
79. read()));
        }
    };

    struct fibonacci {
        // This procedure recursively computes fibonacci(n), where n is an int,
        // and writes the result in the shared memory.
        void operator()(int n, a1::Shared_w<int> res) {
            if (n < 2)
                res.write(new int(n));
            else {
                a1::Shared<int> res1(0);
                a1::Shared<int> res2(0);
                a1::Fork<fibonacci>()(n - 1, res1);
                a1::Fork<fibonacci>()(n - 2, res2);
                a1::Fork<add>()(res1, res2, res);
            }
        }
    };

    struct print {
        // This procedure writes to stdout the result of fibo(n), where n is an int,
        // stored in the shared memory.
        void operator()(int n, a1::Shared_r<int> res) {
            cout << "Fibonacci(" << n << ") = " << res.read() << endl;
        }
    };

    int doit(int argc, char** argv) {
        a1::Shared<int> res(new int(0));
        a1::Fork<fibonacci>()(atoi(argv[1]), res);
        a1::Fork<print>()(atoi(argv[1]), res);
        return 0;
    }

    int main(int argc, char** argv) {
        // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
        return 0;
    }

Explanation

The purpose of this exercise is to share the Fibonacci computation with other processors. Therefore, we need to define a Shared data in order to hold the result, and to create a task calling the Fibonacci function. The fibonacci task works the same way the sequential function works: first we check if n < 2; if not, we make recursive calls to get F(n-1) and F(n-2)
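For comparison, the same task structure can be written as ordinary sequential C++. This is a sketch only: each forked task is replaced by a direct function call and each Shared datum by a plain int passed by reference, so it shows the data flow between the three steps rather than any parallel execution. The names add_task, fibonacci_task, and fibonacci are hypothetical.

```cpp
#include <cassert>

// Stand-in for the add task: reads a and b, writes their sum into res.
void add_task(const int& a, const int& b, int& res) { res = a + b; }

// Mirrors the fibonacci task: res plays the role of the write-only shared result.
void fibonacci_task(int n, int& res) {
    assert(n >= 0);
    if (n < 2) {
        res = n;
        return;
    }
    int res1 = 0;  // would be a Shared<int> in the ATHAPASCAN version
    int res2 = 0;
    fibonacci_task(n - 1, res1);  // independent of the next call
    fibonacci_task(n - 2, res2);
    add_task(res1, res2, res);    // needs both partial results
}

// Convenience wrapper returning the n-th Fibonacci number.
int fibonacci(int n) {
    int res = 0;
    fibonacci_task(n, res);
    return res;
}
```

The two recursive calls touch disjoint data, which is why they can become concurrent tasks, while add_task reads both partial results and therefore has to wait for them.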
80. relating to the messages exchanged between the A1 program and the visualization

    class Message {
    public:
        // Empty constructor
        inline Message();
        // Function that gets rid of the service processes when they terminate:
        // it is enough for the server to ignore the SIGCLD signal
        void sigchld();
        // Creates a socket; the parameter is the desired port number;
        // the port number is returned as the result
        int CreerSock(int port, int type);
        // Creation of the listening socket
        void SocketEcoute(int port);
        // Recevoir (receive): reads size bytes from the socket fd and returns
        // 0 if the connection was cut, or size
        int Recevoir(char* buf, int size);
        int Recevoir(int* buf, int size);
        int Recevoir(cell_state* buf, int size);
        // Envoyer (send): sends size bytes on the socket fd and returns
        // 0 if the connection was cut, or size
        int Envoyer(char* buf, int size);
        int Envoyer(int* buf, int size);
        int Envoyer(cell_state* buf, int size);
        // Sends the characteristics of the blocks of the A1 program
        void EnvoiMsgInit(int nx, int ny);
        // Destructor of the class Message
        inline ~Message();
    private:
        // Descriptor of the created socket
        int _desc;
        // Pending connection associated with the socket
        int _sock_service;
        // Listening socket
        int _sock_ecoute;
    };
    #endif

SappJuggler.cpp, NJSocket.cpp: (c) projet SAPPE, Author: F. Zara, modified by Joe He
81. roygbiv 2 255 tim 1280 roygbiv 3 0 if sock NULL 1 sock gt recv char next_cells nx ny sizeof cell_state J DISPLAY OF VISUALIZATIONS 11 11 11 11 11 11 1111 11 111 1 1117 sock gt recv cell_state next_cells sizeof cell_state POSSIBLE ERROR CAUSER glTranslatef 0 0 6 cell_state ok glBegin GL_QUADS ok state 1 glColor4ubv roygbiv ok time 1 glVertex2i 2 cur gt my_x 2 cur gt my_y ok my_x 1 glVertex2i 2 cur gt my_x 1 2 cur gt my_y ok my_y 1 glVertex2i 2 cur gt my_x 1 2 cur gt my_y 1 sock gt send amp ok sizeof cell_state glVertex2i 2 cur gt my_x 2 cur gt my_yt1 i char ok 1 1 sock gt send amp ok sizeof char glEnd glPopMatrix H TZ void GOLApp postFrame include lt sys ipc h gt include lt sys sem h gt include lt sys socket h gt if sock NULL include lt netinet in h gt cell_state temp cells include lt netdb h gt cells next_cells next_cells temp Pour la visualisation include cell_state cpp H include NJSocket h GOLApp h typedef struct cell state char state int time c projet SAPPE int myx Author F Zara int my_y modified by Joe Hecker for use in lifgame file ParticulesApp h Application permettant la visualisation des particules Application permettant la visualisation des particules Mi ai class GOLApp public vjGlApp public
82. seem as though the program was implemented according to the sequential depth-first algorithm:

    a1::Fork<add>()(i, 2);
    a1::Fork<add>()(i, 1);

This is not the case. Naturally, the above code is semantically correct as well, and could produce the same result as the program in ... It is therefore important to realize that, since the function F is associative and commutative, the precise order in which the reductions are performed cannot be predicted, even in the case where the initial values are known.

7.3 Shared Access Modes

In order to improve the parallelism of a program when only a reference to a value is required, and not the value itself, ATHAPASCAN refines its access rights to include access modes. An access mode categorizes data by restricting certain types of access to the data. By default, the access mode of a shared data object is immediate, meaning that the task may access the object using any of the write, read, access or cumul methods during its execution. An access is said to be postponed (access right suffixed by p) if the procedure will not directly perform an access on the shared data, but will instead create other tasks that may access the shared data. In functional languages, such parallelism appears when handling a reference to a future value. With this refinement to the access rights, ATHAPASCAN is able to decide with greater precision whether or not two pro
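The remark about associativity and commutativity can be checked on a small case. The sketch below is plain C++ rather than Athapascan code; `cumulate`, `order_independent` and the counterexample `sub_from` are hypothetical names introduced here, and `add` only imitates the shape of the manual's cumulation function. It applies the contributions in every possible order and compares the final values.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Cumulation function in the style of the manual's `add`.
struct add {
    void operator()(int& acc, const int& v) const { acc += v; }
};

// A deliberately non-commutative update, used as a counterexample.
struct sub_from {
    void operator()(int& acc, const int& v) const { acc = v - acc; }
};

// Applies the cumulation function F to the contributions in the given order.
template <class F>
int cumulate(int initial, const std::vector<int>& contributions) {
    int acc = initial;
    F f;
    for (int v : contributions) f(acc, v);
    return acc;
}

// Returns true when every ordering of the contributions yields the same
// final value, which is what an associative and commutative F guarantees.
template <class F>
bool order_independent(int initial, std::vector<int> contributions) {
    assert(contributions.size() <= 8);  // keep the number of orderings small
    std::sort(contributions.begin(), contributions.end());
    const int expected = cumulate<F>(initial, contributions);
    do {
        if (cumulate<F>(initial, contributions) != expected) return false;
    } while (std::next_permutation(contributions.begin(), contributions.end()));
    return true;
}
```

With `add` the final value does not depend on the order of the updates, so a runtime is free to schedule them as it likes; with `sub_from` two orderings already disagree, which is why such a function would not be a valid choice for a cumulative access.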
83. state alive define NUM_SOCKET_VRJUGGLER 60125 void evolution const force i namespace AC ACOM_NET_NAMESPACE if Cinfo state dead amp amp i intensity gt 3 info state alive T else if info state alive amp amp i intensity gt 4 i intensity lt 1 info state dead Simple life game Encapsulation of evolution into a function class for Athapascan class cell class force struct evolution_cell al 0Stream amp operator lt lt al 0Stream amp ostr const force amp A void operator al Shared_r_w lt cell gt this cell ai Shared_r_w lt force gt f int current Kine int x int ai IStream amp operator gt gt ai IStreamk amp istr force amp A T ai 0Stream amp operator lt lt ai 0Streamk amp ostr const cell amp al IStream amp operator gt gt ai IStream amp istr cell amp A struct evolution task void operator al Shared_r_w lt cell gt this cell al Shared_r_w lt force gt f 3 AI III III 11 111 1111111171 1 7 7 777 BEGIN FORCE STRUCT W RER ARR Et class force The intensity of the force brought by a ceil c has intensity 1 is c is alive 0 else 11111114111111 11 ND CELL HEAD 111111111111111111111111 The intensity of the force brought by a set of ceil is the integration sum of the intesities of the forces of each cell void cell evolution task operator ai Shared_r_w
84. t my shared_array lt T gt amp a int start int size al Fork lt copy2_my shared_array lt T gt gt 90 al Shared lt myArray lt T gt gt amp this al Shared lt myArray lt T gt gt amp a start size merge arrays lt to a certain value so that they are sorted Not optimized void merge my shared_array lt T gt amp a 10 d YOON this gt append a this gt qsort T mainQSORT cpp include mySharedArray h void usage cerr lt lt usage qsort lt size_of_array gt lt lt endl exit 1 template lt class T gt void myQsort myArray lt T gt t int size t gt size int halfl abs size 2 we split the array in 2 smaller my_shared_arrays my shared_array lt int gt t1 t2 tl new my shared_array lt int gt half1 t2 new my shared_array lt int gt size halfl k t2 gt copy t half1 size half1 we sort the 2 shared_array using the standard C qsort t1 gt qsort t2 gt gqsort we search for a pivot al Shared lt T gt pivot new T t1 gt find Pivot pivot we split each shared array in 2 pieces lt pivot and gt pivot lt pivot stay in initial shared array gt pivot goes to a new array int halt abs half1 2 my shared_array lt int gt t3 t4 t3 new my shared_array lt int gt half12 t4 new my shared_array lt int gt size half half12 100 t3 gt copy t1 hali half12 half1
85. tecture in order to compare the performances, you will realize that it can take even more time to execute in parallel than sequentially. This is normal behavior for these kinds of programs.

    #include "athapascan-1.h"
    #include <iostream.h>
    #include <stdlib.h>

    struct add {
        // this method is instantiated by the cumul method of the
        // concurrent-write Shared data (see fibonacci)
        void operator()(int& x, const int& a) { x += a; }
    };

    // sequential fibonacci
    int fibo_seq(int n) {
        if (n < 2) return n;
        else return fibo_seq(n - 1) + fibo_seq(n - 2);
    }

    struct fibonacci {
        void operator()(int n, int threshold, a1::Shared_cw<add, int> r) {
            if (n < threshold) r.cumul(fibo_seq(n));
            else {
                a1::Fork<fibonacci>()(n - 2, threshold, r);
                fibonacci()(n - 1, threshold, r);
            }
        }
    };

    struct print {
        // This procedure writes to stdout the result of fibo(n), where n is an
        // int, in the shared memory.
        void operator()(int n, a1::Shared_r<int> res) {
            cout << "Fibonacci(" << n << ") = " << res.read() << endl;
        }
    };

    int doit(int argc, char** argv) {
        a1::Shared<int> res(new int(0));
        a1::Fork<fibonacci>()(atoi(argv[1]), atoi(argv[2]), res);
        a1::Fork<print>()(atoi(argv[1]), res);
        return 0;
    }

    int main(int argc, char** argv) {
        // MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER
        return 0;
    }
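The threshold idea in the listing above can be reproduced with the standard library alone. This is a sketch under stated assumptions, not the library's mechanism: `std::thread` stands in for `Fork`, a `std::atomic<long>` stands in for the cumulative shared data, `fetch_add` stands in for `cumul`, and the names `fibo_cumul` and `fibo_parallel` are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>

// Sequential Fibonacci, used below the threshold.
long fibo_seq(int n) { return n < 2 ? n : fibo_seq(n - 1) + fibo_seq(n - 2); }

// Hypothetical analogue of the thresholded task. Every leaf of the recursion
// adds its sequential result to `r`; since F(n) = F(n-1) + F(n-2), the sum of
// all leaf contributions is F(n).
void fibo_cumul(int n, int threshold, std::atomic<long>& r) {
    assert(threshold >= 2);  // a smaller threshold would recurse below n = 0
    if (n < threshold) {
        r.fetch_add(fibo_seq(n));  // role of r.cumul(fibo_seq(n))
        return;
    }
    // One branch is forked, the other runs in the current thread.
    std::thread forked(fibo_cumul, n - 2, threshold, std::ref(r));
    fibo_cumul(n - 1, threshold, r);
    forked.join();
}

// Convenience wrapper: creates the accumulator, runs the task, reads it back.
long fibo_parallel(int n, int threshold) {
    std::atomic<long> r{0};
    fibo_cumul(n, threshold, r);
    return r.load();
}
```

A larger threshold means fewer, coarser tasks. With a small threshold the cost of creating threads quickly dominates the additions, which matches the manual's remark that the parallel version can be slower than the sequential one.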
86. tion information.

Q: An ATHAPASCAN internal error occurs at execution. What can I do to correct this error?
A: If you are using MPI-LAM, please clean up and reboot LAM before executing your ATHAPASCAN program. If the problem persists, please follow the instructions on the ATHAPASCAN webpage.

Q: The compiler doesn't find a task corresponding to my a1::Fork instruction. Why could this be?
A: Make sure that all the shared modes and rights are compatible.
A: Make sure the procedure does not have too many arguments. If it does, recompile your library after having increased the authorized number of parameters at configuration (option nbp of the configure script).

Q: I have tried all the previous suggestions and I still have some errors. What shall I do?
A: Send an e-mail to Jean-Louis.Roch@imag.fr stating your problem.

Unité de recherche INRIA Rhône-Alpes, 655 avenue de l'Europe, 38330 Montbonnot-St-Martin (France)
Unité de recherche INRIA Futurs: Domaine de Voluceau, Rocquencourt, BP 105, 78153 Le Chesnay Cedex (France)
Unité de recherche INRIA Lorraine: LORIA, Technopôle de Nancy-Brabois, Campus scientifique, 615 rue du Jardin Botanique, BP 101, 54602 Villers-lès-Nancy Cedex (France)
Unité de recherche INRIA Rennes: IRISA, Campus universitaire de Beaulieu, 35042 Rennes Cedex (France)
Unité de recherche INRIA Rocquencourt: Domaine de Vo
87. tr LocalMat amp A The main function for int i 0 i lt 50 i int doit int argc char argv for int j 0 j lt 50 j istr gt gt A tab il j Shared lt LocalMat gt A 10 return istr Shared lt LocalMat gt R 10 ved for int i 0 i lt 10 i ALi new Shared lt LocalMat gt 10 Sequential Computation of C B R i new Shared lt LocalMat gt 10 struct seqmatrixadd for int j 0 j lt 10 j void operator Shared_r lt LocalMat gt A Shared_r lt LocalMat gt B Shared_w lt LocalMat gt C LocalMat Temp ALi j Shared lt LocalMat gt new LocalMat const LocalMat amp my_A A read R i j Shared lt LocalMat gt new LocalMat const LocalMat amp my_B B read for int i 0 i 50 i ParSquareMat R A 10 H lt for int j 0 j lt 50 j Temp tab il j my A tab il j my B tab il j return 0 by H C write new LocalMat Temp S int main int argc char argv LS S MAIN METHOD AS PREVIOUSLY DEFINED IN API CHAPTER return 0 NVOSVdVHLY 6G 60 Roch amp al SCC Ae ATHAPASCAN 61 10 Culminating Example lifegame cpp The lifegame program was developed to provide a visualization of the asynchronous task execution performed by ATHAPASCAN This program serves as an example for most of the concepts covered in this manual passing and declariation of Shared data Fork
88. types h class win_proc_mand public win_proc public int init zone z0 int caption int nb proc int nb threads protected virtual int x_resize int width int height TS bk ga EC GG int _col new int col size 1 Zort int i 0 i lt col size i endif _col i col i read win_gest Xenter mandel C w_mand draw r _col w_proc draw r node thread include lt stream h gt win_gest Xleave include lt stdlib h gt delete _col include lt stdio h gt H include lt math h gt E include lt athapascan 1 h gt struct mandel const char graph_name const return mandel include win_gest h mandel include win h include win mand h void operator win region r zone z int nb col include win proc mand h H include types h if r _w lt z thr Param_array lt Shared lt int gt gt col r _w r _h static win mand w_mand static win_proc_mand w_proc compute_region r z col nb col Fork lt display_region gt SchedAttributes 1 0 10 1 ai_mapping fixed ai_system self_nbde class my_sch public ai_mapping group ai_system self_thread r col public else int priority return 10 int i j Zort j 0 j lt 2 j for i 0 i lt 2 i int xy2color double x double y int n int it int nb col win region rij r x ixr _w 2 r y j r _h 2 int color 0 r _w 2 ix r _w42 complex z 0
89. uct seqmatrixmultiply all matrix void operator Shared_r lt LocalMat gt A Shared_r lt LocalMat gt B Shared_w lt LocalMat gt C LocalMat Temp matrix matrix o const LocalMat amp my_A A read const LocalMat amp my B B read clean rm o Matrix for int i 0 i lt 50 i oO for int j Temp tab i j 0 for int k 0 k lt 50 k Temp tab i j my_A tab i k my_B tab k j 3 j lt 50 j include lt athapascan 1 h gt A basic two dimensional matrix type H struct LocalMat double tab 50 50 C write new LocalMat Temp LocalMat By Local Hart const LocalMat amp A for int i 0 i lt 50 i for int j 0 j lt 50 j Parallel Computation of a matrix product R R A A tabli j A tab i j void ParSquareMat Shared lt LocalMat gt R Shared lt LocalMat gt A int dim 3 Zort int i 0 i lt dim i Zort int j 0 j lt dim j for int k 0 k lt dim k Shared lt LocalMat gt tmp new LocalMat al ostream amp operator lt lt ail_ostream amp ostr const LocalMat amp A Fork lt seqmatrixmultiply gt A i k A k j tmp for int i 0 i lt 50 i Fork lt seqmatrixadd gt R i j tmp R i j for int j 0 j lt 50 j H ostr lt lt A tab i j T return ostr ai_work_steal basic al istream amp operator gt gt al_istream amp is
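The block decomposition used by ParSquareMat can be followed in a purely sequential sketch. The code below is standard C++ and makes several assumptions that are not in the manual: blocks are 2 x 2 instead of 50 x 50, the result starts from zero blocks, and `make_block`, `multiply_blocks`, `add_blocks` and `square_blocks` are hypothetical names. In the Athapascan version, each call to the multiply and add steps is a forked task on shared blocks.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t N = 2;  // block size; the manual's LocalMat is 50 x 50
using LocalMat = std::array<std::array<double, N>, N>;
using BlockMat = std::vector<std::vector<LocalMat>>;

// Helper for building one 2 x 2 block (only meaningful for N == 2).
LocalMat make_block(double a, double b, double c, double d) {
    static_assert(N == 2, "make_block assumes 2 x 2 blocks");
    LocalMat m{};
    m[0][0] = a; m[0][1] = b; m[1][0] = c; m[1][1] = d;
    return m;
}

// Role of seqmatrixmultiply: product of two blocks.
LocalMat multiply_blocks(const LocalMat& a, const LocalMat& b) {
    LocalMat c{};  // value-initialized, so every entry starts at 0
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            for (std::size_t k = 0; k < N; ++k)
                c[i][j] += a[i][k] * b[k][j];
    return c;
}

// Role of seqmatrixadd: sum of two blocks.
LocalMat add_blocks(const LocalMat& a, const LocalMat& b) {
    LocalMat c{};
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j)
            c[i][j] = a[i][j] + b[i][j];
    return c;
}

// Block-wise square: R[i][j] is the sum over k of A[i][k] * A[k][j].
BlockMat square_blocks(const BlockMat& A) {
    const std::size_t dim = A.size();
    for (const auto& row : A) assert(row.size() == dim);  // square layout only
    BlockMat R(dim, std::vector<LocalMat>(dim, LocalMat{}));
    for (std::size_t i = 0; i < dim; ++i)
        for (std::size_t j = 0; j < dim; ++j)
            for (std::size_t k = 0; k < dim; ++k)
                R[i][j] = add_blocks(R[i][j], multiply_blocks(A[i][k], A[k][j]));
    return R;
}
```

The products for different (i, j) pairs are independent of each other, while the additions into one R[i][j] must be ordered; this is the dependency pattern a dataflow runtime can extract from the declared read and write accesses.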
90. unction, question or comment that may arise. More information about ATHAPASCAN and the APACHE project can be found online at http://www-apache.imag.fr. The user can subscribe to the following mailing lists:
• http://listes.imag.fr/wws/info/id_a1_hotline to get help from the ATHAPASCAN group about installation or programming pitfalls, or to report bugs;
• http://listes.imag.fr/wws/info/id_a1_devel to reach the developers of the ATHAPASCAN group about implementation details (questions, remarks, design).

The authors thank all the people who have worked on this project:
PhD students: François Galilée, Mathias Doreille, Gerson Cavalheiro, Nicolas Maillard.
Engineers and students: Arnaud Defrenne, Jo Hecker.

2 Introduction

ATHAPASCAN-1 is the C++ application programming interface of ATHAPASCAN. It is a library designed for the programming of parallel applications.

2.1 About ATHAPASCAN

ATHAPASCAN is built on a multi-layered structure:
1. Athapascan-0 is the communication layer, based upon MPI and POSIX threads (the extension independent from the transport library is called INUKTITUT);
2. ATHAPASCAN-1 is the user-end API;
3. ATHAPASCAN also contains visualization tools for debugging purposes.

ATHAPASCAN is a high-level interface in the sense that no reference is made to the execution support. The synchronization, communication and scheduling of operations
91. utManager vjPosInterface h gt virtual void contextClose include lt Input InputManager vjAnalogInterface h gt include lt Input InputManager vjDigitalInterface h gt name Drawing Loop Functions include lt Kernel vjUser h gt The drawing loop will look similar to this Pour la sauvegarde des images include lt tiffio h gt SC drawing D Pour 1 emploi des sockets preFrame include lt unistd h gt draw include lt sys types h gt intraFrame Drawing is happening while here 10 d YOON sync postFrame Drawing is now done Fonction appelee avant la mise a jour Calculs effectues ici UpdateTrackers virtual void postFrame T private Message Fonction appelee apres la mise a jour du tracker mais avant le debut du dessin NJSocket sock Calculs et modifications des etats faits ici int nx ny virtual void preFrame cell_state cells cell_state next_cells Fonction de dessin de la scene virtual void draw Fonction appelee apres que le dessin soit declenche mais AVANT qu il soit fini endif virtual void intraFrame des trackers mais apres que la frame soit dessinee NVOSVdVHLV el 74 Roch amp al Seat 79 ATHAPASCAN 75 11 Frequently Asked Questions This section contains a list of frequently asked questions about ATHAPASCAN and some attempts at answering Please feel free to send us any questions th
92. volume (maximum number of accesses to remote data). The execution time on a machine can be related to these costs. ATHAPASCAN has been developed in such a way that one does not have to worry about the specific machine architecture or the optimization of load balancing between processors. Therefore, it is much simpler and faster to use ATHAPASCAN to write parallel applications than it would be to use a lower-level library such as MPI.

2.2 Reading this Document

This document is a tutorial designed to teach how to use ATHAPASCAN's API. Its main goal is not to explain the way ATHAPASCAN is built. If you are new to ATHAPASCAN, it is recommended to read all of the remaining text. However, if the goal is to immediately begin writing programs with ATHAPASCAN, feel free to skip the next two chapters. They simply provide an overview of:
• how to install ATHAPASCAN's libraries, include files and scripts (Chapter ??);
• how to test the installation performed (Chapter ??);
• the API (Chapter ??).

The other sections delve deeper into ATHAPASCAN's API so that the user can benefit from all of its functionalities. They explain:
• the concepts of tasks and shared memory (Chapters ?? and ??, respectively);
• how to write the code of desired tasks (Chapter ??);
• how to make shared types communicable to other processors (Chapter ??);
• which type of access right to shared memory should be used (Section ??);
• how to
93. we don't send the pointers;
• a1::IStream: we receive first the number of values, then we insert the values in the chain using local pointers.

    #include <iostream.h>
    #include <stdlib.h>
    #include "athapascan-1.h"

    // class myList is an Athapascan-1 communicable class.
    // We use a chain structure to store the values.
    // NB: T has to be communicable too.
    template<class T>
    class myList {
    public:
        T value;
        myList* next;

        // empty constructor
        myList() : value(), next(0) {}
        // constructor
        myList(T v, myList* n) : value(v), next(n) {}
        // copy constructor
        myList(const myList<T>& d) {
            value = d.value;
            if (d.next != 0) next = new myList<T>(*d.next);
            else next = 0;
        }
        // destructor
        ~myList() { if (next != 0) delete next; }

        // return the size of the list (the head node is not counted)
        int size() {
            int s = 0;
            myList<T>* x = this;
            while (x->next != 0) { s++; x = x->next; }
            // x points to the last node of the list here: it must not be deleted
            return s;
        }

        // we push the data at the end of the list
        void push_back(T newval) {
            myList<T>* x = this;
            while (x->next != 0) x = x->next;
            x->next = new myList(newval, 0);
        }

        // we pop the first data from the list: remove it and return its value
        T pop_front() {
            if (!next) return 1;
            else {
                myList<T>* x = next;
                T ret = next->value;
                next = x->next;
                x->next = 0;
                delete x;
                return ret;
            }
        }
    };

    // packing operator
    template<class T>
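The packing scheme described at the top of this excerpt (no pointers on the wire, the number of values first, then the values, and a locally rebuilt chain on reception) can be sketched with standard streams. This is an illustration only: `std::ostream` and `std::istream` stand in for the library's stream classes, `std::list` stands in for myList, and `pack`, `unpack` and `round_trip` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <sstream>

// Packing: write the number of values, then the values. No pointer is sent,
// because an address is meaningless in another process.
template <class T>
std::ostream& pack(std::ostream& out, const std::list<T>& l) {
    out << l.size();
    for (const T& v : l) out << ' ' << v;
    return out;
}

// Unpacking: read the count first, then rebuild the chain with local nodes.
template <class T>
std::istream& unpack(std::istream& in, std::list<T>& l) {
    std::size_t n = 0;
    in >> n;
    l.clear();
    for (std::size_t i = 0; i < n; ++i) {
        T v{};
        in >> v;
        l.push_back(v);
    }
    return in;
}

// Round trip through an in-memory buffer that stands in for the network.
template <class T>
std::list<T> round_trip(const std::list<T>& l) {
    std::stringstream buffer;
    pack(buffer, l);
    std::list<T> copy;
    unpack(buffer, copy);
    assert(copy.size() == l.size());
    return copy;
}
```

Sending the count first lets the receiver know exactly how many values to read, so no end marker is needed and an empty list is handled by the same code path.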
94. wing operations are allowed on an object of type Shared:
• Declaration in the stack, as presented above.
• Declaration in the heap, using the operator new to create a new shared object. In the current implementation, the link between a task and a shared data version is made through the C++ constructor and destructor of the shared object. Each construction must therefore be matched by a destruction, otherwise deadlock may occur. In the case of allocation in the heap, the delete operator corresponding to the already executed new has to be performed.
• Assignment: a shared object can be assigned to another. This assignment is symbolic, having the same semantics as pointer assignment. The real shared object is then accessed through two distinct references.

7.2 Shared Access Rights

In order to respect the sequential consistency (lexicographic order semantics), ATHAPASCAN has to identify the value related to a shared object for each read performed. Parallelism detection is easily possible provided that every task specifies the shared data objects that it will access during its execution (on-the-fly detection of independent tasks) and which type of access it will perform on them (on-the-fly detection of a task's precedence). Therefore, an ATHAPASCAN task cannot perform side effects. All manipulated shared data must be declared in the prototype of the task. Moreover, to detect the synchronizations between tasks according to
95. x cannot be accessed by the task that creates it. It is only possible to Fork other tasks with x as an effective parameter. Example:

    a1::Shared<int> x(new int(3));   // x is initialized with the value 3
    double* v = new double(3.14);
    a1::Shared<double> sv(v);        // sv is initialized with the value v;
                                     // v cannot be used anymore in the program
                                     // and will be deleted by the system

• a1::Shared<T> x(0); The reference x is declared but not initialized. Thus the first task that will be forked with x as parameter will have to initialize it using a write statement (?? page ??). Otherwise, if a task receives this reference as a parameter and attempts to read a value from it, deadlock will occur. Example:

    a1::Shared<int> a(new int(0));   // a is a shared object initialized with the value 0
    a1::Shared<int> x(0);            // x is a NON-initialized shared object

• a1::Shared<T> x; The reference x is only declared as a reference, with no related value. x therefore has to be assigned to another shared object before forking a task with x as a parameter. Such an assignment is symbolic, having the same semantics as pointer assignment. Example:

    a1::Shared<int> x;               // x is just a reference, not initialized
    a1::Shared<int> a(new int(0));   // a is a shared object initialized with the value 0
    x = a;                           // x points to the same shared object as a

The follo
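The phrase "symbolic, having the same semantics as pointer assignment" can be illustrated with an analogy in standard C++. The sketch below is not Athapascan code: `std::shared_ptr<int>` merely stands in for a reference to shared data, the function name is hypothetical, and unlike a real Shared object the value is read and written directly, which the creating task is not allowed to do in ATHAPASCAN.

```cpp
#include <cassert>
#include <memory>

// Analogy only: after `x = a`, both names designate the same object, so no
// value is copied and an update made through one name is seen through the
// other.
bool symbolic_assignment_aliases() {
    std::shared_ptr<int> x;                             // declared, no related value
    std::shared_ptr<int> a = std::make_shared<int>(0);  // initialized with 0
    assert(!x);                                         // x designates nothing yet
    x = a;                                              // symbolic assignment
    *a = 42;                                            // update through a ...
    return x.get() == a.get() && *x == 42;              // ... is visible through x
}
```

The point of the analogy is that the assignment duplicates the reference rather than the data, which is why the manual speaks of one real shared object accessed through two distinct references.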
96. xmap _dpy _wroot r _w r h depth 7 copy _buff reg buff r clean XCopyArea _dpy buff _buff _gc 0 0 _reg _w _reg _h 2Q _reg _w 5 2 _reg _h 5 XCopyArea _dpy _buff _xwin _gc 0 0 _reg _w reg ht caption 0 0 XFreePixmap _dpy buff cont 0 break H H return cont int win mand zoom const regionk r 1 region new reg region 0 0 1 _reg _h new reg w int ceil double r _w new reg h r _h if new reg w gt reg ail new reg w reg Hi new reg h int ceil double r _h new reg w r _w H Pixmap buff XCreatePixmap dpy groot new reg w new reg h _depth copy _buff r buff new reg resize new reg XCopyArea _dpy buff _buff _gc 0 0 reg w reg h 0 0 XCopyArea _dpy _buff _xwin _gc 0 0 reg w reg h caption 0 0 XFreePixmap _dpy buff GG bk bi eae _new_zone old zone neng zone xi old zone xi old zone scale x r _x new zone yi old zone yi _old_zone scale_y r _y neng zone xf old zone xi _old_zone scale_x r _x r _w 1 new zone yf old zone yi _old_zone scale_y r _y r _h 1 new zone Hr reg Hi new zone br reg hi return 0 H static int min int a int b return a lt b a b H int win mand x resize int width int height 1 region old reg reg Pixmap buff XCreatePixmap _dpy _wroot old reg w old reg h depth XCopyArea _dpy _buff
97. y of the node (Unix process) that executes the task. Objects are allocated or deallocated in or from the heap directly using C/C++ primitives: malloc, free, new, delete... Therefore, all tasks executed on a given node share one common heap (2); consequently, if a task does not properly deallocate the objects located in its heap, then some future tasks may run short of memory on this node.

(1) Caution: this is verified neither at compile time nor at execution time. The user has to take care not to include any reference to the shared memory in classical types.
(2) In the current implementation, the execution of a task on a node is supported by a set of threads that share the heap of the same heavy process representing the node.

• The shared memory, accessed concurrently by different tasks. The shared memory in ATHAPASCAN is a non-overlapping collection of physical objects (of communicable types) managed by the system.

6 Communicable Type

Using a distributed architecture means handling data located in shared memory (mapping, migration, consistency...). In order to make ATHAPASCAN able to do this, the data must be serialized. This serialization has to be explicitly done by the programmer to suit the specific needs of the program.

NB: All the classes and types used as task parameters must be made communicable.

6.1 Predefined Communicable Types

The following types are communicable:
• The following C++ basic typ
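What "explicitly done by the programmer" amounts to can be shown on a small user type. The following sketch uses standard streams in place of the library's own stream classes; the `Point` type and the `send_and_receive` helper are assumptions made for this illustration and do not come from the manual.

```cpp
#include <cassert>
#include <sstream>

// Hypothetical user type that should travel between nodes.
struct Point {
    double x;
    double y;
};

// Packing operator: writes every field, in a fixed order.
std::ostream& operator<<(std::ostream& out, const Point& p) {
    return out << p.x << ' ' << p.y;
}

// Unpacking operator: reads the fields back in the same order.
std::istream& operator>>(std::istream& in, Point& p) {
    return in >> p.x >> p.y;
}

// Simulates a transfer: the value is packed into a buffer and rebuilt from it.
Point send_and_receive(const Point& p) {
    std::stringstream buffer;
    buffer << p;
    Point received{0.0, 0.0};
    buffer >> received;
    assert(!buffer.fail());  // both fields were read successfully
    return received;
}
```

The two operators must agree on the order and number of fields. A text stream with default formatting also rounds doubles to a few significant digits, so a real transfer would need a binary or full-precision encoding.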
