Home

User's Manual - Computation - Lawrence Livermore National

image

Contents

1. set vector values HYPRE IJVectorSetValues ij vector nvalues indices values HYPRE IJVectorAssemble ij vector HYPRE IJVectorGet0bject ij vector void amp par vector The Create routine creates an empty vector object that lives on the comm communicator This is a collective call with each process passing its own index extents jlower and jupper The names 28 CHAPTER 6 LINEAR ALGEBRAIC SYSTEM INTERFACE IJ of these extent parameters begin with a j because we typically think of matrix vector multiplies as the fundamental operation involving both matrices and vectors For matrix vector multiplies the vector partitioning should match the column partitioning of the matrix which also uses the j notation For linear system solves these extents will typically match the row partitioning of the matrix as well The SetObjectType routine sets the underlying vector storage type to HYPRE_PARCSR this is the only storage type currently supported The Initialize routine indicates that the vector coefficients or values are ready to be set This routine may or may not involve the allocation of memory for the coefficient data depending on the implementation The SetValues routine sets the vector values for some number nvalues of indices Each process should set only those vector values that it owns in the data distribution The Assemble routine is a trivial collective call and finalizes the vector assembly makin
2. Smaller values of tol lead to more accurate preconditioners but can also lead to increases in the time to calculate the preconditioner 42 CHAPTER 7 SOLVERS AND PRECONDITIONERS Chapter 8 Additional Information 8 1 Building the Library Usually hypre can be built by simply typing configure followed by make in the top level source directory 8 1 1 Library configuration To automatically generate machine specific makefiles type configure in the top level directory The configure script is a portable script generated by GNU Autoconf It runs a series of tests to determine characteristics of the machine on which it is running and it uses the results of the these tests to produce the machine specific makefiles called Makefile from template files called Makefile in in each directory Once the makefiles are produced you can run make as you would with any other makefile The configure script primarily does the following things e selects a compiler e provides either optimization or debugging options for the compiler e finds the headers and libraries for MPI The configure script makes these decisions based on a hierarchical check First it attempts to identify the machine on which it is running as a specific supported machine Next it will try to identify the architecture as a supported architecture If both of these fail generic default decisions are made by the script However the script does have some command line o
3. you can then pass options on the command line per the following example mpirun np 2 phoo level 3 Since Euclid looks for the database file when HYPRE_EuclidCreate is called and parses the command line when HYPRE EuclidSetParams is called option values passed on the command line 7 5 EUCLID 39 will override any similar settings that may be contained in the database file Also if same option name appears more than once on the command line the final appearance determines the setting Some options such as bj see next subsection are boolean Euclid always treats these options as the value 1 true or 0 false When passing boolean options from the command line the value may be committed in which case it assumed to be 1 Note however that when boolean options are contained in a configuration file either the 1 or 0 must stated explicitly Method 3 Individual options can be hardcoded via calls such as the following HYPRE_EuclidSetParams solver level 3 Method 4 There are two ways in which you can read in options from a file whose name is other than database First you can call HYPRE EuclidSetParamsFromFile to specify a configuration filename Second if you have passed the command line arguments as described above in Method 2 you can then specify the configuration filename on the command line using the db_filename filename option e g mpirun np 2 phoo db_filen
4. 130720 E Chow A priori sparsity patterns for parallel sparse approximate inverse preconditioners SIAM J Sci Comput 21 1804 1822 2000 R L Clay et al An annotated reference guide to the Finite Element Interface FEI specifica tion Version 1 0 Technical Report SAND99 8229 Sandia National Laboratories Livermore CA 1999 R D Falgout and J E Jones Multigrid on massively parallel architectures In E Dick K Riemslagh and J Vierendeels editors Multigrid Methods VI volume 14 of Lecture Notes in Computational Science and Engineering pages 101 107 Berlin 2000 Springer Proc of the Sixth European Multigrid Conference held in Gent Belgium September 27 30 1999 Also available as LLNL technical report UCRL JC 133948 V E Henson and U M Yang BoomerAMG a parallel algebraic multigrid solver and pre conditioner Applied Numerical Mathematics to appear Also available as LLNL technical report UCRL JC 141495 D Hysom and A Pothen Efficient parallel computation of ILU k preconditioners SC99 ACM November 1999 published on CDROM ISBN 1 58113 091 0 ACM Order 415990 IEEE Computer Society Press Order RS00197 D Hysom and A Pothen A scalable parallel algorithm for incomplete factor preconditioning SIAM J Sci Comput 22 6 2194 2215 2001 G Karpis and V Kumar Parallel threshold based ILU factorization Technical Report 061 University of Minnesota Department of Computer Science Army HPC Resear
5. 18 i values i 0 0 HYPRE_StructVectorSetBoxValues b ilower 0 iupper 0 values 14 CHAPTER 3 STRUCTURED GRID SYSTEM INTERFACE STRUCT HYPRE_StructVectorSetBoxValues b ilower 1 iupper 1 values HYPRE_StructVectorAssemble b The Create routine creates an empty vector object The Initialize routine indicates that the vector coefficients or values are ready to be set This routine follows the same rules as its corresponding Matrix routine The SetBoxValues routine sets the vector coefficients over the gridpoints in some box and again follows the same rules as its corresponding Matrix routine The Assemble routine is a collective call and finalizes the vector assembly making the vector ready to use 3 5 Symmetric Matrices Some solvers and matrix storage schemes provide capabilities for significantly reducing memory usage when the coefficient matrix is symmetric In this situation each off diagonal coefficient appears twice in the matrix but only one copy needs to be stored The Struct interface provides support for matrix and solver implementations that use symmetric storage via the SetSymmetric routine To describe this in more detail consider again the 5 pt finite volume discretization of 3 1 on the grid pictured in Figure 3 1 Because the discretization is symmetric only half of the off diagonal coefficients need to be stored To turn symmetric storage on the following line of code n
6. 3 2 3 3 Matrix values associated with the non stencil graph entries are also set up HYPRE_SStructMatrix A double values 32 int sindices 2 0 3 int gindices 4 9 10 11 0 int 1 HYPRE_SStructMatrixCreate MPI_COMM_WORLD graph amp A HYPRE_SStructMatrixInitialize A for i 0 i lt 32 i 2 values i values i 1 00 o 1 0 HYPRE_SStructMatrixSetBoxValues A 0 ilower 0 iupper 0 0 2 sindices values set values at non stencil graph entries for i 0 i lt 3 i values i 1 0 HYPRE_SStructMatrixSetValues A 0 addindex 0 0 2 gindices values HYPRE_SStructMatrixSetValues A 0 addindex 1 0 3 gindices values HYPRE_SStructMatrixSetValues A 0 addindex 2 0 w gindices values HYPRE_SStructMatrixSetValues A 0 addindex 7 HYPRE_SStructMatrixSetValues A 0 addindex 7 1 o o ta gindices values amp gindices 3 values p zero out coefficients that reach outside of the domain or grid part set boundary conditions at domain boundaries HYPRE_SStructMatrixAssemble A 20 CHAPTER 4 SEMI STRUCTURED GRID SYSTEM INTERFACE SSTRUCT 4 5 Setting Up the SStruct Right Hand Side Vector The right hand side vector is set up similarly to the matrix set up described in Section 4 4 above The main difference is that there is no graph Chapter 5 Finite Element Interface FEI The finite ele
7. below The following example code illustrates the basic usage of the IJ interface for building matrices MPI_Comm comm HYPRE_I JMatrix ij_matrix HYPRE_ParCSRMatrix parcsr_matrix int ilower iupper int jlower jupper int nrows 25 26 CHAPTER 6 LINEAR ALGEBRAIC SYSTEM INTERFACE IJ int ncols int xrOWS int cols double values HYPRE_IJMatrixCreate comm ilower iupper jlower jupper amp ij_matrix HYPRE_IJMatrixSetO0bjectType ij_matrix HYPRE_PARCSR HYPRE_I JMatrixInitialize ij_matrix set matrix coefficients HYPRE_I JMatrixSetValues ij_matrix nrows ncols rows cols values add to matrix cofficients if desired HYPRE_I JMatrixAddToValues ij_matrix nrows ncols rows cols values HYPRE_I JMatrixAssemble ij_matrix HYPRE IJMatrixGet0bject ij matrix void amp parcsr matrix The Create routine creates an empty matrix object that lives on the comm communicator This is a collective call i e must be called on all processes from a common synchronization point with each process passing its own row extents ilower and iupper The row partitioning must be contiguous i e iupper for process i must equal ilower 1 for process i 1 Note that this allows matrices to have 0 or 1 based indexing The parameters jlower and jupper define a column partitioning and should match ilower and iupper when solving square linear systems See the Reference Manual for more information The Se
8. done by Karypis and Kumar 9 in terms of the Cray SHMEM library The code was subsequently modified by the hypre team SHMEM was replaced by MPI some algorithmic changes were made and it was software engineered to be interoperable with several matrix implementations including hypre s ParCSR format PETSc s matrices and ISIS RowMatrix The algorithm produces an approximate factorization LU with the preconditioner M defined by M LU Note PILUT produces a nonsymmetric preconditioner even when the original matrix is sym metric Thus it is generally inappropriate for preconditioning symmetric methods such as Conju gate Gradient Parameters e SetMaxNonzerosPerRow int LFIL Default 20 Set the maximum number of nonze ros to be retained in each row of L and U This parameter can be used to control the amount of memory that L and U occupy Generally the larger the value of LFIL the longer it takes to calculate the preconditioner and to apply the preconditioner and the larger the storage re quirements but this trades off versus a higher quality preconditioner that reduces the number of iterations e SetDropTolerance double tol Default 0 0001 Set the tolerance relative to the 2 norm of the row below which entries in L and U are automatically dropped PILUT first drops entries based on the drop tolerance and then retains the largest LFIL elements in each 7 6 PILUT PARALLEL INCOMPLETE FACTORIZATION 41 row that remain
9. ee a a a SR ee ide a a N 6 2 TI Vector Interiace t ers ia 2 bee fone Sab E Re we Re ee oe da ed 10 11 11 13 14 15 16 17 18 18 20 21 22 22 22 ii CONTENTS 7 Solvers and Preconditioners 29 ed SMG us raa E Bo E AA AAA O AA A Ee A 30 12 DEM ve euit da ee onl E E E wee ad SL O E 31 3 BoomerA MG i a eds eg are a ee 31 La SYNOPSIS e toro a Ae A te A e a e oe a 31 1 3 2 Interface functions s i t me ato a id E e Be ER 32 14 ParaSalls carrer a a De ta a a Y e ee aaa 34 GAA Sy DOPSIS Lie Ed wl Mae doe o hee A ee dnd 34 4 2 Interface functions amu ab os 2080 bea a ee eas 35 7 4 3 Preconditioning nearly symmetric matrices 36 Lo BC ts Mead ict Ae bra o a A IA ce A ae de E la 36 Goel HSYDOPSISe Ack eye kes ew ee ede che Monee MR hits dn ido Sewers 37 7 5 2 Setting options examples e 38 TDZ Optiolis summary ia a dd eee O 0 39 7 6 PILUT Parallel Incomplete Factorization 0 2 00 0 005 40 8 Additional Information 43 Sl Building the Library a gt A it ed 43 8 1 1 Library configuration ee ee 43 8 1 2 Linking to the Library e 44 8 2 Calling from Fortran tea a ese ele caio dhs He vd de ec ae a et 44 8 3 Big Reporting 4 isa Godin E a ee A A RE ec a ae We SD 45 Chapter 1 Introduction hypre is a software library for solving large sparse linear systems of equations on massively parallel computers The library was created
10. following coarsening techniques are available e the Cleary Luby Jones Plassman CLJP coarsening e various variants of the classical Ruge Stueben RS coarsening algorithm and e the Falgout coarsening which is a combination of CLJP and the classical RS coarsening algorithm The following relaxation techniques are available e weighted Jacobi relaxation e a hybrid Gauss Seidel Jacobi relaxation scheme e a symmetric hybrid Gauss Seidel Jacobi relaxation scheme and e sequential Gauss Seidel relaxation 7 3 1 Synopsis The solver is set up and run using the following routines where A is the matrix b the right hand side and x the solution vector of the linear system to be solved include HYPRE_parcsr_1s h int HYPRE_BoomerAMGCreate HYPRE_Solver solver lt set certain parameters if desired gt int HYPRE BoomerAMGSetup HYPRE Solver solver HYPRE ParCSRMatrix A HYPRE ParVector b HYPRE ParVector x int HYPRE BoomerAMGSolve HYPRE Solver solver HYPRE_ParCSRMatrix A HYPRE ParVector b HYPRE ParVector x int HYPRE_BoomerAMGDestroy HYPRE_Solver solver 32 CHAPTER 7 SOLVERS AND PRECONDITIONERS 7 3 2 Interface functions Parameters for setting up the code are specified using the following routines HYPRE_BoomerAMGSetMaxLevels int HYPRE_BoomerAMGSetMaxLevels HYPRE_Solver solver int max levels max_levels defines the maximal number of multigrid levels allowed The default is 25 HYPRE_BoomerAMGSetMaxlI
11. integer larger than 0 specifying the desired sparsity of the approx imate inverse The default value is 1 a floating point number between 0 and 1 defining the threshold used to prune small entries in A The default is 0 0 a floating point number between 0 and 1 specifying how load balanc ing has to be done The default is 0 0 set ParaSails to take A as symmetric set ParaSails to take A as nonsymmetric default set ParaSails to reuse the sparsity pattern of A superlu The Table 5 1 Parameters 23 24 CHAPTER 5 FINITE ELEMENT INTERFACE FEI Chapter 6 Linear Algebraic System Interface IJ The IJ interface described in this chapter is the lowest common denominator for specifying linear systems in hypre This interface provides access to general sparse matrix solvers in hypre not to the specialized solvers that require more problem information 6 1 IJ Matrix Interface As with the other interfaces in hypre the IJ interface expects to get data in distributed form because this is the only scalable approach for assembling matrices on thousands of processes Matrices are assumed to be distributed by blocks of rows as follows Ao A 7 6 1 Ap_1 In the above example the matrix is distributed accross the P processes 0 1 P 1 by blocks of rows Each submatrix A is owned by a single process and its first and last row numbers are given by the global indices ilower and iupper in the Create call
12. integers may have any value negative or positive The global indexes allow hypre to discern how data is related spatially and how it is distributed across the parallel machine Each process describes that portion of the grid that it owns one box at a time For example in the figure the global grid can be described in terms of three boxes two owned by process 0 and one owned by process 1 A box is described in terms of a lower and upper index On process 0 the following code will set up the grid shown in the figure the code for process 1 is similar HYPRE_StructGrid grid int ilower 2 2 int iupper 2 2 3 1 0 1 1 2 12 433 HYPRE StructGridCreate MPI COMM WORLD 2 amp grid HYPRE StructGridSetExtents grid ilower 0 iupper 0 HYPRE StructGridSetExtents grid ilower 1 iupperl1 HYPRE StructGridAssemble grid The Create routine creates an empty 2D grid object that lives on the MPT COMM WORLD com municator The SetExtents routine adds a new box to the grid The Assemble routine is a collective call i e must be called on all processes from a common synchronization point and finalizes the grid assembly making the grid ready to use 3 2 SETTING UP THE STRUCT STENCIL 11 3 2 Setting Up the Struct Stencil The geometry of the discretization stencil is described by an array of integer tuples in 2D triples in 3D each representing a relative offset in index space from s
13. of the nonzero elements that are dropped For example if filter 0 9 then approximately 90 percent of the entries in the computed approximate inverse are dropped 7 4 3 Preconditioning nearly symmetric matrices A nonsymmetric but definite and nearly symmetric matrix A may be preconditioned with a sym metric preconditioner M Using a symmetric preconditioner has a few advantages such as guar anteeing positive definiteness of the preconditioner as well as being less expensive to construct The nonsymmetric matrix A must be definite i e A 47 2 is SPD and the a priori sparsity pattern to be used must be symmetric The latter may be guaranteed by 1 constructing the sparsity pattern with a symmetric matrix or 2 if the matrix is structurally symmetric has symmetric pattern then thresholding to construct the pattern is not used i e zero value of the thresh parameter is used 7 5 Euclid The Euclid library is a scalable implementation of the Parallel ILU algorithm that was presented at SC99 7 and published in expanded form in the SIAM Journal on Scientific Computing 8 By scalable we mean that the factorization setup and application triangular solve timings remain nearly constant when the global problem size is scaled in proportion to the number of processors As with all ILU preconditioning methods the number of iterations is expected to increase with global problem size Experimental results have shown that PILU preco
14. to the settings that in our experience will likely prove effective for minimizing execution time include HYPRE_parcsr_1s h HYPRE_Solver eu HYPRE_Solver pcg_solver HYPRE_ParVector b x HYPRE_ParCSRMatrix A Instantiate the preconditioner HYPRE_EuclidCreate comm amp eu Optionally use the following three calls to set runtime options 1 pass options from command line or string array HYPRE_EuclidSetParams eu argc argv 2 pass individual options from within your code HYPRE_EuclidSetParam eu level 3 3 pass options from a configuration file HYPRE_EuclidSetParamsFromFile eu filename Set Euclid as the preconditioning method for some other solver using the function calls HYPRE_EuclidSetup and HYPRE_EuclidSolve We assume that the pcg_solver has been properly initialized HYPRE_PCGSetPrecond pcg_solver HYPRE_PtrToSolverFcn HYPRE_EuclidSolve HYPRE_PtrToSolverFcn HYPRE_EuclidSetup 38 CHAPTER 7 SOLVERS AND PRECONDITIONERS eu Solve the system by calling the Setup and Solve methods for in this case the HYPRE_PCG solver We assume that A b and x have been properly initialized HYPRE_PCGSetup pcg_solver HYPRE_Matrix A HYPRE_Vector b HYPRE Vector x HYPRE_PCGSolve pcg_solver HYPRE_Matrix parcsr_A HYPRE_Vector b HYPRE Vector x Destroy the Euclid preconditioning object HYPRE EuclidDestroy eu 7 5 2 Setting options examples For exposition
15. J described in Chapters 5 and 6 Figure 4 1 gives an example of the type of grid currently supported by the SStruct interface The grid is composed of three parts There is a single cell centered variable across the entire grid except at the grid cell marked with a square which has one additional cell centered variable There are five basic steps involved in setting up the linear system to be solved 1 set up the grid 2 set up the stencils 3 set up the graph 4 set up the matrix 5 set up the right hand side vector To describe each of these steps in more detail consider solving the 2D Laplacian problem Qin i i V u f in the domain 4 1 u 0 on the boundary Assume 3 1 is discretized using standard 9 pt finite volumes on the grid pictured in 4 1 and assume that the problem data is distributed across three processes as depicted In the figure each grid part is distributed onto a different process but this need not be the case each grid part may also be distributed across several processes Assume for simplicity and illustration that there is a 15 16 CHAPTER 4 SEMI STRUCTURED GRID SYSTEM INTERFACE SSTRUCT Part 1 Process 1 9 9 Part 2 Process 2 1 0 i I Part O 1 1 0 Process 0 Figure 4 1 An example 2D block structured grid distributed accross three processes single coupling between the first and second variables in grid part 0 at index 1 2 In general this variable may a
16. User s Manual iasesss DDVZVCVD a aa Software Version 1 6 0 Date 2001 07 27 Center for Applied Scientific Computing Lawrence Livermore National Laboratory Copyright c 1998 The Regents of the University of California Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies This work was produced at the University of California Lawrence Livermore National Laboratory UC LLNL under contract no W 7405 ENG 48 Contract 48 between the U S Department of Energy DOE and The Regents of the University of California University for the operation of UC LLNL The rights of the Federal Government are reserved under Contract 48 subject to the restrictions agreed upon by the DOE and University as allowed under DOE Acquisition Letter 97 1 This work was prepared as an account of work sponsored by an agency of the United States Government Neither the United States Government nor the University of California nor any of their employees makes any warranty express or implied or assumes any liability or responsibility for the accuracy completeness or usefulness of any information apparatus product or process disclosed or represents that its use would not infringe privately owned rights Reference herein to any specific commercial products process or service by trade name trademark manufacturer or otherwise does not necessarily cons
17. aSails solver solver is returned with int HYPRE_ParaSailsCreate MPI_Comm comm HYPRE_Solver solver int symmetry where comm is the MPI communicator The value of symmetry has the following meanings to indicate the symmetry and definiteness of the problem and to specify the type of preconditioner to construct value meaning nonsymmetric and or indefinite problem and nonsymmetric preconditioner SPD problem and SPD factored preconditioner nonsymmetric definite problem and SPD factored preconditioner For more information about the final case see section 7 4 3 Parameters for setting up the preconditioner are specified using int HYPRE_ParaSailsSetParams HYPRE_Solver solver double thresh int nlevel double filter 36 CHAPTER 7 SOLVERS AND PRECONDITIONERS The parameters are used to specify the sparsity pattern and filtering value see above and are described with suggested values as follows default m nlevel F thresh real thresh gt 0 0 0 1 0 01 0 1 thresh thresh filter real filter gt 0 0 0 05 0 001 0 05 filter value filter ee eer co oao mer ate seta atomica When thresh lt 0 then a threshold is selected such that thresh represents the fraction of the nonzero elements that are dropped For example if thresh 0 9 then A will contain approximately ten percent of the nonzeros in A When filter lt 0 then a filter value is selected such that filter represents the fraction
18. al purposes assume you wish to set the ILU k factorization level to the value k 3 There are several methods of accomplishing this Internal to Euclid options are stored in a simple database that contains name value pairs Various of Euclid s internal private functions query this database to determine at runtime what action the user has requested If you enter the option eu_stats 1 a report will be printed when Euclid s destructor is called this report lists among other statistics the options that were in effect during the factorization phase Method 1 By default Euclid always looks for a file titled database in the working directory If it finds such a file it opens it and attempts to parse it as a configuration file Configuration files should be formatted as follows gt cat database this is an optional comment level 3 Any line in a configuration file that contains a character in the first column is ignored All other lines should begin with an option name followed by one or more blanks followed by the character If you include an option option value Note that option names always begin with a name that is not recognized by Euclid no harm should ensue Method 2 To pass options on the command line call HYPRE_EuclidSetParams HYPRE_Solver solver int argc char argv where argc and argv carry the usual connotation main int argc char argv If your hypre application is called phoo
19. ame myConfigFile 7 5 3 Options summary level int Factorization level for ILU k Default 1 Guidance for 2D convection diffusion and similar problems fastest solution time is typically obtained with levels 4 through 8 For 3D problems fastest solution time is typically obtained with level 1 bj Use Block Jacobi ILU preconditioning instead of PILU Default 0 false Guidance if sub domains contain relatively few nodes less than 1 000 or the problem is not well partitioned Block Jacobi ILU may give faster solution time than PILU eu_stats When Euclid s destructor is called a summary of runtime settings and timing information is printed to stdout Default 0 false The timing marks in the report are the maximum over all processors in the MPI communicator eu_mem When Euclid s destructor is called a summary of Euclid s memory usage is printed to std out Default 0 false The statistics are for the processor whose rank in MPICOMM WORLD is 0 printTestData This option is used in our autotest procedures and should not normally be invoked by users The following options are partially implemented but not yet fully functional i e don t use them until further notice sparseA float Drop tolerance for ILU k factorization Default 0 no dropping Entries are treated as zero if their absolute value is less than sparseA max where max is the largest absolute value of any entry in the row Guidance tr
20. and 4 2 above On process 0 the following code will set up the graph for the example problem HYPRE_SStructGraph graph int addindex 9 2 0 2 1 2 2 2 3 3 0 3 1 3 2 3 2th 2 1 1 1 0 1 1 1 2 1 3 9 1 int addentries 9 2 9 9 10 9 11 9 12 9 1 2 HYPRE_SStructGraphCreate MPI_COMM_WORLD grid amp graph HYPRE_SStructGraphSetStencil graph 0 0 stencil Add graph edge entries at grid part boundaries HYPRE_SStructGraphAddEntries graph 0 addindex 0 0 2 addentries 0 0 HYPRE_SStructGraphAddEntries graph O addindex 0 O 2 addentries 1 0 HYPRE_SStructGraphAddEntries graph 0 addindex 1 0 2 addentries 0 0 Add graph edge entries at index 1 2 HYPRE_SStructGraphAddEntries graph O addindex 8 O O addentries 8 1 HYPRE_SStructGraphAddEntries graph 0 addindex 8 1 addentries 8 0 o 4 4 Setting Up the SStruct Matrix The matrix is set up in terms of the graph object described in Section 4 3 above The coefficients associated with each graph entry will typically vary from gridpoint to gridpoint but in the example problem being considered they are as follows over the entire grid except at boundaries see below A 5 la 4 3 4 4 SETTING UP THE SSTRUCT MATRIX 19 On process 0 the following code will set up matrix values associated with the center Sp and south S3 stencil entries in
21. ar structured grid applications users should use the Struct interface described in this chapter This interface will also provide access this is not yet supported to solvers in hypre that were designed for unstructured grid applications and sparse linear systems in general These additional solvers are usually provided via the unstructured grid interface FEI or the linear algebraic interface IJ described in Chapters 5 and 6 Figure 3 1 gives an example of the type of grid currently supported by the Struct interface The interface uses a finite difference or finite volume style and currently supports only scalar PDEs i e one unknown per gridpoint There are four basic steps involved in setting up the linear system to be solved E set up the grid N set up the stencil ow set up the matrix 4 set up the right hand side vector To describe each of these steps in more detail consider solving the 2D Laplacian problem Qi i l Vu f in the domain 3 1 u 0 on the boundary Assume 3 1 is discretized using standard 5 pt finite volumes on the uniform grid pictured in 3 1 and assume that the problem data is distributed across two processes as depicted 10 CHAPTER 3 STRUCTURED GRID SYSTEM INTERFACE STRUCT Figure 3 1 An example 2D structured grid distributed accross two processors 3 1 Setting Up the Struct Grid The grid is described via a global index space i e via integer tuples triples in 3D The
22. b HYPRE_ParVector x int HYPRE_ParaSailsSolve HYPRE_Solver solver HYPRE_ParCSRMatrix A HYPRE_ParVector b HYPRE_ParVector x int HYPRE_ParaSailsStats HYPRE_Solver solver int HYPRE_ParaSailsDestroy HYPRE_Solver solver The accuracy and cost of ParaSails are parameterized by the real thresh and integer nlevels parameters 0 lt thresh lt 1 0 lt nlevels Lower values of thresh and higher values of nlevels lead to more accurate but more expensive preconditioners More accurate preconditioners are also more expensive per iteration The default values are thresh 0 1 and nlevels 1 The parameters are set using HYPRE ParaSailsSetParams Mathematically given a symmetric matrix A the pattern of the approximate inverse is the pattern of A where A is a matrix that has been sparsified from A The sparsification is performed by dropping all entries in a symmetrically diagonally scaled A whose values are less than thresh in magnitude The parameter nlevel is equivalent to m gt 1 Filtering is a post thresholding procedure For more details about the algorithm see 3 The storage required for the ParaSails preconditioner depends on the parameters thresh and nlevels The default parameters often produce a preconditioner that can be stored in less than the space required to store the original matrix ParaSails does not need a large amount of intermediate storage in order to construct the preconditioner 7 4 2 Interface functions A Par
23. by passing the constructed matrix into the solver As described before the 29 30 CHAPTER 7 SOLVERS AND PRECONDITIONERS matrix and solver must be compatible in that the matrix must provide the services needed by the solver Krylov solvers for example need only a matrix vector multiplication Most preconditioners on the other hand have additional requirements such as access to the matrix coefficients 2 Choose parameters for the preconditioner and or solver Parameters are chosen through the Set calls provided by the solver As is true throughout hypre all parameters have reasonable defaults if not chosen Note that in hypre convergence criteria can be chosen after the preconditioner solver has been setup 3 Pass the preconditioner to the solver For solvers that are not preconditioned this step is omitted The preconditioner is passed through the SetPreconditioner call 4 Set up the solver This is just the Setup O routine At this point the solver preconditioner is fully constructed and ready for use Use 1 Set convergence criteria Convergence can be controlled by the number of iterations as well as various tolerances such as relative residual preconditioned residual etc Like all parameters reasonable defaults are used Users are free to change these though care must be taken For example if an iterative method is used as a preconditioner for a Krylov method a constant number of iterations is usually require
24. c Gauss Seidel Jacobi hybrid method Gaussian elimination only for the coarsest level k 3 not recommended if the system on the coarsest level is large womwr Oo HYPRE_BoomerAMGSetGridRelaxPoints int HYPRE_BoomerAMGSetGridRelaxPoints HYPRE_Solver solver int grid_relax_points grid_relax_points i j defines which points are to be relaxed during the j 1 th sweep on the fine grid i 0 the down cycle i 1 the up cycle i 2 and the coarse grid i 3 e g if 34 CHAPTER 7 SOLVERS AND PRECONDITIONERS grid_relax_points 1 0 is 1 all points marked 1 which are in general fine points are relaxed on the first sweep of the down cycle Note grid relax points 3 needs to be 0 always since the concept of coarse and fine points does not exist on the coarsest grid If the user sets it to another value it will be automatically set to 0 and a warning printed unless the direct solver is used HYPRE_BoomerAMGSetRelax Weight int HYPRE_BoomerAMGSetRelaxWeight HYPRE_Solver solver double relax_weight defines the relaxation weights used on each level if weighted Jacobi is used as relaxation method The default relaxation weight is 1 0 on each level HYPRE_BoomerAMGSetIOutDat int HYPRE_BoomerAMGSetI0utDat HYPRE Solver solver int ioutdat where ioutdat determines whether statistics information is generated and printed The information is printed to standard output The following options are possible 0 no output de
25. can preset the row sizes of the matrix in order to reduce the execution time for the matrix specification One can specify the total number of coefficients for each row the number of coefficients in the row that couple the diagonal unknown to Diag unknowns in the same processor domain and the number of coefficients in the row that couple the diagonal unknown to Offd unknowns in other processor domains HYPRE_IJMatrixSetRowSizes ij_matrix sizes HYPRE IJMatrixSetDiag0ffdSizes matrix diag sizes offdiag_sizes Once the matrix has been assembled the sparsity pattern cannot be altered without completely destroying the matrix object and starting from scratch However one can modify the matrix values of an already assembled matrix To do this first call the Initialize routine to re initialize the matrix then set or add to values as before and call the Assemble routine to re assemble before using the matrix Re initialization and re assembly are very cheap essentially a no op in the current implementation of the code 6 2 IJ Vector Interface The following example code illustrates the basic usage of the IJ interface for building vectors MPI_Comm comm HYPRE IJVector ij vector HYPRE ParVector par vector int jlower jupper int nvalues int indices double values HYPRE_IJVectorCreate comm jlower jupper amp ij vector HYPRE IJVectorSet0bjectType ij vector HYPRE_PARCSR HYPRE IJVectorInitialize ij vector
26. ch Center Min neapolis MN 5455 1998 47 48 BIBLIOGRAPHY 10 S Schaffer A semi coarsening multigrid method for elliptic partial differential equations with highly discontinuous and anisotropic coefficients SIAM J Sci Comput 20 1 228 242 1998
27. coefficients that reach outside of the boundary should be set to zero For efficiency reasons hypre does not do this automatically The most natural time to insure this is when the boundary conditions are being set and this is most naturally done after the coefficients on the grid s interior have been set For example during the implementation of the Dirichlet boundary condition on the lower boundary of the grid in Figure 3 1 the south coefficient must be set to zero To do this on process 0 the following code could be used 3 4 SETTING UP THE STRUCT RIGHT HAND SIDE VECTOR 13 3 1 2 1 int ilower 2 int iupper 2 create matrix and set interior coefficients implement boundary conditions for i 0 i lt 12 i values i 0 0 i 3 HYPRE_StructMatrixSetBoxValues A ilower iupper 1 amp i values complete implementation of boundary conditions 3 4 Setting Up the Struct Right Hand Side Vector The right hand side vector is set up similarly to the matrix set up described in Section 3 3 above The main difference is that there is no stencil note that a stencil currently does appear in the interface but this will eventually be removed On process 0 the following code will set up the right hand side vector values HYPRE_StructVector b double values 18 int i HYPRE StructVectorCreate MPI COMM WORLD grid amp b HYPRE StructVectorInitialize b for i 0 i lt
28. ctorDestroy x 6 CHAPTER 2 GETTING STARTED Linear System Interfaces ee YE ms Linear Solvers parers Pe Data Layout Figure 2 1 Graphic illustrating the notion of conceptual interfaces All of these elements are not necessarily in hypre structured composite 2 2 What are conceptual interfaces The top row of Figure 2 1 illustrates a number of conceptual interfaces Generally the conceptual interfaces are denoted by different types of computational grids but other application features might also be used such as geometrical information These conceptual interfaces are intended to represent the way that applications developers naturally think of their linear problem and provide natural interfaces for them to pass the data that defines their linear system into hypre Essentially these conceptual interfaces can be considered convenient utilities for helping a user build a matrix data structure for hypre solvers and preconditioners For example applications that use structured grids such as in the left most interface in the Figure 2 1 typically view their linear problems in terms of stencils and grids On the other hand applications that use unstructured grids and finite elements typically view their linear problems in terms of elements and element stiffness matrices Finally the right most interface is the standard linear algebraic matrix rows columns way of viewing the linear problem T
29. d 2 Solve the system This is just the Solve routine 7 1 SMG SMG is a parallel semicoarsening multigrid solver for the linear systems arising from finite difference finite volume or finite element discretizations of the diffusion equation V DVu 0u f 7 1 on logically rectangular grids The code solves both 2D and 3D problems with discretization stencils of up to 9 point in 2D and up to 27 point in 3D See 10 2 5 for details on the algorithm and its parallel implementation performance SMG is a particularly robust method The algorithm semicoarsens in the z direction and uses plane smoothing The xy plane solves are effected by one V cycle of the 2D SMG algorithm which semicoarsens in the y direction and uses line smoothing 7 2 PFMG 31 7 2 PFMG PFMG is a parallel semicoarsening multigrid solver similar to SMG See 1 5 for details on the algorithm and its parallel implementation performance The main difference between the two methods is in the smoother PFMG uses simple pointwise smoothing As a result PFMG is not as robust as SMG but is much more efficient per V cycle 7 3 BoomerAMG BoomerAMG is a parallel implementation of algebraic multigrid It can be used both as a solver or as a preconditioner The user can choose between various different parallel coarsening tech niques and relaxation schemes See 6 for a detailed description of the coarsening algorithms the interpolation and nueemerical results The
30. e This example and all other examples in this manual are written in C but hypre also supports Fortran See Section 8 2 for details HYPRE_StructGridCreate MPI_COMM_WORLD dim amp grid HYPRE_StructGridSetExtents grid ilower iupper HYPRE_StructGridAssemble grid HYPRE_StructStencilCreate dim stencil_size amp stencil HYPRE_StructStencilSetElement stencil O offsetO HYPRE_StructMatrixCreate MPI_COMM_WORLD grid stencil amp A HYPRE_StructMatrixInitialize A HYPRE_StructMatrixSetBoxValues A ilower iupper nelts elts Avalues HYPRE_StructMatrixAssemble A HYPRE_StructVectorCreate MPI_COMM_WORLD grid amp b HYPRE StructVectorInitialize b 2 2 WHAT ARE CONCEPTUAL INTERFACES HYPRE_StructVectorSetBoxValues b ilower iupper bvalues HYPRE StructVectorAssemble b HYPRE StructVectorCreate MPI COMM WORLD grid amp x HYPRE StructVectorInitialize x HYPRE StructVectorSetBoxValues x ilower iupper xvalues HYPRE StructVectorAssemble x HYPRE StructPFMGCreate MPI COMM WORLD amp solver HYPRE_StructPFMGSetMaxIter solver 50 optional HYPRE_StructPFMGSetTol solver 1 0e 06 optional HYPRE_StructPFMGSetup solver A b x HYPRE_StructVectorGetBoxValues x ilower iupper xvalues HYPRE StructPFMGDestroy solver HYPRE_StructGridDestroy grid HYPRE_StructStencilDestroy stencil HYPRE_StructMatrixDestroy A HYPRE_StructVectorDestroy b HYPRE_StructVe
31. e switched off by setting max_row_sum to 1 0 The default is 0 9 7 3 BOOMERAMG 33 HYPRE_BoomerAMGSetCoarsenType int HYPRE_BoomerAMGSetCoarsenType HYPRE Solver solver int coarsen_type coarsen_type defines the coarsening used The following options are possible 0 CLJP coarsening 1 Ruge Stueben coarsening without boundary treatment 3 Ruge Stueben coarsening with a 3rd second pass on the boundaries 6 Falgout coarsening default HY PRE_BoomerAMGSetMeasureType int HYPRE BoomerAMGSetMeasureType HYPRE Solver solver int measure type measure_type defines whether local measure_type 0 default or global measures measure_type 1 are used within the coarsening algorithm This feature is ignored for the CLJP and the Falgout coarsening HYPRE_BoomerAMGSetNumGridSweeps int HYPRE_BoomerAMGSetNumGridSweeps HYPRE_Solver solver int num_grid_sweeps num_grid sweeps k defines the number of sweeps over the grid on the fine grid k 0 the down cycle k 1 the up cycle k 2 and the coarse grid k 3 HYPRE_BoomerAMGSetGridRelaxType int HYPRE_BoomerAMGSetGridRelaxType HYPRE_Solver solver int grid_relax_type grid_relax_type k defines the relaxation used on the fine grid k 0 the down cycle k 1 the up cycle k 2 and the coarse grid k 3 The following options are possible for grid_relax_type k weighted Jacobi sequential Gauss Seidel very slow Gauss Seidel Jacobi hybrid method default symmetri
32. eeds to be inserted somewhere between the Create and Initialize calls HYPRE_StructMatrixSetSymmetric A 1 Note that symmetric storage may or may not actually be used depending on the underlying storage scheme Currently in hypre symmetric storage is always used when indicated To most efficiently utilize the Struct interface for symmetric matrices notice that only half of the off diagonal coefficients need to be set To do this for the example being considered we simply need to redefine the 5 pt stencil of Section 3 2 to an appropriate 3 pt stencil then set matrix coefficients as in Section 3 3 for these three stencil elements only For example we could use the following stencil 0 1 S2 0 0 1 0 So S1 3 4 This 3 pt stencil provides enough information to recover the full 5 pt stencil geometry and associated matrix coefficients Chapter 4 Semi Structured Grid System Interface SStruct In order to get access to the most efficient and scalable solvers for semi structured grid applications applications with grids that are mostly structured but with some unstructured features users should use the SStruct interface described in this chapter This interface also provides access to solvers in hypre that were designed for unstructured grid applications and sparse linear systems in general These additional solvers are usually provided via the unstructured grid interface FEI or the linear algebraic interface I
33. fault 1 matrix statistics includes information on interpolation operators and matrices generated on each level 2 cycle statistics includes residuals generated during solve phase 3 matrix and cycle statistics 7 4 ParaSails ParaSails is a parallel implementation of a sparse approximate inverse preconditioner using a priori sparsity patterns and least squares Frobenius norm minimization Symmetric positive definite SPD problems are handled using a factored SPD sparse approximate inverse General nonsymmetric and or indefinite problems are handled with an unfactored sparse approximate inverse It is also possible to precondition nonsymmetric but definite matrices with a factored SPD preconditioner ParaSails uses a priori sparsity patterns that are patterns of powers of sparsified matrices ParaSails also uses a post filtering technique to reduce the cost of applying the preconditioner In advanced usage not described here the pattern of the preconditioner can also be reused to generate preconditioners for different matrices in a sequence of linear solves For more details about the ParaSails algorithm see 3 7 4 1 Synopsis include HYPRE_parcsr_1s h int HYPRE_ParaSailsCreate MPI_Comm comm HYPRE_Solver solver int symmetry 7 4 PARASAILS 35 int HYPRE_ParaSailsSetParams HYPRE_Solver solver double thresh int nlevel double filter int HYPRE_ParaSailsSetup HYPRE_Solver solver HYPRE_ParCSRMatrix A HYPRE_ParVector
34. g the vector ready to use The GetObject routine retrieves the built vector object so that it can be passed on to hypre solvers that use the ParVector internal storage format Vector values can be modified in much the same way as with matrices by first re initializing the vector with the Initialize routine Chapter 7 Solvers and Preconditioners There are several solvers available in hypre via different conceptual interfaces see Table 7 1 The procedure for setup and use of solvers and preconditioners is largely the same We will refer to them both as solvers in the sequel except when noted In normal usage the preconditioner is chosen and constructed before the solver and then handed to the solver as part of the solver s setup In the following we assume the most common usage pattern in which a single linear system is set up and then solved with a single righthand side We comment later on considerations for other usage patterns System Interfaces Solvers Struct SStruct FEI IJ Jacobi SMG PFMG BoomerAMG ParaSails Euclid PILUT PCG GMRES Table 7 1 Current solver availability via hypre conceptual interfaces Setup 1 Pass to the solver the information defining the problem In the typical user cycle the user has passed this information into a matrix through one of the conceptual interfaces prior to setting up the solver In this situation the problem definition information is then passed to the solver
35. he following code will set up matrix values associated with the center Sp and south S3 stencil entries in 3 2 3 3 boundaries are ignored here temporarily HYPRE_StructMatrix A double values 36 int stencil_indices 2 0 3 int i HYPRE_StructMatrixCreate MPI_COMM_WORLD grid stencil amp A HYPRE StructMatrixInitialize A for i 0 i lt 36 i 2 values i values i 1 Hs o 1 0 HYPRE_StructMatrixSetBoxValues A ilower 0 iupper 0 2 stencil_indices values HYPRE_StructMatrixSetBoxValues A ilower 1 iupper 1 2 stencil_indices values set boundary conditions HYPRE_StructMatrixAssemble A The Create routine creates an empty matrix object The Initialize routine indicates that the matrix coefficients or values are ready to be set This routine may or may not involve the allocation of memory for the coefficient data depending on the implementation The optional Set routines mentioned later in this chapter and in the Reference Manual should be called before this step The SetBoxValues routine sets the matrix coefficients for some set of stencil entries over the gridpoints in some box Note that the box need not correspond to any of the boxes used to create the grid but values should be set for all gridpoints that this process owns The Assemble routine is a collective call and finalizes the matrix assembly making the matrix ready to use Matrix
36. he second row of Figure 2 1 is a set of linear solver algorithms Each linear solver group requires different information from the user through the conceptual interfaces So the geometric multigrid algorithm GMG listed in the left most box for example can only be used with the left most conceptual interface On the other hand the ILU algorithm in the right most box may be used with any conceptual interface The third row of Figure 2 1 is a list of data layouts or matrix vector storage schemes The relationship between linear solver and storage scheme is similar to that of interface and linear solver 2 3 WHICH CONCEPTUAL INTERFACE SHOULD I USE 7 2 3 Which conceptual interface should I use hypre currently supports four conceptual interfaces e Structured Grid System Interface Struct This interface is appropriate for applica tions whose grids consist of unions of logically rectangular grids with a fixed stencil pattern of nonzeros at each grid point This interface supports only a single unknown per grid point See Chapter 3 for details e Semi Structured Grid System Interface SStruct This interface is appropriate for applications whose grids are mostly structured but with some unstructured features Exam ples include block structured grids composite grids in structured adaptive mesh refinement AMR applications and overset grids This interface supports multiple unknowns per cell See Chapter 4 for details NOTE This is a some
37. ies that are designed specifically for them To run in parallel hypre requires an installation of MPI Configuration of hypre with threads requires an implementation of OpenMP Cur rently only a subset of hypre is threaded hypre currently does not support complex valued systems Chapter 2 Getting Started Before writing any code 1 Choose a conceptual interface see Sections 2 2 and 2 3 Generally the choice is fairly obvious A structured grid interface is clearly inappropriate for an unstructured grid application It is desirable to use a more specific interface if appropriate e g the linear algebraic interface is usable from any type of grid but will involve much more user work and prevent access to some grid type specific preconditioners 2 Choose your desired solver strategy For the typical user this will mean a single Krylov method and a single preconditioner 3 Look up matrix requirements for each solver and preconditioner Each specific solver and preconditioner has requirements from the input matrix This information is provided in several places Chapter 7 the hypre Reference Manual and the hypre header files 4 Choose a matrix class that is compatible with your solvers and preconditioners and your conceptual interface Note that some of the interfaces currently only support one matrix class choice Once the previous decisions have been made it is time to code your application to call hypre 1 Build an
38. led a value will be selected based on the sparsity of the matrix a floating point number specifying the threshold to drop small entries in L and U The default value is 0 0 a non negative integer specifying the desired sparsity of the incom plete factors The default value is 0 a floating point number specifying the threshold used to sparsify the incomplete factors The default value is 0 0 natural default or mmd minimum degree ordering This ordering is used to minimize the number of nonzeros generated in the LU decomposition The default is natural ordering y yes perform row and column scalings before decomposition or n no default falgout ruge or default CLJP coarsening for BoomerAMG an integer specifying the number of pre and post smoothing at each level of BoomerAMG The default is one pre and one post smoothings jacobi Damped Jacobi gs slow sequential Gauss Seidel gs fast Gauss Seidel on interior nodes hybrid or direct The default is hybrid a floating point number between 0 and 1 specifying the damping fac tor for BoomerAMG s damped Jacobi smoother The default value is 0 5 a floating point number between 0 and 1 specifying the threshold used to determine strong coupling in BoomerAMG s coasening The default value is 0 25 a floating point number between 0 and 1 specifying the threshold used to prune small entries in setting up the sparse approximate inverse The default value is 0 0 an
39. lso be coupled to variables at neighboring indices 4 1 Setting Up the SStruct Grid On process 0 the following code will set up the grid shown in the figure the code for processes 1 and 2 is similar HYPRE_SStructGrid grid int int HYPRE_SStructVariable vars 1 int HYPRE_SStructVariable addvars 1 ilower 2 2 iupper 2 2 1 O 1 O 0 3 12 35 HYPRE_SSTRUCT_VARIABLE_CELL addindex 2 1 2 HYPRE_SSTRUCT_VARIABLE_CELL HYPRE_SStructGridCreate MPI_COMM_WORLD 2 3 amp grid HYPRE_SStructGridSetExtents grid O ilower 0 iupper 0 42 SETTING UP THE SSTRUCT STENCIL 17 HYPRE_SStructGridSetExtents grid O ilower 1 iupper 1 HYPRE_SStructGridSetVariables grid 0 1 vars HYPRE_SStructGridAddVariables grid 0 addindex 1 addvars HYPRE_SStructGridAssemble grid The Create routine creates an empty 2D grid object that lives on the MPI_COMM_WORLD com municator The SetExtents routine adds a new box to the grid The SetVariables routine sets the variables on a grid part and the AddVariables routine adds variables to a grid part index in the above example there are two cell centered variables at index 1 2 The Assemble routine is a collective call i e must be called on all processes from a common synchronization point and finalizes the grid assembly making the grid ready to use 4 2 Setting Up the SStruct Stencil The geometry of the discreti
40. ment interface FEI defines a linear solver interface for finite element applications For information on how to use this interface see 4 This chapter describes the iterative methods and preconditioners in the hypre implementation of this interface Solving a linear system from a finite element problem consists of four steps in the FEI 1 Initialize the structure of the finite element data including loading the element connectivity data 2 Load the element or super element stiffness matrices and forcing terms 3 Set solver parameters and solve the linear system 4 Retrieve the solution to the linear system Parameters to the hypre solvers are specified by calling void FEI_parameters int sysHandle int numParams char paramStrings where sysHandle is an identifier for the linear system being solved numParams is the number of parameter strings and paramStrings is an array of null terminated strings with the format parameter name value For example setting the preconditioner can be accomplished by char paramStrings 1 paramStrings 0 char malloc 64 sizeof char strcpy paramStrings 0 preconditioner parasails FEI_parameters sysHandle 1 paramStrings All possible parameters are listed in Table 5 1 A linear system is then solved by calling void FEI_iterateToSolve int sysHandle 21 22 CHAPTER 5 FINITE ELEMENT INTERFACE FEI 5 1 Iterative methods and preconditioners available 5 1 1 Iterative
41. methods 1 Krylov solvers conjugate gradient GMRES TFQMR BiCGSTAB 2 BoomerAMG a parallel algebraic multigrid solver 3 SuperLU direct solver sequential 4 SuperLU direct solver with iterative refinement sequential 5 1 2 Preconditioners 1 diagonal 2 parallel incomplete LU with threshold PILUT 3 another parallel incomplete LU Euclid 4 parallel algebraic multigrid BoomerAMG 5 parallel sparse approximate inverse ParaSails 5 1 ITERATIVE METHODS AND PRECONDITIONERS AVAILABLE Parameter Name Parameter Values solver preconditioner gmresDim maxlIterations tolerance pilutFillin pilutDropTol euclidNlevels euclidThreshold superluOrdering superluScale amgCoarsenType amgNumSweeps amgRelaxType amgRelax Weight amgStrongThreshold parasailsThreshold parasailsNlevels parasailsF ilter parasailsLoadbal parasailsSymmetric parasailsUnSymmetric parasailsReuse cg gmres default bicgstab tfqmr boomeramg superlux diagonal default pilut parasails boomeramg euclid an integer specifying the value of m in restarted GMRES m default value is 50 an integer specifying the maximum number of iterations permitted for CG or GMRES The default value is 1000 a floating point number specifying the termination criterion for CG or GMRES The default value is 1 0E 10 an integer specifying the maximum number of nonzeros kept in the formation of incomplete L and U If this is not cal
42. nditioning is in general more effective than Block Jacobi preconditioning for minimizing total solution time For scaled problems the relative advantage appears to increase as the number of processors is scaled upwards Euclid may also be used to good advantage as a smoother within multigrid methods 7 5 EUCLID 37 7 5 1 Synopsis Euclid is best thought of as an extensible ILU preconditioning framework Extensible means that Euclid can and eventually will time and contributing agencies permitting support many variants of ILU k and ILUT preconditioning The current release includes Block Jacobi ILU k and Parallel ILU k methods Due to this extensibility and also because Euclid was developed independently of the hypre project the methods by which one passes runtime parameters to Euclid preconditioners differ in some respects from the hypre norm While users can directly set options within their code options can also be passed to Euclid preconditioners via command line switches and or small text based configuration files The latter strategies have the advantage that users will not need to alter their codes as Euclid s capabilities are extended The following fragment illustrates the minimum coding required to invoke Euclid precondition ing within hypre application contexts The next subsection provides examples of the various ways in which Euclid s options can be set The final subsection lists the options and provides guidance as
43. ome gridpoint on the grid For exam ple the geometry of the 5 pt stencil for the example problem being considered can be represented in the following way 0 1 Sa 1 0 0 1 0 S1 So So E 3 2 0 1 S3 In 3 2 the 0 0 entry represents the center coefficient and is the Oth entry in the array Sp The 0 1 entry represents the south coefficient and is the 3rd entry in the array S3 And so on On process 0 or 1 the following code will set up the stencil in 3 2 The stencil must be the same on all processes HYPRE StructStencil stencil int offsets 5 2 0 0 1 0 1 0 0 1 10 177 int 8 HYPRE_StructStencilCreate 2 5 amp stencil for s 0 s lt 5 s HYPRE StructStencilSetElement stencil s offsets s The Create routine creates an empty 2D 5 pt stencil object The SetElement routine defines the geometry of the stencil and assigns the array numbers for each of the stencil entries None of the calls are collective calls 3 3 Setting Up the Struct Matrix The matrix is set up in terms of the grid and stencil objects described in Sections 3 1 and 3 2 The coefficients associated with each stencil entry will typically vary from gridpoint to gridpoint but in the example problem being considered they are as follows over the entire grid except at boundaries see below ee ai 3 3 12 CHAPTER 3 STRUCTURED GRID SYSTEM INTERFACE STRUCT On process 0 t
44. ptions that can give you control over the choices it will make You can type configure help to see the list of all of the command line options to configure This is the best resource for information on configure options Below is some additional helpful information Be aware that not all command line options have been tested on all machines and architectures even supported machines and architectures with options Each with option that is listed in configure help also was a without option usually the default but not always Additionally all with options have a 43 44 CHAPTER 8 ADDITIONAL INFORMATION with option pathname This is not a supported feature for all with options however and may have no effect on configuration Compilers If you want to choose a compiler then is it recommended that you choose all C C Fortran compilers Compiler Flags To choose optimization or debug use enable opt default or enable debug For other compiler flags use the with CFLAGS option BLAS By default configure attempts to find the systems native optimized BLAS library The path for this library must be in the user s PATH To specify another BLAS library available use with BLAS pathname or with BLAS link list To configure and compile without BLAS use the without BLAS option Configure automatically generates a file HYPRE_config h that includes all the header files found to be necessary by configure This file ma
45. s as follows integer 8 matrix integer nrows ncols MAX NCOLS 8 3 BUG REPORTING 45 Fortran argument int i integer i double d double precision d int array integer array x double array double precision array char string character string HYPRE_Type object integer 8 object HYPRE_Type object integer 8 object Table 8 1 Conversion from C parameters to Fortran arguments integer rows MAX ROWS cols MAX COLS double precision values MAX COLS integer ierr call HYPRE_IJMatrixSetValues matrix nrows ncols rows cols amp values ierr The Fortran subroutine name is the same unless the name is longer than 31 characters In these situations the name is condensed to 31 characters usually by simple truncation For now users should look at the Fortran drivers in the test directory for the correct condensed names In the future this aspect of the interface conversion will be made consistent and straightforward The Fortran subroutine argument list is always the same as the corresponding C routine except that the error return code ierr is always last Conversion from C parameter types to Fortran argument type is summarized in Table 8 1 Arrays arguments in hypre are always of type int or double and the corresponding Fortran types are simply integer or double precision arrays Note that the Fortran arrays may be indexed in any manner For example an integer array of length N may be declared in fortran as ei
46. structured semi structured interfaces most appropriate for finite difference applications a finite element based unstructured interface and a linear algebra based interface Each interface provides access to several solvers without the need to write new interface code e User options accommodate beginners through experts hypre allows a spectrum of expertise to be applied by users The beginning user can get up and running with a minimal 1 2 CHAPTER 1 INTRODUCTION amount of effort More expert users can take further control of the solution process through various parameters Configuration options to suit your computing system hypre utilizes the GNU Auto conf package to allow simple and flexible installation on a wide variety of computing systems Users can tailor the installation to match their computing system Options include debug and optimized modes the ability to change required libraries such as MPI and BLAS a sequential mode and modes enabling threads for certain solvers On most systems however hypre can be built by simply typing configure followed by make Interfaces in multiple languages provide greater flexibility for applications hypre contains interfaces for both Fortran and C users Assumptions and Limitations hypre is designed for large sparse linear systems on parallel computers Small linear systems systems that are solvable on a sequential computer and dense systems are all better addressed by other librar
47. tObjectType routine sets the underlying matrix object type to HYPRE_PARCSR this is the only object type currently supported The Initialize routine indicates that the matrix coefficients or values are ready to be set This routine may or may not involve the allocation of memory for the coefficient data depending on the implementation The optional SetRowSizes O and SetDiagOffdSizes routines mentioned later in this chapter and in the Reference Manual should be called before this step The SetValues routine sets matrix values for some number of rows nrows and some number of columns in each row ncols The actual row and column numbers of the matrix values to be set are given by rows and cols After the coefficients are set they can be added to with an AddTo routine Each process should set only those matrix values that it owns in the data distribution The Assemble routine is a collective call and finalizes the matrix assembly making the matrix ready to use The GetObject routine retrieves the built matrix object so that it can be passed on to hypre solvers that use the ParCSR internal storage format Note that this is not an expensive routine the matrix already exists in ParCSR storage format and the routine simply returns a handle or pointer to it Although we currently only support one underlying data storage format in the future several different formats may be supported 6 2 IJ VECTOR INTERFACE 27 One
48. ter int HYPRE_BoomerAMGSetMaxIter HYPRE_Solver solver int max_iter max_iter defines the maximal number of iterations allowed The default is 20 HYPRE_BoomerAMGSetTol int HYPRE_BoomerAMGSetTol HYPRE_Solver solver double tol tol defines the tolerance needed for the stopping criterion lb Az y bll gt lt tol The default for tol is 1077 HYPRE_BoomerAMGSetStrongThreshold int HYPRE_BoomerAMGSetStrongThreshold HYPRE_Solver solver double strong_threshold A point i is strongly connected to j if a gt maxj 4 a where the strong threshold O is a value between 0 and 1 Weak connections are usually ignored when determining the next lower level Consequently choosing a larger strong threshold leads in general to smaller coarse grids but worse convergence rates The default value for 0 is 0 25 which appears to be a good choice for 2 dimensional problems A better choice for 3 dimensional problems appears to be 0 5 However the choice of the strength threshold is problem dependent and therefore there could be better choices than the two suggested ones HYPRE_BoomerAMGSetMaxRowSum int HYPRE_BoomerAMGSetMaxRowSum HYPRE_Solver solver double max_row_sum This feature leads to a more efficient treatment of very diagonally dominant portions of the matrix If the absolute row sum of row i weighted by the diagonal is greater than max_row_sum all depen dencies of variable i are set to be weak This feature can b
49. ther of the following integer array N integer array 0 N 1 hypre objects can usually be declared as in the table because integer 8 usually corresponds to the length of a pointer However there may be some machines where this is not the case although we are not aware of any at this time On such machines the Fortran type for a hypre object should be an integer of the appropriate length 8 3 Bug Reporting hypre has an automated bug reporting mechanism in place that may be used as a resource for submitting bugs desired features and documentation problems as well as querying the status of previous reports Access http www casc llnl gov bugs for full bug tracking details or to submit or query a bug report When using the CASC bug reporting site for the first time click on Open a new Bugzilla account under the User login account management heading 46 CHAPTER 8 ADDITIONAL INFORMATION Bibliography 1 S F Ashby and R D Falgout A parallel multigrid preconditioned conjugate gradient al gorithm for groundwater flow simulations Nuclear Science and Engineering 124 1 145 159 September 1996 Also available as LLNL Technical Report UCRL JC 122359 P N Brown R D Falgout and J E Jones Semicoarsening multigrid on distributed memory machines SIAM J Sci Comput 21 5 1823 1834 2000 Special issue on the Fifth Copper Mountain Conference on Iterative Methods Also available as LLNL technical report UCRL JC
50. titute or imply its endorsement recommendation or favoring by the United States Government or the University of California The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or the University of California and shall not be used for advertising or product endorsement purposes UCRL MA 137155 DR Contents 1 Introduction A AA eed E is 1 2 Assumptions and Limitations 2 0 00 0 ee ee 2 Getting Started 2 A Simple Example ssa die e p we ae AAA ee SUA A e 3 Structured Grid System Interface Struct 3 1 Setting Up the Struct Grid sossa ee ee es 3 2 Setting Up the Struct Stencil e a s ssid es 3 3 Setting Up the Struct Matrix e 3 4 Setting Up the Struct Right Hand Side Vector o o e 3 5 Symmetric Matrices ae i Ae aa ee 4 Semi Structured Grid System Interface SStruct 4 1 Setting Up the SStruct Grid 2 ee 4 2 Setting Up the SStruct Stencil 2 ee ee 4 3 Setting Up the SStruct Graph 2 ee ee 4 4 Setting Up the SStruct Matrix e 4 5 Setting Up the SStruct Right Hand Side Vector o o e 5 Finite Element Interface FEI 5 1 Iterative methods and preconditioners available 5 1 1 Iterative methods da V PreconditionerS le Bye ov Meh ee Bee pa CE eee Nk e See ee eee 6 Linear Algebraic System Interface IJ 6 1 I Matrix nterface vii a RS AA a
51. what new interface and should be used with caution as it matures e Finite Element Interface FEI This is appropriate for users who form their linear sys tems from a finite element discretization The interface mirrors typical finite element data structures including element stiffness matrices Though this interface is provided in hypre its definition was determined elsewhere www z ca sandia gov fei See Chapter 5 for details e Linear Algebraic System Interface IJ This is the traditional linear algebraic inter face It can be used as a last resort by users for whom the other grid based interfaces are not appropriate It requires more work on the user s part though still less than building par allel sparse data structures General solvers and preconditioners are available through this interface but not specialized solvers which need more information Our experience is that users with legacy codes in which they already have code for building matrices in particular formats find the IJ interface relatively easy to use See Chapter 6 for details Generally a user should choose the most specific interface that matches their application be cause this will allow them to use specialized and more efficient solvers and preconditioners without losing access to more general solvers CHAPTER 2 GETTING STARTED Chapter 3 Structured Grid System Interface Struct In order to get access to the most efficient and scalable solvers for scal
52. with the primary goal of providing users with advanced parallel preconditioners Issues of robustness ease of use flexibility and interoperability also play an important role 1 1 Features e Scalable preconditioners provide efficient solution on today s and tomorrow s sys tems hypre contains several families of preconditioner algorithms focused on the scalable solution of very large sparse linear systems hypre includes grey box algorithms that use more than just the matrix to solve certain classes of problems more efficiently than general purpose libraries This includes algorithms such as structured multigrid e Suite of common iterative methods provides options for a spectrum of problems hypre provides several of the most commonly used Krylov based iterative methods to be used in conjunction with its scalable preconditioners This includes methods for nonsymmetric systems such as GMRES and methods for symmetric matrices such as Conjugate Gradient e Intuitive grid centric interfaces obviate need for complicated data structures and provide access to advanced solvers hypre has made a major step forward in usability from earlier generations of sparse linear solver libraries in that users do not have to learn complicated sparse matrix data structures Instead hypre does the work of building these data structures for the user through a variety of interfaces each appropriate to different classes of users These include stencil based
53. y be used to see how a compiled version of the library was configured and may also be included by the user in his her own code 8 1 2 Linking to the Library A program linking with hypre must be compiled with I HYPRE_DIR include and linked with L HYPRE_DIR 1ib lhypre library name lhypre library name where HYPRE DIR is the di rectory where hypre is installed Additionally any other libraries to which hypre is linked must also be linked to by the users application For example the BLAS library or PETSc library are often but not always linked in by hypre and would also need to be linked in by the users application It may be useful to reference the Makefile in the test subdirectory This makefile is designed to build test applications that link with and call hypre All include and linking flags that are used by hypre and needed by these test applications get exported to this file by the configure script 8 2 Calling from Fortran Although hypre is written in C a Fortran interface is provided The Fortran interface is very similar to the C interface and can be determined from the C interface by a few simple conversion rules These conversion rules are described below Let us start out with a simple example Consider the following hypre prototype int HYPRE_IJMatrixSetValues HYPRE_IJMatrix matrix int nrows int ncols const int rows const int cols const double values The corresponding Fortran code for calling this routine i
54. y necessary auxiliary structures for your chosen conceptual interface This includes e g the grid and stencil structures for the structured grid interface 2 Build the matrix solution vector and right hand side vector through your chosen conceptual interface Each conceptual interface provides a series of calls for entering information about your problem into hypre 3 Build solvers and preconditioners and set solver parameters optional Some parameters like convergence tolerance are the same across solvers while others are solver specific 4 CHAPTER 2 GETTING STARTED 4 Call the solve function for the solver 5 Retrieve desired information from solver Depending on your application there may be different things you may want to do with the solution vector Also performance information such as number of iterations is typically available though it may differ from solver to solver 2 1 A Simple Example The following code serves as a simple example of the usage of hypre In this example the structured grid interface discussed in Chapter 3 is used to enter the problem into hypre and the PFMG Multigrid solver is used to solve the system Since the structured grid interface currently only supports one underlying matrix class there are no choices to make here If we were using the semi structured grid interface instead then we would have to choose between the SStruct and ParCSR matrix classes depending on the solver we want to us
55. y this in conjunction with rowScale 40 CHAPTER 7 SOLVERS AND PRECONDITIONERS CAUTION If the coefficient matrix A is symmetric this setting is likely to cause the filled matrix F L U T to be unsymmetric This setting has no effect when ILUT factorization is selected rowScale Scale values prior to factorization such that the largest value in any row is 1 or 1 Default 0 false CAUTION If the coefficient matrix A is symmetric this setting is likely to cause the filled matrix F L U I to be unsymmetric Guidance if the matrix is poorly scaled turning on row scaling may help convergence ilut float Use ILUT factorization instead of the default ILU k Here float is the drop tolerance which is relative to the largest absolute value of any entry in the row being factored CAUTION If the coefficient matrix A is symmetric this setting is likely to cause the filled matrix F L U I to be unsymmetric maxNzPerRow int This sets the maximum number of nonzeros that is permitted in any row of F in addition to the number that would result from an ILU 0 factorization A negative value indicates infinity no limit This setting is effective for both ILU k and ILUT factorization methods Default infinity for ILU k 5 for ILUT 7 6 PILUT Parallel Incomplete Factorization PILUT is a parallel preconditioner based on Saad s dual threshold incomplete factorization algo rithm The original version of PILUT was
56. zation stencil is described by an array of integer tuples in 2D triples in 3D each representing a relative offset in index space from some gridpoint on the grid For exam ple the geometry of the 9 pt stencil for the example problem being considered can be represented in the following way 1 1 0 1 1 1 S7 S4 Sg 1 0 0 0 1 0 S1 So So E 4 2 1 1 0 1 d 1 S5 S3 S6 In 3 2 the 0 0 entry represents the center coefficient and is the Oth entry in the array Sp The 0 1 entry represents the south coefficient and is the 3rd entry in the array S3 And so on On process 0 1 or 2 the following code will set up the stencil in 3 2 HYPRE_SStructStencil stencil int S int offsets 9 2 0 0 1 0 1 0 0 1 0 1 1 1 1 1 1 1 1 177 HYPRE_SStructStencilCreate 2 9 amp stencil for s 0 s lt 9 st HYPRE_SStructStencilSetEntry stencil s offsets s 0 18 CHAPTER 4 SEMI STRUCTURED GRID SYSTEM INTERFACE SSTRUCT The Create routine creates an empty 2D 9 pt stencil object The SetEntry routine defines the geometry of the stencil and assigns the array numbers for each of the stencil entries None of the calls are collective calls 4 3 Setting Up the SStruct Graph The graph will represent the nonzero structure of the matrix in Section 4 4 below It is defined in terms of the grid and stencil objects described in Sections 4 1

Download Pdf Manuals

image

Related Search

Related Contents

DS3701CS(C2~C3)  Manual de Instalación Redugax - Comercializadora de Productos  Rockbox user manual  Digital Super Hybrid System Programming Guide    Dell E178WFP User's Manual  adaptat on nstruct ons SM6-36  Polaroid PMP500-4 MP3 Player User Manual  - Succession Wizard  

Copyright © All rights reserved.
Failed to retrieve file