Fortran 90 MP Library User's Guide

Contents

1. call e1pop('Mp_Setup')
   ! Deallocate storage arrays and exit from BLACS.
   IF (ALLOCATED(A)) DEALLOCATE(A)
   IF (ALLOCATED(B)) DEALLOCATE(B)
   IF (ALLOCATED(C)) DEALLOCATE(C)
   IF (ALLOCATED(X)) DEALLOCATE(X)
   IF (ALLOCATED(d_A)) DEALLOCATE(d_A)
   IF (ALLOCATED(d_B)) DEALLOCATE(d_B)
   IF (ALLOCATED(d_C)) DEALLOCATE(d_C)
   ! Check the results.
   IF (ERROR <= SQRT(EPSILON(ALPHA)) .and. &
       MP_RANK == 0) THEN
      write (*,*) 'Example 2 for BLACS and PBLAS is correct.'
   END IF
   CALL BLACS_GRIDEXIT(CONTXT)
   CALL BLACS_EXIT(0)
   END

   IMSL Fortran 90 MP Library 4.0, Chapter 7: ScaLAPACK Utilities and Large-Scale Parallel Solvers

   Example 3: Distributed Linear Solver with ScaLAPACK
   The program SCPK_EX3 illustrates solving a system of linear algebraic equations, Ax = b. The right-hand side is produced by defining A and y to have random values. Then the matrix-vector product b = Ay is computed. The problem size is such that the residuals, x - y ≈ 0, are checked on one process. Three temporary files are created and deleted. The BLACS are used to define the process grid and to provide further information identifying each process. Then ScaLAPACK is used to compute the approximate solution, x.

   program scpk_ex3
   ! This is Example 3 for Sca
2. TYPE D_OPTIONS IOPT 5 Start loop to integrate and record solution values IDO 1 DO SELECT CASE IDO Define values that determine limits CASE 1 0 ZERO OUT DELTA_T U NPDE 1 1 HALF U NPDE 1 N HALF OPEN FILE PDE_ex07 out UNIT 7 NFRAMES TEND DELTA_T DELTA_T WRITE 7 3I5 4D14 5 NPDE N NFRAMES amp U NPDE 1 1 U NPDE 1 N TO TEND IOPT 1 D_OPTIONS PDE_1D_MG_TIME_SMOOTHING 1D 3 IOPT 2 D_OPTIONS PDE_1D_MG_RELATIVE_TOLERANCE ZERO IOPT 3 D_OPTIONS PDE_1D_MG_ABSOLUTE_TOLERANCE 1D 3 IOPT 4 PDE_1D_MG_MAX_BDF_ORDER IOPT 5 3 Update to the next output point Write solution and check for final point CASE 2 TO TOUT IF TO lt TEND THEN WRITE 7 F10 5 TOUT DO I 1 NPDE 1 292 Chapter 8 Partial Differential Equations IMSL Fortran 90 MP Library 4 0 WRITE 7 4E15 5 U I END DO TOUT MIN TOUT DELTA_T TEND IF TO TEND IDO 3 END IF All completed Solver is shut down CASE 3 CLOSE UNIT 7 EXIT Define initial data values CASE 5 TEMP U 3 U 1 PULSE TEMP U 2 U 1 WHERE TEMP lt 3D 1 or EMP gt 1D 1 U 1 WHERE TEMP lt 1D 1 or EMP gt 3D 1 U 2 WRITE 7 F10 5 TO DO I 1 NPDE 1 WRITE 7 4E15 5 U I END DO
3. forces an equality constraint BND 1 1 F HUGE ONE BND 1 F 1 ZERO BND 2 HUGE ONE CALL Parallel_bounded_LSQ amp A B BND X RNORM W INDEX IPART amp NSETP NSETZ Each processor multiplies its block times the part of the dual corresponding to that partition Y ZERO DO J IPART 1 MP_RANK 1 IPART 2 MP_RANK 1 JSHIFT J IPART 1 MP_RANK 1 1 Y Y ASAVE JSHIFT X J END DO Accumulate the pieces from all the processors Puti SUM AincO Bic on sank ON processor B Y IF MP_NPROCS gt 1 amp CALL MPI_REDUCE Y B M MPI_DOUBLE_PRECISION amp MPI_SUM 0 MP LIBRARY WORLD IERROR IF MP_RANK 0 THEN Compute constraint solution at the root The constraints will have no solution if B M ONE All of thes xample problems have solutions B M B M ONE B B B M END IF Send t IF MP_NPROCS gt 1 amp CALL MPI_BCAST B M MP_LIBRARY WORLD For large problems this printing may n and PRINT IF MP_RANK r T MPI_DOUBL ERROR PR EGS ON Fal CeO he inequality constraint or primal solution to all nodes amp removed amp 256 e Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers IMSL Fortran 90 MP Library 4 0 call show B 1 NP amp Minimal length solution of the constraints Compute residuals of the individual constraints X ZERO DO J IPART 1 MP_RANK 1 IPART 2 MP
4. terations was Y 1 i il SS MES IOPT 3 2 N E1POP MP_SETU CALL E1PSH END IF SET _MAX_IT not enough on a previous iteration N Eve Pa ERATIONS SETUP ET_TOLERANC CALL parallel_bounded_LSQ amp A B BND NSI IMSL Fortran 90 MP Library 4 0 Y RNORM W ETZ IOPT IOPT Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers e 259 IND EX IPART EL amp NS ETP Leset EO The array Y contains the constrained Newton step Update the variables EEK IF mp_rank 0 and PRINT THEN CALL show BND Bounds for the moves CALL SHOW X Developing Solution CALL SHOW RNORM amp Linear problem residual norm END IF This is a safety measure for not taking too many steps ITER ITER 1 IF ITER gt MAXIT EXIT NEWTON_METHOD END DO EWTON_METHOD IF MP_RANK 0 THEN IF ITER lt MAXIT WRITE amp Example 2 for PARALLEL BOUNDED_LSQ is correct END IF See to any errors and shut down MPI MP_NPROCS MP_SETUP Final END 260 Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers IMSL Fortran 90 MP Library 4 0 Chapter 8 Partial Differential Equations Contents Subroutine PDE UD IMG is icicsssessceis otsiesseardidiseeivigsseeregesdeeivieiseeeas
5. 1 1 ZERO U NPDE 1 N A ILE PDE_ex03 out UNIT 7 S NINT TEND DELTA_T DELTA_T 3I5 4D14 5 NPDE 1 U NPDE 1 N TO TEND OPTIONS PDE_1D_MG_RELATIVE_TOLERANCE OPTIONS PDE_1D_MG_ABSOLUTE_TOLERANC OPTIONS PDE_1D_MG_TIME_SMOOTHING 1D output point check for final point A A A a IOP IOPT IOPT Update to the nex Write solution an CASE 2 TO TOUT IF TO lt TEND THEN WRITE 7 F10 5 TOUT DO I 1 NPDE 1 WRITE 7 4E15 5 U I END DO TOUT MIN TOUT DELTA_T TEND IF TEND IDO 3 END IF All completed Solver is shut down CASE 3 CLOSE UNIT 7 EXI Define initial data values CASE 5 U 1 EXP U 2 WRITE 7 F10 5 T DO I 1 NPDE 1 WRITE 7 4E15 5 U I END DO fl Fl Fl Or Vs SN FN 5 N TWO EXP A 0 284 Chapter 8 Partial Differential Equations IMSL Fortran 90 MP Library 4 0 Define differential equations CASE 6 D_PDE_1D_MG_C 1 1 ONE D_PDE_1D_MG_R 1 D_PDE_1D_MG_U 1 Evaluate the approximate integral for this t V_1 HALF SUM U 1 1 N 1 U 1 2 N amp U 2 2 N U 2 1 N 1 D_PDE_1D_MG_Q 1 V_1 D_PDE_1D_MG_U 1 Define boundary conditions CASE 7 IF PDE_1D_MG_LEFT THEN Evaluate the
6. Make calls to the VNI error processor while using MPI The error types are WARNING and FATAL An example is a call to a routine that expects a positive value for the INTEGER argument MP_NPROCS MP_SETUP CALL B_Name 0 Finalize MPI and print any error messages The program STOPs by default P_NPROCS MP_SETUP Final ND PROGRAM ig ROUTINE B_Name I E_ INT routine generates an error messag nt types of errors occur at different nodes TYPE 3 B S P d INTEGER I TYPE D wy 10 EUSI name onto the stack E1LPSH B_Name value into the message DLST LN the message for printing EIMES TYPE 2 amp agument should be positive now has value il EDOR CA Prepa CA Lar ye E G G E O lee Mey ee OW ee Se Pop the name off the stack CALL E1POP B_Name Had an invalid argument so RETURN RETUR END IF END SUBROUTINI EJ 312 Chapter 9 Error Handling and Messages The Parallel Option IMSL Fortran 90 MP Library 4 0 Output for Example 2 xxx WARNING 2 on rank 1 torski rd imsl com from B_Name The agument should be positive It now has value 0 FORWARD Calls MP_SETUP B_Name Error Types and Codes 0 0 3 2 xxx FATAL ERROR 2 on rank 0 texas rd imsl com from B_ Name The agument sh
7. case 6 Evaluate partials of g y y Get value of c_j for partials iopt 1 inr 9 call sumag math ichap iget 1 iopt sval 44 e Chapter 1 Linear Solvers IMSL Fortran 90 MP Library 4 0 Subtract c_j from diagonals to compute partials for y c_j The linear system is tridiagonal t_diag l n 1 r_diag sval 1 a_diag t_upper l n 1 r_off sval 1 a_off t_lower EOSHIFT t_upper SHIFT 1 DIM 1 cycle Integration_Loop case 7 Compute the factorization iopti 1l s_options s_lin_sol_tri_factor_only zero call lin_sol_tri t_upper t_diag t_lower amp t_sol iopt iopti cycle Integration_Loop case 8 Solve the system iopti 1 s_options s_lin_sol_tri_solve_only zero Move data from the assumed size to assumed shape arrays t_sol 1 n 1 wk ival 1 ival 1 n 1 call lin_sol_tri t_upper t_diag t_lower amp t_sol iopt iopti Move data from the assumed shape to assumed size arrays wk ival 1 ival 1 n 1 t_sol 1 n 1 cycle Integration_Loop case 2 Correct initial value to reach u_1 at t tend u_0 u0 u_O y n 2 u_1 u_0 y n 2 1 Finish up internally in the integrator ido 3 cycle Integration_Loop end select end do Integration_Loop write The equation u_t u_xx with u 0 t 4g ED write reaches the value u_l1 at time tend write Example 4 for LIN_SOL_TRI is correct
8. Traceback Option
   The traceback option is set to ON or OFF by E1TRB:
      CALL E1TRB(i, tset)
   where the traceback option applies only to type i errors if 1 <= i <= 7. If i = 0, the selection applies to all error types. For tset = 0, the traceback is OFF. For tset = 1, the traceback is ON. The traceback is ON for all error types. This routine is provided for compatibility with the previous version of the error processor.

   Guidelines for Writing Error Messages
   • Error messages should be written in correct and complete sentences.
   • Capitalize the first letter of the message.
   • Type two spaces after the period at the end of the sentence.
   • Use present tense whenever possible.
   • Variable-length items included by Ai should be placed at the end of the message, without a period. Entire messages are limited to 1,024 characters, and long variable items in the middle could cause critical parts of the message to be truncated. A period at the end could cause confusion if it is interpreted as part of the data items.
   • Messages should describe both the observed error condition and the expected condition. For example: "A procedure name is expected, but the following entity has been encountered: A1"
   • Whenever possible, and especially when it is not obvious, the message should provide informa
9. lo AIN 240 Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers IMSL Fortran 90 MP Library 4 0 Read the factors into the local arrays CALL ScaLAPACK_READ Atest dat DESC_A d_A CALL ScaLAPACK_READ Btest dat DESC_B d_B Compute the distributed product C A x B ALPHA 1d0 BETA 0d0 TA 1 JA 1 IB 1 JB 1 IC 1 JC 1 d_c 0 CALL pdGEMM amp Get Tie My Ny Ky UNSER CLIN GUN A DESC_A d_B IB JB DESC_B BETA amp Clie TCT we IDSC l Put the product back on the root node caiie CalbAPAC AW Rann EN GEG csi Cla tay ADEC Can me Cle IF MP_RANK 0 THEN Read the residuals and check them for size OPEN UNIT NIN FILE Ctest dat STATUS OLD Read the data by columns DO J 1 N NB READ NIN C 1I L I 1 M L J min N J NB 1 END DO CLOSE NIN STATUS DELETE SIZE_C SUM ABS C C C matmul A B ERROR SUM ABS C SIZE_C Open other temporary files and delete them OPEN UNIT NIN FILE Atest dat STATUS OLD CLOSE NIN STATUS DELETE OPEN UNIT NIN FILE Btest dat STATUS OLD CLOSE NIN STATUS DELETE END IF The processors in use now exit the loop EXET BLOCK ND DO BLOCK a See to any error messages
10. k Generate random rectangular matrices for A and right hand sides b Generate random weights for each of the right hand sides A rand A b rand b w rand w Compute the singular value decomposition S SVD A U U V V g U stx b s_sq s 2 log_lamda log 10 s 1 log_lamda_t log_lamda delta_log_lamda log_lamda log 0 1 s n p 1 Choose lamda to minimize the cross validation weighted square error First evaluate th rror at a grid of points uniform in log_scale cross_validation_error do i l p t s_sq s_sqtexp log_lamda c_lamda i sum w b U lim 1 n x g 1l in 1 k amp spread t DIM 2 NCOPIES k amp one u 1l m 1l in 2 x spread t DIM 2 NCOPIES k 2 DIM 1 log_lamda log_lamda delta_log_lamda end do cross_validation_error IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 195 Compute the grid value and lamda corresponding to the minimum do i 1 k lamda i exp log_lamda_t delta_log_lamda amp sum minloc c_lamda l p i 1 end do Compute the solution using the optimum cross validation parameters x V x g 1 n 1 k spread s DIM 2 NCOPIES k amp spread s_sq DIM 2 NCOPIES k amp spread lamda DIM 1 NCOPIES n Check the residuals using normal equations res A tx b A x x amp spread lamda DIM 1 NCOPIES n x if norm r
11. Example 4: Laplace Transform Solution
   lin_sol_tri
      Example 1: Solution of Multiple Tridiagonal Systems
      Example 2: Iterative Refinement and Use of Partial Pivoting
      Example 3: Selected Eigenvectors of Tridiagonal Matrices
      Example 4: Tridiagonal Matrix Solving within Diffusion Equations

   lin_sol_gen
   Solves a general system of linear equations Ax = b. Using optional arguments, any of several related computations can be performed. These extra tasks include computing the LU factorization of A using partial pivoting, representing the determinant of A, computing the inverse matrix A^-1, and solving A^T x = b or Ax = b given the LU factorization of A.

   Required Arguments
   A (Input/Output): Array of size n x n containing the matrix.
   b (Input/Output): Array of size n x nb containing the right-hand side matrix.
   x (Output): Array of size n x nb containing the solution matrix.

   Example 1: Solving a Linear System of Equations
   This example solves a linear system of equations. This is the simplest use of lin_sol_gen. The equations are generated using a matrix of random numbers, and a solution is obtained corresponding to a random right-hand sid
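   The tasks named in the lin_sol_gen excerpt (an LU factorization with partial pivoting, followed by a solve for a random right-hand side) can be sketched without the library. The following is a minimal illustration in standard Fortran 90, not the lin_sol_gen implementation; the program name lu_sketch and every variable name are hypothetical.

```fortran
program lu_sketch
  ! Illustrative only: solves A x = b by Gaussian elimination with
  ! partial pivoting. This is NOT the lin_sol_gen source; all names
  ! here are hypothetical.
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 8
  real(dp) :: a(n,n), lu(n,n), b(n), x(n), row(n), t
  integer :: i, k, p, imax(1)

  ! Generate a random matrix and right-hand side.
  call random_number(a)
  call random_number(b)
  lu = a
  x = b

  ! Forward elimination with row interchanges.
  do k = 1, n-1
     ! Pick the entry of largest magnitude in column k as the pivot.
     imax = maxloc(abs(lu(k:n,k)))
     p = imax(1) + k - 1
     if (p /= k) then
        row = lu(k,:)
        lu(k,:) = lu(p,:)
        lu(p,:) = row
        t = x(k)
        x(k) = x(p)
        x(p) = t
     end if
     do i = k+1, n
        ! Store the multiplier in the strict lower triangle.
        lu(i,k) = lu(i,k)/lu(k,k)
        lu(i,k+1:n) = lu(i,k+1:n) - lu(i,k)*lu(k,k+1:n)
        x(i) = x(i) - lu(i,k)*x(k)
     end do
  end do

  ! Back substitution with the upper triangular factor.
  do i = n, 1, -1
     x(i) = (x(i) - dot_product(lu(i,i+1:n), x(i+1:n)))/lu(i,i)
  end do

  ! Check the relative residual against the original matrix.
  if (sum(abs(matmul(a,x) - b))/sum(abs(b)) <= sqrt(epsilon(1d0))) then
     write (*,*) 'LU sketch: residual is small.'
  else
     write (*,*) 'LU sketch: residual is NOT small.'
  end if
end program lu_sketch
```

   The residual test mirrors the style of the guide's own examples, which compare an error measure with sqrt(epsilon(...)) before printing a confirmation line.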
12. a_diag 2 hx 3 a_off hx 6 r_diag 2 hx r_off 1 hx Get integer and floating point option numbers iopt 1 inum call iumag math ichap iget 1 iopt in iopt 1 irnum call iumag math ichap iget 1 iopt inr Set for reverse communication evaluation of the DAE iopt 1 in 26 ival 1 0 Set for use of explicit partial derivatives iopt 2 in 5 ival 2 1 Set for reverse communication evaluation of partials iopt 3 in 29 ival 3 0 Set for reverse communication solution of linear equations iopt 4 in 31 ival 4 0 Storage for the partial derivative array are not allocated or required in the integrator iopt 5 in 34 ival 5 1 Set the sizes of iwk wk for internal checking iopt 6 in 35 190 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 35 41 ival 6 ival 7 integer opti call iumag Reset tolerances atol le 3 sval 1 at iopt 1 in floating poi call sumag Set Set n 11 n ons math ichap iput for integrator rtol le 3 ol sval 2 r 5 nt options math ichap iput 1 iopt sval 6 iopt ival rtol Integrate ODE DA Use dummy external names for g y y and partials DG ido i Integration_l call d2s Find where g y y SPG DJSPG Loop do pg n t tend ido y ypr
13. D_PDE_1D_MG_Q 2 D_PDE_1D_MG_Q 1 Define boundary conditions CASE 7 IF PDE_1D_MG_LEFT THEN D_PDE_1D_MG_BETA ZERO D_PDE_1D_MG_GAMMA D_PDE_1D_MG_DUDX LSE D_PDE_1D_MG_BETA 1 ONE D_PDE_1D_MG_GAMMA 1 ZERO D_PDE_1D_MG_BETA 2 ZERO IF D_PDE_1D_MG_T gt 2D 4 THEN D_PDE_1D_MG_GAMMA 2 12D 1 ELSE D_PDE_1D_MG_GAMMA 2 2D 1 5D3 D_PDE_1D MG T END IF D_PDE_1D_MG_GAMMA 2 D_PDE_1D_MG_GAMMA 2 amp D_PDE_1D_MG_U 2 END IF CASE 8 Factor the banded matrix This is the same solver used internally but that is not required A user can substitute one of their own call dl2crb neq d_pde_ld_mg_a pde_ld_mg_lda pde_ld_mg_iband amp pde_ld_mg_iband d_pde_ld_mg_a pde_ld_mg_lda ipvt rcond work IF rcond lt EPSILON ONE pde_ld_mg_panic_flag 1 CASE 9 Solve using the factored banded matrix call dlfsrb neq d_pde_ld_mg_a pde_ld_mg_lda pde_ld_mg_iband amp pde_ld_mg_iband ipvt d_pde_ld_mg_rhs 1 d_pde_ld_mg_sol END SELECT Reverse communication is used for the problem data CALL PDE_1D_MG TO TOUT IDO U IOPT IOPT D DO gal CONTAINS UNCTION F Z PLICIT NON EAL KIND 1D0 Z F F GAMMA EXP BETA Z END FUNCTION end program A DH A gal Example 6 A Hot Spot Model This example is presented more fully in Verwer et al 1989 The system is a normalized problem relating the temperature u x t of
14. Compu LO i L ENDERNE Comput JSHIF A d JSHIFT END DO Reset TACS LOL TA y B N Newton s method on 0 IPART 2 MP_NPROCS N BND 2 N Y N 0 values after th r amp IND EX N Brown s almost for variable N BN BN BN D 1 1 N 1 D 2 1 N 1 X X N HALF D 2 N X N HUG Cie EPS whic E ONE 1 HAI ILON ONI e the residual function LE B 1 N 1 SUM X X 1 N 1 N 1 G PRODUCT X ank SHOW B amp and PRINT THI EN K max 0 IPART 2 MP_RANK 1 IPART 1 MP_RANK 1 1 AL messages and stopping for FATAL class errors step is taken All variables are positive and bounded below by HALF h has an upper bound of HALF Developing non linear function residual TOR EXIT N ET e the A 1 N 1 ON DO J 1 N 1 IF J lt IPARI Te od gt IPARI T J IPART 1 M TWO ry A N ON th 1 MP_RANK 2 MP_RANK P_RANK 1 1 MAXVAL ABS B 1 N 1 EWTON_M HOD E X IPART 1 MP_RANK 1 I amp lt SORT CHEM CYCLI Fa a oleranc iE IOPT 2 Tes INP ab IF NIRT IOPT CALL IOPT 1 PSILO linear independenc DmOLALON SiC BIS Ons ON EPSILON ON derivatives local to each processor PART 2 MP_RANK 1
15. Read the residuals an OPEN UNIT NIN FILE Read the approximate EAD NIN B B X ww est dat DESC_A d_A Sewa Clanes n DESC UB GLI ed product solution to A x b ISIC IN IEP E INFO root node testida ADES CEB Cl Je d check them for size Xtest dat STATUS OLD solution data CLOSI E NIN STATUS DEL ERROR SUM ABS B SIZ Delete temporary file OPEN UNIT NIN FILE p A S Meese Clee gt SiN Oi GEE CLOSE NIN STATUS DEL ETE OPEN UNIT NIN FILE Btest dat STATUS OLD Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers 243 IMSL Fortran 90 MP Library 4 0 CLOSE NIN STATUS DELETE ti ND IF The processors in use now exit the loop EXLT BLOCK ND DO BLOCK ti See to any error messages call elpop Mp_Setup Deallocate storage arrays and exit from BLACS IF ALLOCATED A DEALLOCATE A IF ALLOCATED B DEALLOCATE B IF ALLOCATED X DEALLOCATE X IF ALLOCATED d_A DEALLOCATE d_A IF ALLOCATED d_B DEALLOCATE d_B IF ALLOCATED IPIV DEALLOCATE IPIV IF ERROR lt SQRT EPSILON ERROR and amp P_RANK 0 THEN Wenn Gt E Example
16.   if (change_new > change_old) &
         exit iterative_refinement
      change_old = change_new
      ! Use option to re-enter code with factorization saved; solve only.
      iopti(2) = s_options(s_lin_sol_self_solve_A, zero)
   end do iterative_refinement
   write (*,*) 'Example 4 for LIN_SOL_SELF is correct.'
   end

   Fatal and Terminal Error Messages
   See the messages.gls file for error messages for lin_sol_self. These error messages are numbered 321-336; 341-356; 361-376; 381-396.

   lin_sol_lsq
   Solves a rectangular system of linear equations Ax ≅ b, in a least-squares sense. Using optional arguments, any of several related computations can be performed. These extra tasks include computing and saving the factorization of A using column and row pivoting, representing the determinant of A, computing the generalized inverse matrix A†, or computing the least-squares solution of Ax ≅ b or A^T y ≅ b, given the factorization of A. An optional argument is provided for computing the following unscaled covariance matrix:
      C = (A^T A)^-1
   Least-squares solutions, where the unknowns are non-negative or have simple bounds, can be computed with PARALLEL_NONNEGATIVE_LSQ and PARALLEL_BOUNDED_LSQ (Chapter 7). These codes can be restricted to execute without MPI.

   Required Arguments
   A (Input/Output): Array of size m x n containing the matrix.
   b (Input/Output): Array of si
17. n end

   Fatal, Terminal, and Warning Error Messages
   See the messages.gls file for error messages for lin_sol_tri. These error messages are numbered 1081-1086; 1101-1106; 1121-1126; 1141-1146.

   Chapter 2: Singular Value and Eigenvalue Decomposition

   Introduction
   This chapter describes routines for computing the singular value decomposition for rectangular matrices, and the eigenvalue-eigenvector decomposition for square matrices.

   Contents
   lin_svd
      Example 1: Computing the SVD
      Example 2: Linear Least Squares with a Quadratic Constraint
      Example 3: Generalized Singular Value Decomposition
      Example 4: Ridge Regression as Cross-Validation with Weighting
   lin_eig_self
      Example 1: Computing Eigenvalues
      Example 2: Eigenvalue-Eigenvector Expansion of a Square Matrix
      Example 3: Computing a few Eigenvectors with Inverse Iteration
      Example 4: Analysis and Reduction of a Generalized Eigensystem
   lin_eig_gen
      Example 1: Computing Eigenvalues
      Example 2: Complex Polynomial Equation Roo
18. Check the results for orthogonality and small residuals TEMP1 NORM spread EYE n 3 nr p xt p TEMP2 NORM A P X Q NORM A if ALL TEMP1 lt sqrt epsilon one and amp ALL TEMP2 lt sqrt epsilon one then if mp_rank 0 amp write Parallel Example 15 is correct Sme iE See to any error messages and exit MPI mp_nprocs mp_setup Final end Parallel Example 16 A compute intensive single task in this case the singular values decomposition of a matrix is computed and partially reconstructed with matrix products This result is sent back to the root node The node of highest priority not the root is used for the computation except when only the root is available use linear_operators use mpi_setup_int implicit none INCLUDE mpif h This is Parallel Example 16 for SVD integer i j IERROR BES integer parameter n 32 real kind le0 parameter half 5e 1 one le0 zero 0e0 Teteyenll heatiqvel ALSO ye cimemen oa Gol in 6 A SG We Wp C integer k STATUS MPI_STATUS_SIZE l SST Tor MIP ILS IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 223 B mp_nprocs EST 1 BLOCK DO ima Ibi skiey WeloKS mp_setup n one for points inside the circle zero on the outside A zero DO al il ia DO j 1
19. Compute the singular value decomposition call lin_svd a Sp u v g matmul transpose u b S_sq s 2 log_lamda log 10 s 1 log_lamda_t log_lamda delta_log_lamda log_lamda log 0 1 s n p 1 Choose lamda to minimize the cross validation weighted square error First evaluate th rror at a grid of points uniform in log_scale cross_validation_error do i l p t s_sq s_sqtexp log_lamda c_lamda i sum w b matmul u 1l m 1 in g 1 n 1 k amp spread t DIM 2 NCOPIES k amp one matmul u 1l m 1 n 2 amp spread t DIM 2 NCOPIES k 2 DIM 1 log_lamda log_lamda delta_log_lamda end do cross_validation_error Compute the grid value and lamda corresponding to the minimum do i l k lamda i exp log_lamda_t delta_log_lamda amp sum minloc c_lamda l p i 1 end do Compute the solution using the optimum cross validation parameter x matmul v g 1 n 1 k spread s DIM 2 NCOPIES k amp spread s_sq DIM 2 NCOPIES k amp spread lamda DIM 1 NCOPIES n Check the residuals using normal equations res matmul transpose a b matmul a x amp spread lamda DIM 1 NCOPIES n x if sum abs res sum s_sq lt amp sqrt epsilon one then write Example 4 for LIN_SVD is correct end if end IMSL Fortran 90 MP Library 4 0 Chapter 2 Singular Value and Eigenvalue Decomposition 55 Fatal Termin
20. PRO PDE_ld_mg_plot FILENAME filename PAUSE pause F Nee Nee if keyword_set FILENAME then file filename else file res dat if keyword_set PAUSE then twait pause else twait 1 Define floating point variables that will be read from the first line of the data file xl ODO xr ODO tO ODO tlast ODO Open the data file and read in the problem parameters openr lun filename get_lun readf lun npde np nt xl xr t0 tlast Define the arrays for the solutions and grid u dblarr nt npde np g dblarr nt np times dblarr nt Define a temporary array for reading in the data tmp dblarr np t_tmp ODO Read in the data for i 0 nt 1 do begin For each step in time readf lun t_tmp times i t_tmp for k 0 npde 1 do begin For each PDE rmf lun tmp u i k tmp Read in the components end rmf lun tmp g i tmp z Read in the grid end Close the data file and free the unit close lun free_lun lun We now have all of the solutions and grids Delete any window that is currently open while d window NE 1 do WDELETE Open two windows for plotting the solutions and grid window 0 xsize 550 ysize 420 window 1 xsize 550 ysize 420 Plot the grid wset 0 plot xl xr t0 tlast nodata ystyle 1 title Grid Points xtitle X ytitle Time for i 0 np 1 do begin oplot g i times psym 1 end Plot the sol
21. in this chapter gives an introduction on how users should write their codes to use other machines on a network IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 141 Contents Ma Aae bra Opera 142 Matix and Utility FUNGUONS oeenn ane centers 144 OptionallDatl Changes a a e e teeter seer 149 Operators Xa Os A aK aa 150 Opero ies eeevce me cae nant serene eerste eee rne Rey 150 Ope 151 Operos S 153 CO E 154 CON E E E 155 10 A e E E A a a earr Pere errerraes 156 D E A eae Cor TS a O 158 E E A aa 158 E E R 160 E pre e A A A A e a E E N reerrere 160 ppor E E seies geavener as nerec el werte reas nae ieee 161 TITS se se pee Se pacar cere eeNc ear Seca ae eee rece conee ceRecarer sede a T 162 TPE TO BOX e cee reese a a A A ewer ine hn seree 163 SEINE TIN ca jand nec Gace OSng Addo DaEe Band EA Bane aad aaa dana Daarnena aaa Baaop ASR MORd OSA AEA 164 N e EE E E E E E E E E E E S nite cicietetetstetetr dies E E E 165 NORM a E E 166 ORTH e e A E EE O E cee ey settee RE E A 167 RANDEN Ae EREE a ti ttc i er UO ENE me a Bb Re E ee E cee 168 PRANK eran te hone ce cer eae sara a A aa 169 SVD ce sacesiiracnetice sag seater E eae ar atmaeran 170 ODN EY Rea pop eee reste aer nar ee eer crane hoerieeeoncures E neers 171 Overloaded etc for Derived Typ S ssirssiinisarinirnnnani sinian 172 peratomEXxam poles series cese cee ceeeesers ees cee seen esn reves tes eee a
22. records the final order and is used to move the matrix columns to that order. This example illustrates the principle of sorting record keys, followed by direct movement of the records to sorted order.

   use sort_real_int
   use rand_gen_int
   implicit none
   ! This is Example 2 for SORT_REAL.
   integer i
   integer, parameter :: n=100
   integer ip(n)
   real(kind(1e0)) a(n,n), x(n), y(n), temp(n*n)
   ! Generate a random array and matrix of values.
   call rand_gen(x)
   call rand_gen(temp)
   a = reshape(temp,(/n,n/))
   ! Initialize permutation to the identity.
   do i=1, n
      ip(i) = i
   end do
   ! Sort using negative values so the final order is non-increasing.
   call sort_real(-x, y, iperm=ip)
   ! Final movement of keys and matrix columns.
   y = x(ip(1:n))
   a = a(:,ip(1:n))
   ! Check the results.
   if (count(y(1:n-1) < y(2:n)) == 0) then
      write (*,*) 'Example 2 for SORT_REAL is correct.'
   end if
   end

   Fatal and Terminal Error Messages
   See the messages.gls file for error messages for sort_real. These error messages are numbered 561-567; 581-587.

   show
   Print rank-1 or rank-2 arrays of numbers in a readable format.

   Required Argument
   x (Input): Rank-1 or rank-2 array containing the numbers to be printed.

   Example 1: Printing an Array
   Arrays of random numbers for all the intrinsic data types are printed. For REAL KIND 1 the default value of E0 ra
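   The principle stated in the SORT_REAL example (sort the keys while recording a permutation, then move the records in one step) can be tried without the library. This is a minimal sketch in standard Fortran 90 in which a simple insertion sort stands in for sort_real; the program name, the small fixed data set, and all variable names are hypothetical.

```fortran
program perm_sort_sketch
  ! Illustrative only: sort record keys while tracking a permutation,
  ! then move the records (matrix columns) directly to sorted order.
  ! An insertion sort stands in for sort_real; names are hypothetical.
  implicit none
  integer, parameter :: n = 6
  real :: x(n), y(n), a(2,n)
  integer :: ip(n), i, j, itmp

  ! Keys, and records whose first row repeats the key.
  x = (/ 0.3, 0.9, 0.1, 0.7, 0.5, 0.2 /)
  a(1,:) = x
  a(2,:) = (/ (real(i), i=1,n) /)

  ! Initialize the permutation to the identity.
  ip = (/ (i, i=1,n) /)

  ! Insertion sort applied to the permutation only, so that
  ! x(ip(1)) >= x(ip(2)) >= ... (non-increasing order).
  do i = 2, n
     itmp = ip(i)
     j = i - 1
     do
        if (j < 1) exit
        if (x(ip(j)) >= x(itmp)) exit
        ip(j+1) = ip(j)
        j = j - 1
     end do
     ip(j+1) = itmp
  end do

  ! Final movement of keys and matrix columns.
  y = x(ip)
  a = a(:,ip)

  ! Check the results: keys are ordered and records moved with them.
  if (count(y(1:n-1) < y(2:n)) == 0 .and. all(a(1,:) == y)) then
     write (*,*) 'Permutation sort sketch is correct.'
  end if
end program perm_sort_sketch
```

   Only the integer permutation is rearranged during the sort; each record is copied once at the end, which is the point of the technique when records are large.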
23. 14 418 EON 7 45 12 003 2 5 amp EOIR 6 012 EPONE WePS Te 0 018 3 0 amp Ta TS ys 54973 2ra OSL 11 967 2 5 amp 10 901 9 015 Dir rg LO ZSL 4 536 1 925 amp 10 602 0 06 1 85 10 453 4 419 1 576 amp 10 304 8 895 ETN 14 055 10 509 1 5 amp 14 194 6 783 bosny 14 331 3 054 1 7 amp 14 469 0 672 Deed ap 14 607 4 398 1 75 amp ESO y E25 OD y 025 5 15s 129 8 067 0 5 amp 16 457 4 134 Ong HG pepe sho O 0Lb98 1 1 amp 17 914 3 735 17 7 spline_data 1 3 reshape data 3 ndata spline_data 4 one Define the knots for the tensor product data fitting problem Use the data limits to the knot sequences kptx 1 ndegree minval spline_data 1 kptx nbkpt ndegree 1 nbkpt maxval spline_data 1l elta bkptx nbkpt bkptx ndegree ngrid 1 T2080 Assign the degr of the polynomial and the knots for x pointer_bkpt gt bkptx knotsx d_spline_knots ndegree pointer_bkpt bkpty 1l ndegree minval spline_data 2 bkpty nbkpt ndegreetl nbkpt maxval spline_data 2 delta bkpty nbkpt bkpty ndegree ngrid 1 b Assign the degr of the polynomial and the knots for y pointer_bkpt gt bkpty knotsy d_spline_knots ndegree pointer_bkpt Fit the data and obtain the coefficients coeff surface_fitting spline_data knotsx knotsy delta bkptx nbkpt bkptx 1 nvalues 1 x okptx 1 i delta i 1 nvalues d
24. 218 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 integer i integer parameter m 128 n 8 nr 1 real kind 1d0 parameter one 1d0 zero 0d0 Tead Ucatinvel ell0 VAG Siaine COs ipic oi over 2 xn amp Vay ik ies GM sw wm elta ox aay sin im ide Setup for MPI Create a priority order list Force the problem to work on the fastest non root machine mp_nprocs mp_setup m MPI_ROOT_WORKS false Generate an array of equally spaced points on the interval 1 1 delta_x 2 real m 1 kind one X one i delta_x i 0 m 1 Get the constant PI 2 from IMSL Numerical Libraries povorki iD CONSE TA ETAZ Compute data values on the grid y 1 1 exp x cos pi_over_2 x Fill in the least squares matrix for the Chebyshev polynomials A 0 1 one BOC lvl ee do i 2 n Peg pat IN SS ESENE alail 1h Alap a 25 dl end do Compute the generalized inverse of the least squares matrix Compute the series coefficients using the generalized invers as smoothing formulas sion gala YAS S aM oka W Evaluate residuals using backward recurrence formulas u zero v zero domain Oe Ww Qos war celal i il v au u Ww end do Compute residuals at the grid Wises Exo a os om ower Bo iy Check that n 2 sign changes in the residual curve occur 26 Oise o lt
25. A set of parallel benchmark programs is shown in Table D. These main programs call Fortran 90 box data type functions in single and double precision. They compare our parallel allocation algorithm to a scalar, sequential method. The main program reads single lines of input:

      NSIZE  NTIMES  NRACKS  PREC  ROOT_WORKS  Description
      QUIT   (Stop)

   Two initial lines of output echo the Description field, whether or not the root is working, and the number of processors in the MPI communicator. The parameters NSIZE, NTRIES, and NRACKS appear in the summary tables. The parameter PREC has values 1, 2, or 3. The choice depends on whether the user wants precision of single, double, or both versions timed. The array functions return a 7 x 2 summary table of values. The (1:6,1) and (1:6,2) elements of this array represent the results and parameters of the benchmark for the parallel and non-parallel versions. The (7,1) and (7,2) elements of this array represent the ratio of the parallel to the scalar times and a first-order approximation to the variation in the ratio.

      Row  Parallel Box Version    Scalar Box Equivalent
      1    Average time            Average time
      2    Standard deviation      Standard deviation
      3    Total seconds           Total seconds
      4    nsize                   nsize
      5    nracks                  nracks
      6    ntries                  ntries
      7    Parallel/Scalar ratio   Variation in ratio

   As an example the progra
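   The layout of the 7 x 2 summary table can be illustrated with a short sketch in standard Fortran 90. The timings below are made up, and the formula used for the "variation in ratio" entry is an assumption (a first-order, delta-method estimate); the excerpt does not state which formula the benchmark programs use. All names are hypothetical.

```fortran
program ratio_sketch
  ! Illustrative only: builds one 7 x 2 summary table of the kind
  ! described in the text from two sets of made-up timings. The
  ! variation formula is an ASSUMED first-order estimate, not
  ! necessarily the one the benchmark programs use.
  implicit none
  integer, parameter :: ntries = 4, nsize = 64, nracks = 8
  real :: tp(ntries), ts(ntries), table(7,2)
  integer :: i

  ! Hypothetical timings in seconds: parallel and scalar versions.
  tp = (/ 1.0, 1.2, 0.9, 1.1 /)
  ts = (/ 2.0, 2.2, 1.9, 2.3 /)

  ! Rows 1 to 6 of each column: results and parameters.
  call fill(tp, table(1:6,1))
  call fill(ts, table(1:6,2))

  ! Row 7: ratio of parallel to scalar average time, and its variation.
  table(7,1) = table(1,1)/table(1,2)
  table(7,2) = table(7,1)*sqrt((table(2,1)/table(1,1))**2 + &
                               (table(2,2)/table(1,2))**2)

  do i = 1, 7
     write (*,'(I3,2F12.4)') i, table(i,:)
  end do

contains

  subroutine fill(t, col)
    ! Average, sample standard deviation, total, then the parameters.
    real, intent(in) :: t(:)
    real, intent(out) :: col(6)
    col(1) = sum(t)/size(t)
    col(2) = sqrt(sum((t - col(1))**2)/(size(t) - 1))
    col(3) = sum(t)
    col(4) = nsize
    col(5) = nracks
    col(6) = size(t)
  end subroutine fill
end program ratio_sketch
```

   A ratio below one in row 7 would indicate that the parallel version took less time on average than the scalar version for that problem size.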
26. C rand C d rand d endif Form the normal equations for the rectangular system OS ae EC ip oa We ee A Compute the solution for Ax b OS IN cabs lo Check the results err norm b A x x norm A norm b if ALL err lt sqrt epsilon one AND MP_RANK 0 amp wue Copa Perelik memole Sy ike CORECT See to any error messages and quit MPI mp_nprocs mp_setup Final end Parallel Example 6 use linear_operators use mpi_setup_int implicit none This is Parallel Example 6 for box data types operators and UN CeO So integer parameter m 64 n 32 nr 4 real kind 1le0 one le0 zero 0e0 rere mae mee ale katate Os ree Chimene oa meri srt gS C mela impel grates real kind le0 dimension n n nr A cov real kind le0 dimension n 1 nr Op X Setup for MPI mp_nprocs mp_setup Generate a random rectangular matrix and right hand side if mp_rank 0 then C rand C d rand d endif Form the normal equations for the rectangular system AOSV ae C hee ee d COVE area CHON A COV COVE CONG Compute the least squares solution x sb el Compare with solution obtained using the inverse matrix ene norm x COV x D norm cov Check the results if ALL err lt sqrt epsilon one and mp_rank 0 amp write Parallel Example 6 is correct IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions T
27. Salen he y7 8 1k Ib ete Se Umeda Glam D Kam ete coan if mp_rank 0 amp write Parallel Example 11 is correct end if See to any error messages and exit MPI mp_nprocs mp_setup Final end IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 219 Parallel Example 12 This illustrates a surface fitting problem using radial basis functions and a box data type It is of interest because this problem fits three component functions of the same form in a space of dimension two The racks of the box represent the separate problems for the three coordinate functions The coefficients are obtained with the ix operator When the least squares fitting process requires more elaborate software it may be necessary to send the data to the nodes compute and send the results back to the root See Parallel Example 18 for more details Any number of nodes can be used use linear_operators use mpi_setup_int implicit none This is Parallel Example 12 for sib 4 NORM IE amme gk Ojasicaicois integer i j nrack integer parameter m 128 n 32 k 2 n_eval 16 nr 3 real kind 1d0 parameter one 1d0 delta_sqr 1d0 Seu Kinner AGm im wwe JoGm mie ein ie joer mie el ik iy iaue Setup for MPI mp_nprocs mp_setup Generate a random set of data and center points in k 2 space if mp_rank 0 then p rand p q rand q Comp
28. V x diag D 1 k if err lt sqrt epsilon one and amp norm res abs d 1 lt sqrt epsilon one then write Example 3 for LIN_EIG_SELF operators is correct end if end Operator_ex28 use linear_operators implicit none This is Example 4 using operators for LIN_EIG_SELF integer parameter n 64 real kind le0 parameter one 1d0 real kind le0 dimension n n A B C D n lambda n amp S n vb_d X res Generate random self adjoint matrices A rand A A A t A B rand B B B t B Add a scalar matrix so B is positive definite B B norm B EYE n Get th igenvalues and eigenvectors for B S EIG B V vb_d For full rank problems convert to an ordinary self adjoint problem All of thes xamples are full rank if S n gt epsilon one then D one sqrt S C diag D x vb_d tx A x vb_d x diag D C Co t 2t 2 Get th igenvalues and eigenvectors for C lambda EIG C v X 198 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 Compute and normalize the generalized eigenvectors X UNIT vb_d x diag D x X res A x X B x X x diag lambda Check the results if norm res norm A norm B lt amp sqrt epsilon one then write Example 4 for LIN_EIG_SELF operators is correct end if
29. ido must be re-allocated. Then re-enter fast_2dft. The next return from fast_2dft has the output value ido = 1. The variables required for the transform and its inverse are saved in w(:). Thereafter, when the routine is entered with ido = 1 and for the same values of m and n, the contents of w(:) will be used for the working variables. The expensive initialization step is avoided. The optional arguments "ido=" and "work_array=" must be used together.

work_array = w(:) (Output/Input)
    Complex array of rank-1 used to store working variables and values between calls to fast_2dft. The value for size(w) must be at least as large as the value -ido for the value of ido < 0.

iopt = iopt(:) (Input/Output)
    Derived type array with the same precision as the input array; used for passing optional data to fast_2dft. The options are as follows:

Packaged Options for fast_2dft

    Option Prefix = ?   Option Name                  Option Value
    c_, z_              fast_2dft_scan_for_NaN       1
    c_, z_              fast_2dft_near_power_of_2    2
    c_, z_              fast_2dft_scale_forward      3
    c_, z_              fast_2dft_scale_inverse      4

iopt(IO) = ?_options(?_fast_2dft_scan_for_NaN, ?_dummy)
    Examines each input array entry to find the first value such that isNaN(x(i,j)) == .true. See the isNaN() function, Chapter 6. Default: Does not scan for NaNs.

iopt(IO) = ?_options(?_fast_2dft_near_power_of_2, ?_dummy)
    Nearest powers of 2 >= m and >=
30. implicit none This is Example 3 for LIN_SOL_TRI IMSL Fortran 90 MP Library 4 0 Chapter 1 Linear Solvers 39 integer i j nopt integer parameter n 128 k n 4 ncoda 1 lda 2 real kind le0 parameter s_one le0 s_zero 0e0 real kind le0 A lda n EVAL k real kind 1le0 d n b n d_t 2 n k c_t 2 n k perf ratio amp b_t 2 n k y_t 2 n k eval_t k res n k temp logical small type s_options iopt 2 s_options 0 s_zero This flag is used to get the k largest eigenvalues small false Generate the main diagonal and the co diagonal of the tridiagonal matrix call rand_gen b call rand_gen d A 1 1 b A 2 1 d Use Numerical Libraries routine for the calculation of k largest eigenvalues CALL EVASB N K A LDA NCODA SMALL EVAL EVAL_T EVAL Use DNFL tridiagonal solver for inverse iteration calculation of eigenvectors factorization_choice do nopt 0 1 Create k tridiagonal problems one for each inverse iteration system Bot tls hy LEK spread b DIM 2 NCOPIES k c_t 1 n 1 k EOSHIFT b_t 1 n 1 k SHIFT 1 DIM 1 d_t l in 1 k spread d DIM 2 NCOPIES k amp spread EVAL_T DIM 1 NCOPIES n Start the right hand side at random values scaled downward to account for the expected blowup in the solution do i l k call rand_gen y_t 1l n i end do Do two iterations for the ei
31. 6
    s_, d_   PDE_1D_MG_RELATIVE_TOLERANCE       7
    s_, d_   PDE_1D_MG_ABSOLUTE_TOLERANCE       8
    s_, d_   PDE_1D_MG_MAX_BDF_ORDER            9
    s_, d_   PDE_1D_MG_REV_COMM_FACTOR_SOLVE    10
    s_, d_   PDE_1D_MG_NO_NULLIFY_STACK         11

IOPT(IO) = PDE_1D_MG_CART_COORDINATES
    Use the value m = 0 in Equation 2. This is the default.

IOPT(IO) = PDE_1D_MG_CYL_COORDINATES
    Use the value m = 1 in Equation 2. The default value is m = 0.

IOPT(IO) = PDE_1D_MG_SPH_COORDINATES
    Use the value m = 2 in Equation 2. The default value is m = 0.

IOPT(IO) = ?_OPTIONS(PDE_1D_MG_TIME_SMOOTHING, TAU)
    This option resets the value of the parameter τ >= 0, described above. The default value is τ = 0.

IOPT(IO) = ?_OPTIONS(PDE_1D_MG_SPATIAL_SMOOTHING, KAP)
    This option resets the value of the parameter κ >= 0, described above. The default value is κ = 2.

IOPT(IO) = ?_OPTIONS(PDE_1D_MG_MONITOR_SMOOTHING, ALPH)
    This option resets the value of the parameter α >= 0, described above. The default value is α = 0.01.

IOPT(IO) = ?_OPTIONS(PDE_1D_MG_RELATIVE_TOLERANCE, RTOL)
    This option resets the value of the relative accuracy parameter used in DASPG. The default value is RTOL = 1E-2 for single precision and RTOL = 1D-4 for double precision.

IOPT(IO) = ?_OPTIONS(PDE_1D_MG_ABSOLUTE_TOLERANCE, ATOL)
    This option resets the value of the absolute accuracy parameter used in
32. DIM 1 NCOPIES n If the factorization method is Cyclic Reduction and perf_ratio is larger than one method is already Gaussian and perf_ratio is checked a perf_ratio Eliminatio t the end epsilon s_one if perf_ratio lt s_one iopt nopt 1 re solve using Gaussian n sum abs res 1 n 1 k sum abs EVAL_T 1 k Elimination If the he loop exits amp amp 5 n exit factorization_choice s_options s_lin_sol_tri_use_Gauss_elim s_zero end do factorization_choice if perf_ratio lt s_one then write end if end Example 3 for LIN_SOL_TRI is correct Example 4 Tridiagonal Matrix Solving within Diffusion Equations The normalized partial differential equation u t u du El ot dx XX is solved for values of O lt x lt 7 and t gt 0 A boundary value problem consists of choosing the value such that the equation IMSL Fortran 90 MP Library 4 0 u 0 t uo u x t u Chapter 1 Linear Solvers 41 is satisfied Arbitrary values and are used for illustration of the solution process The one parameter equation u x t uy 0 The variables are changed to v x t u x t uo that v 0 t 0 The function v x t satisfies the differential equation The one parameter equation solved is therefore v x1 t Ug 0 To solve this equation for uo use the standard technique of the varia
33. EIG A Use Horner s method for evaluation of the complex polynomial and size gauge at all roots f one fg one do i l n f f E b i fg fg abs E abs b i end do Check for small errors at all roots if norm f fg lt sqrt epsilon one then write Example 2 for LIN_EIG_GEN operators is correct end if end Operator_ex31 use linear_operators implicit none This is Example 3 using operators for LIN_EIG_GEN integer parameter n 32 k 2 real kind le0 parameter one le0 zero 0e0 real kind le0 a n n b n k x n k h complex kind le0 dimension n n W T e n z n k type s_options iopti 2 A rand A b rand b iopti 1l s_lin_eig_gen_out_tri_form iopti 2 s_lin_eig_gen_no_balance 200 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 Compute the Schur decomposition of the matrix call lin_eig_gen a vVEw amp tri t iopt iopti Choose a value so that Ath I is non singular h one Solve for Ath I x b using the Schur decomposition z W hx b Solve intermediate upper triangular system with implicit additive diagonal h I This is the only dependence on h in the solution process z T h EYE n ix Z Compute the solution It should be the same as x but will not be exact due to rounding erro
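The Horner evaluation with a size gauge, used above to check the computed roots, can be sketched in standard Fortran 90. The cubic below and its exact roots are assumed test data that stand in for the eigenvalues of the companion matrix; the names follow the fragment (b, e, f, fg). The value f holds p(z) at each root, and fg accumulates the same recurrence on absolute values, so abs(f)/fg measures the error relative to the size of the terms.

```fortran
program horner_sketch
  implicit none
  integer, parameter :: n = 3
  integer :: i
  complex(kind(1d0)) :: b(n), e(n), f(n)
  real(kind(1d0)) :: fg(n)

  ! Assumed data: the monic cubic
  !   p(z) = z**3 - 6 z**2 + 11 z - 6 = (z-1)(z-2)(z-3),
  ! with coefficients b and exact roots e.
  b = (/ (-6d0, 0d0), (11d0, 0d0), (-6d0, 0d0) /)
  e = (/ (1d0, 0d0), (2d0, 0d0), (3d0, 0d0) /)

  ! Horner's method at all roots at once (array operations),
  ! together with the size gauge fg.
  f = 1
  fg = 1
  do i = 1, n
     f = f*e + b(i)
     fg = fg*abs(e) + abs(b(i))
  end do

  ! Check for small relative errors at all roots.
  if (maxval(abs(f)/fg) <= sqrt(epsilon(1d0))) then
     write (*,*) 'Horner sketch is correct.'
  end if
end program horner_sketch
```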
34. ETA NPD E DUDX NPDE U DUDX NPD EFT Example 1 GAMMA NPD p Chapter 8 Partial Differential Equations 279 IF LEFT THEN BETA 1 1D0 BETA 2 0D0 GAMMA 1 0D0 GAMMA 2 U 2 ELSE BETA 1 0D0 BETA 2 1D0 GAMMA 1 U 1 1D0 GAMMA 2 0D0 END IF END SUBROUTINE Example 2 Inviscid Flow on a Plate This example is a first order system from Pennington and Berzins 1994 The equations are u v uu VUu Wy w u implying that uu vu uy u 0 t v 0 t 0 u co t u xp t 1 t 20 u x 0 1 v x 0 0 x 20 Following elimination of W there remain NPDE 2 differential equations The variable is not time but a second space variable The integration goes from t 0 to t 5 It is necessary to truncate the variable Xat a finite value say max R 25 In terms of the integrator the system is defined by letting m 0 and 1 0 y 0 clena of Le f2 on The boundary conditions are satisfied by 2 E exp een v p 0 4 sma x We use N 10 51 61 grid points and output the solution at steps of At 0 1 Rationale This is a non linear boundary layer problem with sharply changing conditions near t O The problem statement was modified so that boundary conditions are continuous near 0 Without this change the underlying integration software DASPG cannot solve th
35. Non parallel parallel averages and variation 2 5444E 00 Double precision benchmark of parallel i Date of benchmark Y Mo D H M S 1996 12 23 3 9129E 01 Root not working Number of Processors 4 1 6985D 00 9 8576D 01 8 4923D 00 5 0000D 01 5 0000D 00 5 0000D 00 Nn nn BW NN 4 0372D 00 2 3836D 02 2 0186D 01 5 0000D 01 5 0000D 00 5 0000D 00 Non parallel parallel averages and variation 2 3770D 00 1 2392D 01 and non parallel i 10 16 48 10 16 18 Average St Dev Total Seconds Size Racks per box Repeats Average St Dev Total Seconds Size Racks per box Repeats Table C Performance Summary Box operator i IMSL Fortran 90 MP Library 4 0 Appendix D Benchmarking or Timing Programs e D 7 Below is a list of the performance evaluation programs that time the box data computations using parallel and non parallel resources Number Program Units Function Timed 1 time_parallel_i f90 s_parallel_i_bench f90 d_parallel_i_bench f90 2 time_parallel_ix f90 s_parallel_ix_bench f90 d_parallel_ix_bench 90 3 time_parallel_xi f90 s_parallel_xi_bench f90 d_parallel_xi_bench 90 4 time_parallel_x f90 s_parallel_x_bench f90 d_parallel_x_bench f90 5 time_parallel_tx f90 L s_parallel_tx_bench f90 d_parallel_tx_bench f90 6 time_parallel_xt f9 L s_parallel_xt_bench f90 d_parallel_xt_bench f90 7 time_parallel_hx f90 s_
36. UNIT x Check the results b ATEMP x x err dot_product x 1 n 1 b 1 n 1 e k If any result is not accurate quit with no printing if abs err lt sqrt epsilon one E 1 then write Example 3 for LIN_SOL_SELF operators is correct end if end Operator_ex08 use linear_operators implicit none This is Example 4 for LIN_SOL_SELF using operators and functions integer parameter m 8 n 4 real kind 1e0 one 1e0 zero 0e0 real kind 1d0 d_zero 0d0 integer ipivots n m 1 real kind le0 A m n b m 1 F n m ntm amp g n m 1 h n m 1 real kind 1e0 change_new change_old real kind 1d0 c m 1 D m n y ntm 1 type s_options iopti 2 Generate a random matrix and right hand side A rand A b rand b Save double precision copies of the matrix and right hand side D A c b Fill in augmented matrix for accurately solving the least squares problem using iterative refinement F zero F 1l m 1 m EYE m 178 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 F 1 m m 1 A F m 1 1 m t A Start solution at zero y d_zero change_old huge one Use packaged option to save the factorization iopti 1l s_lin_sol_self_save_factors iopti 2 0 iterative_refinement do g l m 1 c 1l m 1 y l im 1 D x y mt l mtn 1 g m tl mtn
37. a D 0 O P F U NPDI BE ex08 out UNIT t determine limits E 1 N XMAX 7 RA END D WRI 4D1 U NPDE 1 1 U NP SIGSQ SIGMA 2 E 1 to its maximum allowed value IOPT 1 PDE ELTA _T D 4 5 NPD ELTA_T N NFRAME Gy DE 1 N 1D_MG_MAX_BDF_ORD IOPT 2 5 294 e Chapter 8 Partial Differential Equations 0 TEND Illustrate allowing the BDF order to increase ER IMSL Fortran 90 MP Library 4 0 IOPT 3 D_OPTIONS PDE_1D_MG_TIME_SMOOTHING 5D 3 IOPT 4 D_OPTIONS PDE_1D_MG_RELATIVE_TOLERANCE ZERO IOPT 5 D_OPTIONS PDE_1D_MG_ABSOLUTE_TOLERANCE 1D 2 Update to the next output point Write solution and check for final point CASE 2 TO TOUT IF TO lt TEND THEN WRITE 7 F10 5 TOUT DO I 1 NPDE 1 WRITE 7 4E15 5 U I END DO TOUT MIN TOUT DELTA_T TEND IF TEND IDO 3 END IF All completed Solver is shut down CASE 3 CLOSE UNIT 7 EXIT Define initial data values CASE 5 U 1 MAX U NPDE 1 E ZERO Vanilla European Call U 1 U NPDE 1 Asset or nothing Call WHERE U 1 lt E U 1 ZERO on
38. array Chapter 5 Utilities 137 Rank 1 options show_significant_digits_is_7 I options 2 show_starting_index_is 3 options 1 The starting value call show s_x amp end REAL with 7 digits natural indexing IOPT options Optional Arguments text CHARACTER Input CHARACTER LEN string used for labeling the array image buffer Output CHARACTER LEN string used for an internal write buffer With this argument present the output is converted to characters and packed The lines are separated by an end of line sequence The length of buffer is estimated by the line width in effect time the number of lines for the array iopt iopt Input Derived type array with the same precision as the input array used for passing optional data to the routine Use the REAL KIND 1E0 precision for output of INTEGER arrays The options are as follows Packaged Options for show Prefix is blank Option Name Option Value show_significant_digits_is_4 show_significant_digits_is_7 show_significant_digits_is_16 show_line_width_is_44 show_line_width_is_72 show_end_of_line_sequence_is show_starting_index_is Oloolu aD Oo B WIN Fr show_starting_row_index_is m O show_starting_col_index_is show line_width_is_128 iopt IO
39. derivative 2 point bkpt i ndegree type gt value zero end do Make the slope zero and value non negative at right constraints nbkptin 1l spline_constraints amp derivative 1 point bkpt nord type value zero constraints nbkptin 2 spline_constraints amp derivative 0 point bkpt nbkptintndegree type gt value zero coeff spline_fitting data spline_data knots break_points amp constraints constraints covariance sigma_squared Compute value first two derivatives and the variance values spline_values 0 xdata break_points coeff root_variance spline_values 0 xdata break_points coeff amp covariance sigma_squared derivatl spline_values 1 xdata break_points coeff derivat2 spline_values 2 xdata break_points coeff call show reshape xdata derivat2 root_variance ndata 3 amp The x values 2 nd derivatives and square root of variance See that differences are relatively small and the curve has the right shape and signs diff norm values ydata norm ydata if all values gt zero and all derivatl lt epsilon zero amp and diff lt tol then write Example 2 for SPLINE_FITTING is correct end if end IMSL Fortran 90 MP Library 4 0 Chapter 4 Curve and Surface Fitting with Splines 105 Example 3 Splines Model a Random Number Generator The function g x exp x 2 l lt x lt
40. i 0 ndata E x x i 0 and ndata IMSL Fortran 90 MP Library 4 0 Chapter 4 Curve and Surface Fitting with Splines 101 Our program checks the term const appearing in the maximum truncation error term error const xAx at a finer grid USE spline_fitting_int USE show_int USE norm_int implicit none This is Example 1 for SPLINE_FITTING Natural Spline Interpolation using cubic splines Use the function exp x 2 2 to generate samples integer i integer parameter ndata 24 nord 4 ndegree nord l amp nbkpt ndata 2 ndegree ncoeff nbkpt nord nvalues 2 ndata real kind le0 parameter zero 0e0 one le0 half 5e 1 real kind le0 delta_x 0 15 delta_xv 0 4 delta_x parameter real kind le0 target spline_data 3 ndata bkpt nb ycheck nvalues coeff ncoeff xvalues nvalues real kind le0 pointer pointer_b xdata ndata yvalues nvalues ydata ndata amp kpt amp amp diff kpt type s_spline_knots break_points type s_spline_constraints constraints 2 xdata ee delta_x i 1 ndata ydata exp hale xdatar 2 xvalues 0 03 i 1 delta_xv i 1 nvalues ycheck Sa Aue har al spline_data 1 xdata spline aata 2 i ydata spline_data 3 one Define the knots for the interpolation problem bkpt l ndegree i delta_x i ndegree 1 bkpt nord nbkpt ndegree xdata bkpt nb
41. one 1 0e0 zero 0 0e0 IMSL Fortran 90 MP Library 4 0 Chapter 5 Utilities e 129 integer irndi n i_out 3 pt 2 hidden_message n real kind le0 x n y n type s_options iopti 2 s_options 0 zero character 34 message returned_messag This is the message to be hidden message SAVE YOURSELF WE ARE DISCOVERED Start the generator with a known seed iopti 1 s_options s_rand_gen_generator_seed zero iopti 2 s_options 123 zero call rand_gen x iopt iopti Save the state of the generator call rand_gen x istate_out i_out Get random integers call rand_gen y irnd irndi Hide text using collating sequence subtracted from integers do i l n hidden_message i irndi i ichar message i i end do Reset generator to previous state and generate the previous random integers call rand_gen x irnd irndi istate_in i_out Subtract hidden text from integers and convert to character do i l n returned_message i i char irndi i hidden_message i end do Check the results if returned_message message then write Example 2 for RAND_GEN is correct end if end Example 3 Generating Strategy with a Histogram We generate random integers but with the frequency as in a histogram with npins slots The generator is initially used a large number of times to demonstrate that it is making choices with
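The hide-and-recover idea in the fragment above is to subtract character codes from random integers, then regenerate the same integers from the same generator state and subtract again. It can be sketched in standard Fortran 90. This sketch does not use rand_gen: the small multiplicative congruential generator, the routine name fill, and the seed value are stand-ins chosen so the example is reproducible with any compiler.

```fortran
program hide_sketch
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: n = 34
  character(len=n) :: message, returned_message
  integer :: i, irndi(n), hidden_message(n)

  message = 'SAVE YOURSELF, WE ARE DISCOVERED!'

  ! Random integers from a known seed.
  call fill(irndi, 123)
  ! Hide the text: collating sequence subtracted from the integers.
  do i = 1, n
     hidden_message(i) = irndi(i) - ichar(message(i:i))
  end do

  ! Restart from the same seed to regenerate the same integers,
  ! then subtract the hidden text and convert back to characters.
  call fill(irndi, 123)
  do i = 1, n
     returned_message(i:i) = char(irndi(i) - hidden_message(i))
  end do

  if (returned_message == message) then
     write (*,*) 'Message-hiding sketch is correct.'
  end if

contains

  ! Hypothetical stand-in generator (multiplicative congruential,
  ! 64-bit arithmetic to avoid overflow); not the library's rand_gen.
  subroutine fill(r, seed)
    integer, intent(out) :: r(:)
    integer, intent(in) :: seed
    integer(i8) :: s
    integer :: k
    s = seed
    do k = 1, size(r)
       s = mod(16807_i8*s, 2147483647_i8)
       r(k) = int(mod(s, 1000_i8))
    end do
  end subroutine fill
end program hide_sketch
```

Recovery depends only on reproducing the same integer sequence, which is why the library example saves and restores the generator state.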
42. spread lambda 1 n Check the results if sum abs res sum abs A sum abs B lt amp sqrt epsilon one then write Example 4 for LIN_EIG_SELF is correct end if end if end

Fatal, Terminal, and Warning Error Messages

See the messages.gls file for error messages for lin_eig_self. These error messages are numbered 81-90; 101-110; 121-129; 141-149.

lin_eig_gen

Computes the eigenvalues of an n × n matrix, A. Optionally, the eigenvectors of A or A^T are computed. Using the eigenvectors of A gives the decomposition AV = VE, where V is an n × n complex matrix of eigenvectors, and E is the complex diagonal matrix of eigenvalues. Other options include the reduction of A to upper triangular or Schur form, reduction to block upper triangular form with 2 × 2 or unit sized diagonal block matrices, and reduction to upper Hessenberg form.

Required Arguments

A (Input [/Output])
    Array of size n × n containing the matrix.

E (Output)
    Array of size n containing the eigenvalues. These complex values are in order of decreasing absolute value. The signs of imaginary parts of the eigenvalues are in no predictable order.

Example 1: Computing Eigenvalues

The eigenvalues of a random real matrix are computed. These values define a complex diagonal matrix E. Their correctness is checked by obtaining the
43. x 1 where b t 1 2 for t gt 2x 10 and 0 2 5x10 r for 0 lt 1 lt 2x10 Rationale This is a non linear problem The example shows the model steps for replacing the banded solver in the software with one of the user s choice Reverse communication is used for the interface to the problem data and the linear solver Following the computation of the matrix factorization in DL2CRB we declare the system to be singular when the reciprocal of the condition number is smaller than the working precision This choice is not suitable for all problems Attention must be given to detecting a singularity when this option is used program PDE_1D_MG_EX05 Flame propagation model USE pde_ld_mg_int USE ERROR_OPTION_PACKET USE Numerical_Libraries ONLY amp dl2crb dlfsrb IMPLICIT NONE IMSL Fortran 90 MP Library 4 0 Chapter 8 Partial Differential Equations 287 INTEGER PARAMETER NPDE 2 N 40 NEQ NPDE 1 N INTEGER I IDO NFRAMES IPVT NEQ Define array space for the solution real kind 1d0 U NPDE 1 N TO TOUT Define work space for the banded solver real kind 1d0 WORK NEQ RCOND real kind 1d0 ZERO 0D0 ONE 1D0 DELTA_T 1D 4 amp TEND 6D 3 XMAX 1D0 BETA 4D0 GAMMA 3 52D6 TYPE D_OPTIONS IOPT 1 Start loop to integrate and record solution values IDO 1 DO SE H ECT CASE IDO
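The reverse-communication pattern named above has the solver return to the caller with a flag whenever it needs problem data, and the caller re-enter it inside a DO / SELECT CASE loop. It can be sketched with a toy routine in standard Fortran 90. Everything here is hypothetical: rc_bisect, its ido codes, and the test function are illustrative only and are not the PDE_1D_MG interface.

```fortran
module rc_bisect_mod
  implicit none
  integer, parameter :: dp = kind(1d0)
  real(dp), save, private :: lo, hi, flo
  logical, save, private :: have_flo
contains
  ! Hypothetical reverse-communication bisection.
  !   ido = 1 on first entry; [a, b] must bracket a root.
  !   ido = 2 on return: the caller sets fx = f(x) and re-enters.
  !   ido = 3 on return: x holds the approximate root.
  subroutine rc_bisect(ido, a, b, x, fx, tol)
    integer, intent(inout) :: ido
    real(dp), intent(in) :: a, b, fx, tol
    real(dp), intent(inout) :: x
    select case (ido)
    case (1)
       lo = a;  hi = b;  have_flo = .false.
       x = lo;  ido = 2              ! request f at the left end
    case (2)
       if (.not. have_flo) then
          flo = fx;  have_flo = .true.
       else if (flo*fx <= 0.0_dp) then
          hi = x                     ! sign change lies in [lo, x]
       else
          lo = x;  flo = fx          ! sign change lies in [x, hi]
       end if
       x = 0.5_dp*(lo + hi)
       if (hi - lo <= tol) ido = 3   ! converged: x is the midpoint
    end select
  end subroutine rc_bisect
end module rc_bisect_mod

program rc_sketch
  use rc_bisect_mod
  implicit none
  integer :: ido
  real(dp) :: x, fx

  ido = 1;  x = 0;  fx = 0
  do
     call rc_bisect(ido, 1.0_dp, 2.0_dp, x, fx, 1.0e-10_dp)
     select case (ido)
     case (2)                        ! the solver asks for problem data
        fx = x*x - 2.0_dp
     case (3)                        ! all completed
        exit
     end select
  end do

  if (abs(x - sqrt(2.0_dp)) <= 1.0e-9_dp) then
     write (*,*) 'Reverse-communication sketch is correct.'
  end if
end program rc_sketch
```

Because the caller owns the loop, it can replace any requested step with code of its own choice, which is how the flame-propagation example substitutes a user-chosen banded solver.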
44. [illegible] Compute the matrix expression D = B - A^H C:
    D = B - (A .hx. C)
    D = B - (.h. A .x. C)

Operator: .i.

Compute the inverse matrix, for square non-singular matrices, or the Moore-Penrose generalized inverse matrix for singular square matrices or rectangular matrices. The operation may be read inverse or generalized inverse, and the results are in a precision and data type that matches the operand. The operator can be applied to any rank-2 or rank-3 array.

Required Operand

This operator requires a single operand. Since this is a unary operation, it has higher Fortran 90 precedence than any other intrinsic array operation.

Modules

Use the appropriate one of the modules: operation_i or linear_operators.

Optional Variables, Reserved Names

This operator uses the routines lin_sol_gen or lin_sol_lsq (see Chapter 1, "Linear Solvers", lin_sol_gen and lin_sol_lsq). The option and derived type names are given in the following tables:

    Option Names for .i.
    use_lin_sol_gen_only
    use_lin_sol_lsq_only
    i_options_for_lin_sol_gen
    i_options_for_lin_sol_lsq
    skip_error_processing

    Derived Type    Name
    s_options       s_inv_options
    s_options       s_inv_options_once
    d_options       d_inv_options
    d_options

Examples

Compute
45. Also see operator_ex08 Chapter 6 use lin_sol_self_int use rand_gen_int implicit none This is Example 4 for LIN_SOL_SELF integer i integer parameter m 8 n 4 real kind le0 parameter one 1 0e0 zero 0 0e0 real kind 1d0 parameter d_zero 0 0d0 integer ipivots n m 1 real kind le0 a m n b m 1 w m n nt m ntm amp g nt m 1 ey 1 r real kind 1e0 change_new change_old real kind 1d0 c m 1 d m n y n m 1 type s_options Llopti 2 s_options 0 zero Generate a random matrix call rand_gen w a reshape w m n Generate a random right hand side call rand_gen b 1 m 1 Save double precision copies of the matrix and right hand side 16 Chapter 1 Linear Solvers IMSL Fortran 90 MP Library 4 0 d a c b Fill in augmented system for accurately solving the least squares problem f zero do i l m f i i one end do f l m m l a f m 1 1 m transpose a Start solution at zero y d_zero change_old huge one Use packaged option to save the factorization iopti 1 s_options s_lin_sol_self_save_factors zero iterative_refinement do g 1l m 1 c 1l m 1 y l im 1 matmul d y mt l m tn 1 g m l mtn 1 matmul transpose d y 1 m 1 call lin_sol_self f g h amp pivots ipivots iopt iopti Ya Rech yy change_new sum abs h Exit when changes are no longer decreasing
46. CALL elpop Mp_Setup Check the differences in the two solutions Unique solutions l may differ in the last bits due to rounding IF MP_RANK 0 THE ERROR SUM ABS X Y SUM Y IF ERROR lt sqrt EPSILON ERROR write amp Example 2 for PARALLEL NONNEGATIVE_LSQ is correct OPEN UNIT NIN FILE Atest dat STATUS OLD CLOSE NIN STATUS Delete Exit from using this process grid CALL BLACS_GRIDEXIT CONTXT CALL BLACS_EXIT 0 END

PARALLEL_BOUNDED_LSQ

Solve a linear least-squares system with bounds on the unknowns.

Usage Notes

    CALL PARALLEL_BOUNDED_LSQ &
      (A, B, BND, X, RNORM, W, INDEX, IPART, &
       NSETP, NSETZ, IOPT=IOPT)

Required Arguments

A(1:M,:) (Input/Output)
    Columns of the matrix with limits given by entries in the array IPART(1:2,1:max(1,MP_NPROCS)). On output A is replaced by the product QA, where Q is an orthogonal matrix. The value SIZE(A,1) defines the value of M. Each processor starts and exits with its piece of the partitioned matrix.

B(1:M) (Input/Output)
    Assumed-size array of length M containing the right-hand side vector, b. On output b is replaced by the product Q(b - Ag), where Q is the orthogonal matrix applied to A and g is a set of
47. Define differential equations ERO ERO CASE 6 D_PDE_1D_MG_C ZERO D_PDE_1D_MG_C 1 1 ONE D_PDE_1D_MG_C 2 2 ONE D_PDE_1D_MG_R D_PDE_1D_MG_U D_PDE_1D_MG_R 1 D_PDE_1D_MG_R 1 D_PDE_1D_MG_Q 1 100D0 D_PDE_1D_MG_U 1 D_PDE_1D_MG_U 2 D_PDE_1D_MG_0Q 2 D_PDE_1D_MG_Q 1 Define boundary conditions CASE 7 D_PDE_1D_MG_BETA ZERO D_PDE_1D_MG_GAMMA D_PDE END SELECT Reverse communication is used for the problem data CALL PDE_1D_MG TO TOUT IDO U IOPT IOPT END DO CONTAINS FUNCTION PULSE Z real kind 1d0 PI ACOS ON PULSE HALF END FUNCTION end program Z PULSE SIZE Z ONE COS 10D0 PI Z Fl Example 8 Black Scholes The value of a European call option 5 1 expiration date T satisfies the asset or nothing payoff c s T s s2 e 0 s lt e Scholes differential equation __1D_MG_U with exercise price and Prior to expiration 5 1 is estimated by the Black IMSL Fortran 90 MP Library 4 0 Chapter 8 Partial Differential Equations 293 2 2 oO oO 2 C Egs Cys FrSC 1C C FEF 2 s g r 0 sc rc 0 The parameters in the model are the risk free interest rate F and the stock volatility o The boundary conditions are c 0 r and cs st alsa This development is described in Wilmott et al 1995 pages 41 57
48. [illegible table rows] lin_sol_lsq_no_sing_mess

iopt(IO) = ?_options(?_lin_sol_lsq_set_small, Small)
    Replaces with Small if a diagonal term of the matrix R is smaller in magnitude than the value Small. A solution is approximated based on this replacement in either case. Default: the smallest number that can be reciprocated safely.

iopt(IO) = ?_options(?_lin_sol_lsq_save_QR, ?_dummy)
    Saves the factorization of A. Requires the optional arguments "pivots=" and "trans=" if the routine is used for solving further systems with the same matrix. This is the only case where the input arrays A and b are not saved. For efficiency, the diagonal reciprocals of the matrix R are saved in the diagonal entries of A.

iopt(IO) = ?_options(?_lin_sol_lsq_solve_A, ?_dummy)
    Uses the factorization of A computed and saved to solve Ax = b.

iopt(IO) = ?_options(?_lin_sol_lsq_solve_ADJ, ?_dummy)
    Uses the factorization of A computed and saved to solve A^T x = b.

iopt(IO) = ?_options(?_lin_sol_lsq_no_row_pivoting, ?_dummy)
    Does no row pivoting. The array pivots(:), if present, satisfies pivots(i) = i for i = 1, ..., min(m, n).

iopt(IO) = ?_options(?_lin_sol_lsq_no_col_pivoting, ?_dummy)
    Does no column pivoting. The array pivots(:), if present, satisfies pivots(i + min(m, n)) = i for i = 1, ..., min(m, n).

iopt(IO) = ?_options(?_lin_sol_lsq_scan_for_NaN, ?_dummy)
    Examines each input array entry to
49. Set Ax y The vector x generates y Note the use of EOSHIFT and array operations to compute the matrix product n distinct ones as one array operation y lin lin d lin lin x lin l n Fog c 1l1 n 1 n EOSHIFT x 1 n 1 n SHIFT 1 DIM 1 amp b 1 n 1 n EOSHIFT x 1 n 1 n SHIFT 1 DIM 1 Compute the solution returned in y The input values of c d b and y are overwritten by lin_sol_tri Check for any error messages call lin_sol_tri c d b y Check the size of the residuals y x They should be small relative to the size of values in x res x l in l in y lin 1 n err sum abs res sum abs x l1 n 1 n if err lt sqrt epsilon one then write Example 1 for LIN_SOL_TRI is correct end if end Optional Arguments NCOLS n Input Uses arrays C 1 n 1 1 k D 1 n 1 k and B 2 n 1 k as the upper main and lower diagonals for the input tridiagonal matrices The right hand sides and solutions are in array y 1 n 1 k Note that each of IMSL Fortran 90 MP Library 4 0 Chapter 1 Linear Solvers 35 these arrays are rank 2 Default n size D 1 2 NPROB k Input The number of systems solved Default k size D 2 iopt iopt Input Derived type array with the same precision as the input matrix Used for passing optional data to the routine The options are as follows Packaged Options for 1in_sol_tri lin_sol_tri_set_small
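The EOSHIFT construction above forms the tridiagonal product y = Ax as a single array expression. It can be sketched in standard Fortran 90 for one system. The data values, the program name, and the hand-written elimination (the Thomas algorithm, used here in place of lin_sol_tri) are assumptions for illustration. The layout follows the text: c is the upper diagonal, d the main diagonal, and b the lower diagonal.

```fortran
program tri_sketch
  implicit none
  integer, parameter :: dp = kind(1d0), n = 6
  integer :: i
  real(dp) :: c(n), d(n), b(n), x(n), y(n), w(n), g(n), r, err

  ! Assumed data: a diagonally dominant tridiagonal matrix.
  c = 1.0_dp;  d = 4.0_dp;  b = 1.0_dp
  x = (/ (real(i, dp), i = 1, n) /)

  ! y = A x in one array operation.  EOSHIFT fills the vacated end
  ! with zero, so c(n) and b(1) never contribute.
  y = d*x + c*eoshift(x, shift=1) + b*eoshift(x, shift=-1)

  ! Solve A g = y by Gaussian elimination without pivoting
  ! (safe here because the matrix is diagonally dominant).
  w = d;  g = y
  do i = 2, n
     r = b(i)/w(i-1)
     w(i) = w(i) - r*c(i-1)
     g(i) = g(i) - r*g(i-1)
  end do
  g(n) = g(n)/w(n)
  do i = n - 1, 1, -1
     g(i) = (g(i) - c(i)*g(i+1))/w(i)
  end do

  ! The residuals g - x should be small relative to the values in x.
  err = sum(abs(x - g))/sum(abs(x))
  if (err <= sqrt(epsilon(1.0_dp))) then
     write (*,*) 'Tridiagonal sketch is correct.'
  end if
end program tri_sketch
```

In the library example the same expression is applied to rank-2 arrays with DIM=1, so that n independent products are formed at once.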
50. TOLERANCE)
    Replaces the default rank tolerance for using a column, from EPSILON(TOLERANCE) to TOLERANCE. Increasing the value of TOLERANCE will cause fewer columns to be increased from their constraints, and may cause the minimum residual RNORM to increase.

IOPT(IO) = ?_OPTIONS(PBLSQ_SET_MIN_RESIDUAL, RESID)
    Replaces the default target for the minimum residual vector length from 0 to RESID. Increasing the value of RESID can result in fewer iterations and thus increased efficiency. The descent in the optimization will stop at the first point where the minimum residual RNORM is smaller than RESID. Using this option may result in the dual vector not satisfying its optimality conditions, as noted above.

IOPT(IO) = PBLSQ_SET_MAX_ITERATIONS
IOPT(IO+1) = NEW_MAX_ITERATIONS
    Replaces the default maximum number of iterations from 3*N to NEW_MAX_ITERATIONS. Note that this option requires two entries in the derived type array.

Algorithm

Subroutine PARALLEL_BOUNDED_LSQ solves the least-squares linear system Ax ≅ b, α <= x <= β, using the algorithm BVLS found in Lawson and Hanson (1995), pages 279-283. The new steps involve updating the dual vector and exchange of required data, using MPI. The optional changes to default tolerances, minimum resi
51. This is Example 2 for FAST_2DFT integer i integer parameter n 8 k 15 integer ip n n order k real kind le0 parameter one le0 two 2e0 zero 0e0 real kind le0 delta_t real kind le0 rn 3 s n t n temp n n new_order k complex kind le0 a b c a_trend n n 3 b_trend n n 1 amp f n n r n n x n n x_trend 3 1 complex kind le0 dimension n n Generate random data for planar trend rn rand rn a rn 1 b rn 2 c rn 3 Generate the frequency components of the g 1 1 rand g 1 1 g 1 1 rand g 1 1 Invert g into the harmonic series h call c_fast_2dft inverse_in g g zero h zero harmonic series Non zero random amplitudes given on two edges of the square domain in time domain inverse_out h Compute sampling interval delta_t two n s on i 1 delta_t i 1 n t on i 1 delta_t i 1 n Make up data set as a linear trend plus harmonics amp h x a b spread s dim 2 ncopies n 4 c spread t dim 1 ncopies n 4 Define least squares matrix data for a planar trend a_trend 1 1 one a_trend 1 2 reshape spread s dim a_trend 1 3 reshape spread t dim b_trend 1 1 reshape x n n Solve for a linear trend call lin_sol_lsq a_ Compute harmonic residuals Ea x reshape ma Transform harmonic residuals call c_fast_2dft forward_in r ip i i 1 n
52. U NPDE 1 1 ZERO U NPDE 1 N XMAX OPEN FILE PDE_ex02 out UNIT 7 NFRAMES NINT TEND DELTA_T DELTA_T WRITE 7 3I5 4D14 5 NPDE N NFRAMES amp U NPDE 1 1 U NPDE 1 N TO TEND DX1 XMAX N2 DX2 DX1 N1 IOPT 1 D_OPTIONS PDE_1D_MG_RELATIVE_TOLERANCE ZERO IOPT 2 D_OPTIONS PDE_1D_MG_ABSOLUTE_TOLERANCE TOL IOPT 3 D_OPTIONS PDE_1D_MG_TIME_SMOOTHING 1D 3 Update to the next output point Write solution and check for final point CASE 2 TO TOUT IF TO lt TEND HEN WRITE 7 F10 5 TOUT DO I 1 NPDE 1 WRITE 7 4E15 5 U I END DO TOUT MIN TOUT DELTA_T TEND IF TO TEND IDO 3 END IF All completed CASE Solver is shut down 3 CLOSE UNIT 7 EXIT IMSL Fortran 90 MP Library 4 0 Chapter 8 Partial Differential Equations 281 Define initial data values CASE 5 U NPDE ZERO U 1 ONE DO I 1 N1 U NPDE 1 1 I 1 DX2 END DO DO I N1 1 N U NPDE 1 1 I N1 DX1 END DO WRITE 7 F10 5 TO DO I 1 NPDE 1 WRITE 7 4E15 5 U TI END DO Define differential equations CASE 6 D_PDE_1D_MG_C ZERO D_PDE_1D_MG_C 1 1 ONE D_PDE_1D_MG_C 2 1 D_PDE_1D_MG_U 1 D_PDE_1D_MG_R 1 D_PDE_1D_MG_U 2 D
53. Wm the cross validation squared error C A is given by 2 T mC A 4 Yw a X A b k 1 With the SVD A USV and product g U Tb this quantity can be written as 2 n 2 by Las ay j l J k 1 z s ipar a ma 8 2 This expression is minimized See Golub and Van Loan 1989 Chapter 12 for more details In the Example 4 code mC A at p 10 grid points are evaluated using a log scale with respect to A 0 1s lt A lt 10s Array operations and intrinsics are used to evaluate the function and then to choose an approximate minimum Following the computation of the optimumA the regularized solutions are computed Also see operator_ex24 Chapter 6 use lin_svd_int use rand_gen_int implicit none This is Example 4 for LIN_SVD integer i integer parameter m 32 n 16 p 10 k 4 real kind 1d0 parameter one 1d0 real kind 1d0 log_lamda log_lamda_t delta_log_lamda real kind 1d0 a m n b m k w m k g m k t n s n amp s_sq n u m m v n n y m max n k amp 54 Chapter 2 Singular Value and Eigenvalue Decomposition IMSL Fortran 90 MP Library 4 0 c_lamda p k lamda k x n k res n k Generate random rectangular matrices for A and right hand sides b call rand_gen y a reshape y m n call rand_gen y b reshape y m k Generate random weights for each of the right hand sides call rand_gen y w reshape y m k
54. appearing in the hard copy documentation has been deleted from the on-line documentation, causing a skip in page numbering before the first page of the next chapter; for instance, Chapter 8 of this on-line manual ends on page 299 and Chapter 9 begins on page 301.

Numbering Pages. When you refer to a page number in the PDF online documentation, be aware that the page number in the PDF online documentation will not match the page number in the original document. A PDF publication always starts on page 1, and supports only one page-numbering sequence per file.

Copying text. Click the text selection button and drag to select and copy text.

Viewing Multiple Online Manuals. Select Open from the File menu, and open the .PDF file you need. Select Cascade from the Window menu to view multiple files.

Resizing the Bookmark Area in Windows. Drag the double-headed arrow that appears on the area's border as you pass over it.

Resizing the Bookmark Area in UNIX. Click and drag the button that appears on the area's border at the bottom of the vertical bar.

Jumping to Topics. Throughout the text of this manual, references to subroutines, examples, tables, or other sections appear in green color, underline style, to indicate that you can jump to them. To return to the page from which you jumped, use the return back icon on the toolbar. Note: If you zoomed in or out after jumping to a topic, you will return to the previous zoom view(s) before returning to the page from wh
55. eigenvalue as an explicit shift Called by lin_eig_self A rational QR algorithm for computing eigenvalues of real symmetric tri diagonal matrices Called by lin_svd and lin_eig_self A real tri diagonal multiple system solver Uses both cyclic reduction and Gauss elimination Similar in function to lin_sol_tri Constrained weighted least squares fitting of B splines to discrete data with covariance matrix and constraints at points See Chapter 5 See Chapter 7 See Chapter 7 See Chapter 5 See Chapter 5 See Chapter 4 See Chapter 4 L6a L6c N1 N1 N1 N6oalb Ela E2a D4c Dla3b R3 D4c D4c D2a2a Klalal IMSL Fortran 90 MP Library 4 0 spline_support surface_fairing lin_sol_lsq_con lin_sol_lsq_ing least_proj_distance band_accumulation band_solve house_holder Parallel_nonnegative_lsq Parallel_bounded_lsq ScaLAPACK_ READ ScaLAPACK WRITE pde_ld_mg B spline function and derivative evaluation package Constrained weighted least squares fitting of tensor product B splines to discrete data with covariance matrix and constraints at points Routines for constrained linear least squares based on a least distance dual algorithm Routines to accumulate and solve banded least squares problem using Householder transformations Routines for solving a large least squares system with non negative constraints using parallel computing Routines for s
56. end if end Operator_ex29 use linear_operators implicit none This is Example 1 using operators for LIN_EIG_GEN integer parameter n 32 real kind 1d0 parameter one 1d0 real kind 1d0 err real kind 1d0 dimension n n A complex kind 1d0 dimension n E E_T V n n Generate a random matrix A rand A Compute only the eigenvalues E EIG A Compute the decomposition A V V values obtaining eigenvectors T EIG A W V Use values from the first decomposition vectors from the second decomposition and check for small residuals err norm A x V V x diag E amp norm A norm E if err lt sqrt epsilon one then write Example 1 for LIN_EIG_GEN operators is correct end if end IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option e 199 Operator_ex30 use linear_operators implicit none This is Example 2 using operators for LIN_EIG_GEN integer i integer parameter n 12 real kind 1d0 parameter one 1d0 zero 0d0 complex kind 1d0 dimension n a n n b e f fg b rand b Define the companion matrix with polynomial coefficients in the first row A zero A EOSHIFT EYE n SHIFT 1 DIM 2 a 1 1 b Compute complex eigenvalues of the companion matrix E
57. handled according to the PRINT and STOP attributes set by the user. IMSL routines must handle all informational errors (of types 1 to 4) that are returned to them by other IMSL routines that they reference.

Default PRINT and STOP settings

Type   User            IMSL Routine
       PRINT   STOP    PRINT   STOP
1      NO      NO      NO      NO
2      NO      NO      NO      NO
3      YES     NO      NO      NO
4      YES     YES     YES     YES
5      YES     YES
6      YES     YES
7      YES     YES

Error Types and Attributes
Seven error types are defined. Each error type has associated PRINT and STOP attributes. These flags have default settings (YES or NO) and may be set by the user. The purpose of having multiple error types is to provide independent control (default and user-defined) for errors of different types. In the parallel version, a STOP attribute of YES means that, after all messages are sent to the root node for printing, the root node will broadcast STOP once the entire suite of messages has been printed, if any node has a STOP attribute of YES. Then MPI will be finalized, if it has ever been initialized, and the STOP executed. To avoid shutting down MPI, all processors must have their STOP attributes set to NO after printing error messages.

Error Control
Control is provided for error handling by a stack with four values for each level. The values are: routine name, error type, error code, and an attribute flag that selects either the user PRINT and STOP attribut
58. l ndegree pi bkpt nbkpt ndegreetl nbkpt pi bkpt nord nbkpt ndegree piti delta i 0 ngrid 1 Assign the degr of the polynomial and the knots pointer_bkpt gt bkpt knotsx d_spline_knots ndegree pointer_bkpt knotsy knotsx Fit a data surface for each coordinate Set default regularization parameters to zero and compute turned residuals of the individual points These are r in DATA 4 do j 1 3 data spline_data j OPTIONS OPTIONS OPTIONS OPTIONS d_options surface_fitting_tol_least 1d surface_fitting_residuals 1 d_options surface_fitting_thinness zero 2 d_options surface_fitting_flatness zero OPTIONS 3 d_options surface_fitting_smallness zero 4 5 5 coeff j surface_fitting data knotsx knotsy amp IOPT OPTIONS end do Evaluate the function at a grid of points insid latitude and longitude covering the sphere just truncation and rounding errors delta pi nvalues 1 the rectangle of once Add the sum of squares They should equal A 2 but will not due to x pi twoti delta i 1 nvalues y two x values zero do j 1 3 values valuest surface_values 0 0 x y knotsx knotsy coeff 3 2 end do values values A 2 Compute the R M S error sizev norm pack values values values if sizev lt TOLERANCE then write Example 2 for SURFACE_FITTING
59. transform mdata m Input Uses the sub array in first dimension of size m for the numbers Default value m size x 1 ndata n Input Uses the sub array in the second dimension of size n for the numbers Default value n size x 2 kdata k Input Uses the sub array in the third dimension of size k for the numbers Default value k size x 3 ido ido Input Output Integer flag that directs user action Normally this argument is used only when the working variables required for the transform and its inverse are 92 e Chapter 3 Fourier Transforms IMSL Fortran 90 MP Library 4 0 saved in the calling program unit Computing the working variables and saving them in internal arrays within fast_3dft is the default This initialization step is expensive There is a two step process to compute the working variables just once The general algorithm for this usage is to enter fast_3dft with ido Q A return occurs thereafter with ido lt 0 The optional rank 1 complex array w with size w gt ido must be re allocated Then re enter fast_3dft The next return from fast_3dft has the output value ido 1 The variables required for the transform and its inverse are saved in w Thereafter when the routine is entered with ido 1 and for the same values of m and n the contents of w will be used for the working variables The expensive initialization step is avoided The optional arguments ido and work
60. 0 Chapter 6 Operators and Generic Functions The Parallel Option 213 Parallel Example 8 This example similar to Parallel Example 3 shows the box data type used while obtaining an accurate solution of several linear least squares systems Computation of the residuals for the box data type is executed in parallel Only the root node performs the factorization and update step during iterative refinement use linear_operators use mpi_setup_int implicit none TPNCILUD EN impate shy This is Parallel Example 8 All nodes share in just part of the work integer parameter m 8 n 4 nr 4 real kind le0 one le0 zero 0e0 real kind 1d0 d_zero 0d0 integer ipivots n tm 1 ierror nrack real kind le0 A m n nr b m 1 nr F n m ntm nr amp San 1 iwe 5 an Alp ialie real kind le0 change_new nr change_old nr weSeull Neslinel ikl etm pine D mip ia ie 37 Gaarii il imi type s_options TLGyoe a 2 INSGEUp Eo MPAs mp_nprocs mp_setup Generate a random matrix and right hand side if mp_rank 0 then A rand A b rand b endif Save double precision copies of the matrix and right hand side D A c b Fill in augmented matrix for accurately solving the least squares problem using iterative refinement F zero do nrack 1 nr F 1l m 1 m nrack EYE m enddo iP Loin ier ile 8 AAS in erie Lei 8 oie JN l Sran
61. 1 Compute the generalized eigenvalues ALPHA EIG A B B D d_beta See if singular DAE system is detected if isNaN ALPHA then write Example 3 for LIN_GEIG_GEN operators is correct end if Clean up allocated option arrays for good housekeeping IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 203 deallocate d_eig_options end Operator_ex36 use linear_operators implicit none This is Example 4 for LIN_GEIG_GEN using operators integer parameter n 32 real kind 1d0 parameter one 1d0 zero 0d0 real kind 1d0 a n n b n n beta n err complex kind 1d0 alpha n v n n Generate random matrices for both A and B A rand A B rand B Set the option a larger tolerance than default for lin_sol_lsq allocate d_eig_options 6 d_eig_options 1l options_for_lin_geig_gen d_eig_options 2 4 d_eig_options 3 d_lin_geig_gen_for_lin_sol_lsq d_eig_options 4 2 d_eig_options 5 d_options d_lin_sol_lsq_set_small amp sqrt epsilon one norm B 1 d_eig_options 6 d_lin_sol_lsq_no_sing_mess Compute the generalized eigenvalues alpha EIG A B B D beta W V Check the residuals err norm A x V x diag beta B x V x diag alpha 1 amp norm A 1 norm beta 1 norm B 1 norm alpha 1 if err lt sqrt epsilon one then w
62. 161. The code now updates the dual vector w of Step 2, page 161. The remaining new steps involve exchange of required data, using MPI.

Example 1: Distributed Linear Inequality Constraint Solver
The program PNLSQ_EX1 illustrates the computation of the minimum Euclidean length solution of an m x n system of linear inequality constraints, $Gy \ge h$. The solution algorithm is based on Algorithm LDP, pages 165-166, loc. cit. The rows of $E = [G : h]$ are partitioned and assigned random values. When the minimum Euclidean length solution to the inequalities has been calculated, the residuals $r = Gy - h \ge 0$ are computed, with the dual variables to the NNLS problem indicating the entries of $r$ that are precisely zero. Because matrix products involving both $E$ and $E^T$ are needed to compute the constrained solution $y$ and the residuals $r$, message passing is required. This occurs after the NNLS solution is computed.

      PROGRAM PNLSQ_EX1
! Use Parallel_nonnegative_LSQ to solve an inequality
! constraint problem, Gy >= h. This algorithm uses
! Algorithm LDP of Solving Least Squares Problems,
! page 165. The constraints are allocated to the
! processors, by rows, in columns of the array A(:,:).
      USE PNLSQ_INT
      USE MPI_SETUP_INT
      USE RAND_INT
      USE SHOW_INT
      IMPLICIT NON
63. 35 n ival 7 41 11 n Set integer options call iumag math ichap iput 6 iopt ival Reset tolerances for integrator atol le 3 rtol le 3 sval 1 atol sval 2 rtol iopt 1 inr 5 Set floating point options call sumag math ichap iput 1 topt sval Integrate ODE DAE Use dummy external names for g y y and partials ido 1 Integration_Loop do call d2spg n t tend ido y ypr dgspg djspg iwk wk Find where g y y goes It only goes in one place here but can vary where divided differences are used for partial derivatives iopt 1 in 27 call iumag math ichap iget 1 iopt ival Direct user respons select case ido case 1 4 This should not occur write Unexpected return with ido ido stop case 3 Reset options to defaults This is good housekeeping but not required for this problem in in call iumag math ichap iput 50 in ival inr inr call sumag math ichap iput 20 inr sval exit Integration_Loop case 5 Evaluate partials of g y t_y y t_ypr y ypr t_g r_diag t_y r_off EOSHIFT t_y SHIFT 1 amp EOSHIFT r_off t_y SHIFT 1 amp a_diag t_ypr a_off EOSHIFT t_ypr SHIFT 1 amp EOSHIFT a_off t_ypr SHIFT 1 Move data from the assumed size to assumed shape arrays do i l n wk ival 1 i 1 t_g i end do cycle Integration_Loop
64. 90 MP Library 4 0 x _ pde_ d_mg_x t pde_ ld_mg_t yi pde_ld_mg_u j du ox _ pde_ld_mg_beta j B x t u u ul _ pde_ d_mg_dudx j _ pde_ d_mg_ gamma j y x t u j 1 NPDE uy The value xr xr and the logical flag pde_1d_mg_L KE T TRUI E for X XL It has the value pde_ld_mg_LEFT FALSE for If any of the functions cannot be evaluated set pde_1d_mg_ires 3 Otherwise do not change its value IDO 8 This value is assigned by the integrator requesting the calling program to prepare for solving a banded linear system of algebraic equations This value will occur only when the option for reverse communication solving is set in the array IOPT with option PDE_1D_MG_REV_COMM_FACTOR_SOLVE The matrix data for this system is in Band Storage Mode described in the section Reference Material for the IMSL Fortran Numerical Libraries PDE_1D_MG_IBAND Half band width of linear system PDE_1D_MG_LDA The value 3 PDE_1D_MG_IBAND 1 with NEQ NPDE 1 N _PDE_1D_MG_A Array of size PDE_1D_MG_LDA by NEQ holding the problem matrix in Band Storage Mode PDE_1D_MG_PANIC_FLAG Integer set to a non zero value only if the linear system is detected as singular IDO 9 This value is assigned by the integrator requesting the calling program to solve a linear system with the matrix defined as noted with IDO 8 _PDE_1D_
65. Cyclic Reduction and perf_ratio is larger than one re solve using Gaussian Elimination If the method is already Gaussian Elimination the loop exits and perf_ratio is checked at the end perf_ratio norm res l in 1 k 1 amp norm EVAL_T 1 k 1 amp epsilon s_one 5 n if perf_ratio lt s_one exit factorization_choice iopt nopt 1 s_lin_sol_tri_use_Gauss_elim end do factorization_choice if perf_ratio lt s_one then write Example 3 for LIN_SOL_TRI operators is correct end if end IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 189 Operator_ex20 use lin_sol_tri_int use Numerical_Libraries implicit none This is Example 4 using operators for LIN_SOL_TRI integer parameter n 1000 ichap 5 iget 1 iput 2 amp inum 6 irnum 7 real kind le0 parameter zero 0e0 one 1le0 integer i ido in 50 inr 20 iopt 6 ival 7 amp iwk 35 n real kind le0 hx pi_value t u_0 u_l atol rtol tend wk 41 11 n y n ypr n a_diag n amp a_off n r_diag n r_off n t_y n t_ypr n amp t_g n t_diag 2 n 1 t_upper 2 n 1 amp t_lower 2 n 1 t_sol 2 n 1 type s_options iopti l s_options 0 zero Define initial data t 0e0 u_O one u_l 0 5 tend one Initial values for the variational equation y one ypr zero pi_value const pi hx pi_value n 1
66. Cyclic Reduction did not get an accurate solution It is an exceptional event when Gaussian Elimination is required if norm x_sol x_save 1 norm x_save 1 amp lt sqrt epsilon d_one exit factorization_choice iopt nopt 1 s_lin_sol_tri_use_Gauss_elim end do factorization_choice Check on accuracy of solution err norm x l in l in x_save 1 norm x_save 1 if err lt sqrt epsilon d_one then write Example 2 for LIN_SOL_TRI operators is correct end if end Operator_ex19 use linear_operators use lin_sol_tri_int use rand_int use Numerical_Libraries implicit none This is Example 3 using operators for LIN_SOL_TRI integer i nopt integer parameter n 128 k n 4 ncoda 1 lda 2 real kind le0 parameter s_one le0 s_zero 0e0 real kind le0 A lda n EVAL k type s_options iopt 2 real kind le0 d n b n d_t 2 n k c_t 2 n k perf_ratio amp b_t 2 n k y_t 2 n k eval_t k res n k logical small This flag is used to get the k largest eigenvalues small false Generate the main diagonal and the co diagonal of the tridiagonal matrix b rand b d rand d A 1 1 b A 2 1 d Use Numerical Libraries routine for the calculation of k largest eigenvalues CALL EVASB N K A LDA NCODA SMALL EVAL EVAL T EVAL 188 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP
67. Example 2: Matrix Inversion and Determinant
This example computes the inverse and determinant of A, a random matrix. Tests are made on the conditions $AA^{-1} = I$ and $\det(A^{-1}) = \det(A)^{-1}$. Also, see operator_ex02.

      use lin_sol_gen_int
      use rand_gen_int
      implicit none
! This is Example 2 for LIN_SOL_GEN.
      integer i
      integer, parameter :: n=32
      real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
      real(kind(1e0)) err
      real(kind(1e0)) A(n,n), b(n,0), inv(n,n), x(n,0), res(n,n), &
         y(n**2), determinant(2), inv_determinant(2)
! Generate a random matrix.
      call rand_gen(y)
      A = reshape(y,(/n,n/))
! Compute the matrix inverse and its determinant.
      call lin_sol_gen(A, b, x, nrhs=0, &
         ainv=inv, det=determinant)
! Compute the determinant for the inverse matrix.
      call lin_sol_gen(inv, b, x, nrhs=0, &
         det=inv_determinant)
! Check residuals, A times inverse = Identity.
      res = matmul(A,inv)
      do i=1, n
         res(i,i) = res(i,i) - one
      end do
      err = sum(abs(res)) / sum(abs(a))
      if (err <= sqrt(epsilon(one))) then
         if (determinant(1) == inv_determinant(1) .and. &
            (abs(determinant(2)+inv_determinant(2)) &
            <= abs(determinant(2))*sqrt(epsilon(one)))) then
            write (*,*) 'Example 2 for LIN_SOL_GEN is correct.'
         end if
      end if
      end

Example 3: Solving a System with Iterative Refinement
This example com
68. Fortran 90 MP Library users are for the most part shielded from the complexity of MPI, it is desirable for some users to learn this important topic. Users should become familiar with any referenced MPI routines and the documentation of their usage. MPI routines are not discussed here; they are best covered in the above references.

The Fortran 90 MP Library algorithm for allocating the racks of the box to the processors consists of creating a schedule for the processors, followed by communication and execution of this schedule. The efficiency may be improved by using the nodes according to a specific priority order. This order can reflect information such as a powerful machine on the network other than the user's workstation, or even complex or transient network behavior. The Fortran 90 MP Library allows users to define this order, including using a default. A setup function establishes an order based on timing matrix products of a size given by the user. Parallel Example 4 illustrates this usage.

Getting Started with Modules MPI_setup_int and MPI_node_int
The MPI_setup_int and MPI_node_int modules are part of the Fortran 90 MP Library and not part of MPI itself. Following a call to the function MP_SETUP(), the module MPI_node_int will contain information about the number of processors, the rank of a processor, the communicator for the Fortran 90 MP Library, and the usage priority order of the node machines. Since MPI_node_int is used by MPI_se
69. J 1 N B I END DO CLOSE NIN ELSE No resources are used where this array is not saved ALLOCATE A M 0 END IF Define the matrix descriptor This includes the right hand side as an additional column The row block size on each processor is arbitrary but is chosen here to match the column block size DESCEN 0 1 COND aM NPP DNatle eDiets OP Oran Mis Read the data by rows IOPT 1 ScaLAPACK_READ_BY_ROWS CALL ScaLAPACK_READ Atest dat DESC_A amp d_A IOPT IOPT Broadcast the right hand side to all processors JSHIFT NP IPART 1 1L 1 IF K gt 0 B d_A JSHIFT IF MP_NPROCS gt 1 amp CALL MPI_BCAST B M MPI_DOUBLE_PRECISION L 1 amp MP_LIBRARY_WORLD IERROR Adjust the partition of columns to ignore the last column which is the right hand side It is now moved to B IMSL Fortran 90 MP Library 4 0 Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers 251 IPART 2 min N IPART 2 Solve the constrained distributed problem C B CALL Parallel_Nonnegative_LSQ amp d_A B X RNORM W INDEX IPART Solve the problem on one processor with data saved orea er OSS Cheeky IPART 2 0 IPART 2 1 N MP_NPROCS 1 Since all processors execute this code all arrays must be allocated in the main program CALL Parallel_Nonnegative_LSQ amp Av Creo ee RNORM AEN ENDEX meals ZANT See to any errors
70. LIN_SOL_SELF is correct.'
      end if
      end

Optional Arguments

NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix.
Default: n = size(A, 1)

NRHS = nb (Input)
Uses the array b(1:n, 1:nb) for the input right-hand side matrix.
Default: nb = size(b, 2)
Note that b must be a rank-2 array.

pivots = pivots(:) (Output [Input])
Integer array of size n that contains the individual row interchanges in the first n locations. Applied in order, these yield the permutation matrix P. Location n + 1 contains the number of the first diagonal term no larger than Small, which is defined on the next page of this chapter.

det = det(1:2) (Output)
Array of size 2 of the same type and kind as A for representing the determinant of the input matrix. The determinant is represented by two numbers. The first is the base, with the sign or complex angle of the result. The second is the exponent. When det(2) is within exponent range, the value of the determinant is given by the expression abs(det(1))**det(2) * (det(1))/abs(det(1)). If the matrix is not singular, abs(det(1)) = radix(det); otherwise, det(1) = 0 and det(2) = -huge(abs(det(1))).

ainv = ainv(:,:) (Output)
Array of the same type and kind as A(1:n, 1:n). It contains the inverse matrix, $A^{-1}$, when the input matrix is nonsingular.

iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix; used for pas
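The two-number determinant form (a base carrying the sign, and an exponent) can be collapsed to a single value when the exponent is in range. The sketch below handles the real, double precision case only; the module name det_pair_sketch and the function det_value are hypothetical helpers, not library routines, and no overflow check is attempted.

```fortran
module det_pair_sketch
  implicit none
contains
  ! Convert the (base, exponent) pair into one number.
  ! Assumption: det(2) is small enough that the power does not overflow.
  function det_value(det) result(value)
    real(kind(1d0)), intent(in) :: det(2)
    real(kind(1d0)) :: value
    if (det(1) == 0d0) then
      ! A zero base marks a singular matrix.
      value = 0d0
    else
      ! Magnitude from abs(det(1))**det(2), sign from det(1)/abs(det(1)).
      value = abs(det(1))**det(2) * (det(1)/abs(det(1)))
    end if
  end function det_value
end module det_pair_sketch
```

For instance, a pair holding base -2 and exponent 3 corresponds to the value -8. Keeping the pair, rather than the collapsed value, is what lets a caller compare determinants whose magnitudes would overflow or underflow.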
71. Library 4 0 Use Fortran 90 MP Librarytridiagonal solver for inverse iteration calculation of eigenvectors factorization_choice do nopt 0 1 Create k tridiagonal problems one for each inverse iteration system bee then Lek spread b DIM 2 NCOPIES k c_t 1 n 1 k EOSHIFT b_t 1 n 1 k SHIFT 1 DIM 1 d_t l in 1 k spread d DIM 2 NCOPIES k amp spread EVAL_T DIM 1 NCOPIES n Start the right hand side at random values scaled downward to account for the expected blowup in the solution y_t rand y_t Do two iterations for the eigenvectors do i l 2 y_t lin 1 k y_t 1l n 1 k epsilon s_one cati tin sol stra c t dit bit Yet E iopt iopt iopt nopt 1 s_lin_sol_tri_solve_only end do Orthogonalize the eigenvectors This is the most intensive part of the computing y_t lin 1 k ORTH y_t 1l n 1 k See if the performance ratio is smaller than the value one If it is not the code will re solve the systems using Gaussian Elimination This is an exceptional event It is a necessary complication for achieving reliable results res 1l n 1 k spread d DIM 2 NCOPIES k y_t l in 1 k amp spread b DIM 2 NCOPIES k amp EOSHIFT y_t 1l n 1 k SHIFT 1 DIM 1 amp EOSHIFT spread b DIM 2 NCOPIES k y_t 1 n 1 k SHIFT 1 amp y_t 1 n 1 k spread EVAL_T 1 k DIM 1 NCOPIES n If the factorization method is
72. NUMROC N NB MYCOL 0 NPCOL 242 e Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers IMSL Fortran 90 MP Library 4 0 LDA_B NUMROC N MB MYROW 0 NPROW TDA _B 1 ALLOCATE d_A LDA_A TDA Aje GL BODA B TDA B pe IPIV LDA_A MB A root process is used to create the matrix data for the test IF MP_RANK 0 THEN ALLOCATE A N N B N CALL RANDOM NUMBER A X N CALL RANDOM_NUMBER X Compute the correct result PRS A OPEN UNIT NIN FIL Write the data by columns DO J 1 N NB WRITE NIN A I END DO CLOSE NIN OPEN UNIT NIN FILE B MATMUL A X SIZE_X SUM ABS X Atest dat STATUS UNKNOWN L I 1 N L J min N J NB 1 Btest dat STATUS UNKNOWN Write the data by columns WRITE NIN B I I CLOSE NIN END IF ESC_A 1 CONTXT N ESC_B 1 CONTXT N HS ClX DESC EB le Meier 1 N Define the descriptor for the global matrices N MB NB 0 0 LDA_A iL Wig INS Op CDA 18 Read the factors into the local arrays CALL ScaLAPACK_READ At CALL ScaLAPACK_READ Bt Compute the distribut TA 1 JA 1 IB 1 JB 1 CALL pdGESV amp ig ip Clay WA wy D d_B IB JB DESC_B Put the result on the Call ScaLAPACK_WRITE X IF MP_RANK 0 THEN
73. O object oriented 141 one dimensional smoothing check list 96 optional argument vi optional data iv vi optional subprogram arguments vi ordinary eigenvectors example 47 69 orthogonal decomposition 50 factorization 22 matrix iii orthogonalized 40 59 P parametric linear systems with scalar change 68 parametric systems 68 partial pivoting 34 38 PBLAS 231 permutation 136 Petzold 43 piece wise polynomial 96 97 piecewise linear Galerkin 43 pivoting partial 2 5 11 row and column 18 22 symmetric 9 polar decomposition 29 38 polynomial degree 96 printing an array example 123 137 IMSL Fortran 90 MP Library printing arrays 137 private message files 126 PV_WAVE 275 Q QR algorithm 50 58 double shifted 66 QR decomposition 156 R radial basis functions 24 random complex numbers transforming an array 79 86 91 random numbers 126 real numbers sorting 134 record keys sorting 136 reduction array of black and white 30 regularizing term 37 Reid ii required arguments v vi reverse communication 43 ridge regression 47 54 cross validation example 47 54 Rodrigue 37 row and column pivoting 18 22 row vector heavily weighted 25 S ScaLAPACK contents 231 232 data types 231 232 definition of library 231 interface modules 233 reading utility block cyclic distributions 233 Schur form 62 68 self adjoint eigenvalue problem 61 linear system 16 matrix 9 12 58 61 eigenvalues 14 56 62 tridia
74. Option Prefix Option Value lin_sol_tri_set_jolt lin_sol_tri_scan_for_NaN Zz lin_sol_tri_factor_only a AIN Zz lin_sol_tri_solve_only s d c z_ lin_sol_tri_use_Gauss_elim 6 iopt IO _options _lin_sol_tri_set_small Small Whenever a reciprocation is performed on a quantity smaller than Small it is replaced by that value plus 2 x jolt Default 0 25 x epsilon iopt IO _options _lin_sol_tri_set_jolt jolt Default epsilon machine precision iopt IO _options _lin_sol_tri_scan_for_NaN _dummy Examines each input array entry to find the first value such that isNaN C i j or isNaN D i j or isNaN B i j or isNaN Y i j true See the isNan function Chapter 6 Default Does not scan for NaNs iopt IO _options _lin_sol_tri_factor_only _dummy Obtain the LU factorization of the matrices Aj Does not solve for a solution Default Factor the matrices and solve the systems iopt IO _options _lin_sol_tri_solve_only _dummy Solve the systems A x y using the previously computed LU 36 Chapter 1 Linear Solvers IMSL Fortran 90 MP Library 4 0 factorization Default Factor the matrices and solve the systems iopt IO _options _lin_sol_tri_use_Gauss_elim _dummy The accuracy numerical stability or efficiency of the cyclic reduction algorithm may be inferior to the use of LU factorization with partial
75. Then DOUBLE PRECISION matrix products C = AB, where A and B are N by N matrices, are computed at each node and the elapsed time is recorded. These elapsed times are sorted, and the contents of MPI_NODE_PRIORITY(:) are permuted so that the shortest times yield the highest priority. All the nodes in the communicator MP_LIBRARY_WORLD are timed. The array MPI_NODE_PRIORITY(:) is then broadcast from the root to the remaining nodes of MP_LIBRARY_WORLD using the routine MPI_Bcast. Timing matrix products to define the node priority is relevant because the effort to compute C is comparable to that of many linear algebra computations of similar size. Users are free to define their own node priority and broadcast the array MPI_NODE_PRIORITY(:) to the alternate nodes in the communicator.

To print any IMSL Fortran 90 MP Library error messages that have occurred at any node, and to finalize MPI, use the function call MP_SETUP('Final'). The case of the string 'Final' is not important. Any error messages pending will be discarded after printing on the root node. This is triggered by popping the name 'MP_SETUP' from the subprogram stack or returning to Level 1 in the stack. Users can obtain error messages by popping the stack to Level 1 and still continuing with MPI calls. This requires executing call e1pop('MP_SETUP'). To continue on after summarizing errors, execute call e1psh('MP_SETUP'). More details about th
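The ordering step described here (shortest elapsed time gets highest priority) can be illustrated without MPI. The sketch below assumes the priority array holds node ranks numbered from 0 and that elapsed(i) was measured on rank i-1; the names node_priority_sketch and priority_from_times are hypothetical, and the timing and broadcast steps are left out.

```fortran
module node_priority_sketch
  implicit none
contains
  ! Return node ranks ordered so the shortest elapsed time comes first.
  ! Assumption: elapsed(i) belongs to rank i-1 (ranks start at 0).
  function priority_from_times(elapsed) result(priority)
    real(kind(1d0)), intent(in) :: elapsed(:)
    integer :: priority(size(elapsed))
    logical :: taken(size(elapsed))
    integer :: i, best(1)
    taken = .false.
    do i = 1, size(elapsed)
      ! Fastest node not yet placed; ties go to the lower rank.
      best = minloc(elapsed, mask=.not. taken)
      taken(best(1)) = .true.
      priority(i) = best(1) - 1
    end do
  end function priority_from_times
end module node_priority_sketch
```

A selection pass like this is quadratic in the number of nodes, which is harmless for the node counts involved; in a real program the root would compute the order once and then broadcast it.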
76. This grid is returned to the user equally spaced but can be updated as desired provided the values are increasing Required Provide initial values for all components of the system at the grid of values U NPDE 1 j j 1 N tthe optional step of updating the initial grid is performed then the initial values are evaluated at the updated grid IDO 6 This value is assigned by the integrator requesting data for the differential equations Following this evaluation the integrator is re entered Evaluate the terms of the system of Equation 2 A default value of m 0 is assumed but this can be changed to one of the other choices m 1 or 2 Use the optional argument IOPT for that purpose Put the values in the arrays as indicated2 x _ pde_ ld_mg_x t _pde_ ld_mg_t u pde_ld_mg_u j au ox _ pde_ld_mg_ce j k C x t u U P uy _ pde_ d_ mg _dudx j _ pde_ d_mg_ r j rj x t u U _ pde_ld_mg_q j i q x t u uy j k 1 NPDE If any of the functions cannot be evaluated set pde_1d_mg_ires 3 Otherwise do not change its value IDO 7 This value is assigned by the integrator requesting data for the boundary conditions as expressed in Equation 3 Following the evaluation the integrator is re entered 2 The assign to equality A b used here and below is read the expression b is evaluated and then assigned to the location a 270 Chapter 8 Partial Differential Equations IMSL Fortran
77. This is a Parallel Example 18 for SURFACE_FITTING or tensor product B splines approximation Fit x y z parametric functions for points on the surface of a sphere of radius A Random values of latitude and longitude are used to generat data The functions are evaluated at a rectangular grid in latitude and longitude and checked so they lie on the surface of the spher integer i j ierror status MPI_STATUS_SIZE integer parameter ngrid 5 nord 8 ndegree nord 1l amp nbkpt ngrid 2 ndegree ndata 400 nvalues 50 NOPT 4 real kind 1d0 parameter zero 0d0 one 1d0 two 2d0 real kind 1d0 parameter TOLERANCE 1d 3 real kind 1d0 target spline_data 4 ndata 3 bkpt nbkpt amp coeff ngridtndegree 1 ngridtndegree 1 3 delta sizev amp pi A x nvalues y nvalues values nvalues nvalues amp data 4 ndata real kind 1d0 pointer pointer_bkpt cyose GaSe talc CC Onsiteacuinizs pmcleh Galtaallole lt 3 Che type d_spline_knots knotsx knotsy type d_options OPTIONS NOPT Sebup stor AMENE MP_NPROCS MP_SETUP BLOCK DO This program needs at least three nodes plus a root to execute INS nenny lS e rror messages may print if mp_nprocs lt 4 then Callies tls Gi MPmaNPROES call elmes 5 1 Parallel Example 18 requires FOUR nodes amp to execute Number of nodes is now I11 EXIT BLOCK endif l Ger cha Con
78. This value is obtained using the date routine CALL DATE_AND_TIME(VALUES=values) and converting values(5:8) to milliseconds. The LCM generator initializes the sequence $\{x_t\}$ using the following recurrence:

$$ m \leftarrow m \times k \bmod (\mathrm{huge}(1)/2) $$

The default value of k is 16807. Using the optional argument iopt and the packaged option number ?_rand_gen_LCM_modulus, k can be given an alternate value. The option number ?_rand_gen_generator_seed can be used to set the initial value of m instead of using the asynchronous value given by the system clock. This is illustrated in Example 2. If the default choice of m results in an unsatisfactory starting sequence, or it is necessary to duplicate the sequence, then it is recommended that users set the initial seed value to one of their own choosing. Resetting the seed complicates the usage of the routine.

This software is based on Fushimi (1990), who gives a more elaborate starting sequence for the $\{x_t\}$. The starting sequence suggested by Fushimi can be used with the option number ?_rand_gen_use_Fushimi_start. Fushimi's starting process is more expensive than the default method, and it is equivalent to starting in another place of the sequence with period $2^p - 1$.

Additional Examples

Example 2: Seeding, Using, and Restoring the Generator

      use rand_gen_int
      implicit none
! This is Example 2 for RAND_GEN.
      integer i
      integer, parameter :: n=34, p=521
      real(kind(1e0)), parameter
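The seeding recurrence, m <- m*k mod (huge(1)/2), is easy to get wrong in default integer arithmetic because the product m*k can overflow. The sketch below does the multiplication in a wider integer kind. It is only an illustration of the recurrence and of one way to turn values(5:8) into milliseconds; the names lcm_seed_sketch, lcm_step and clock_seed are hypothetical, and the exact clock conversion used by the library is an assumption.

```fortran
module lcm_seed_sketch
  implicit none
  ! Integer kind wide enough to hold the product of two default integers.
  integer, parameter :: i8 = selected_int_kind(18)
contains
  ! One step of m <- (m*k) mod (huge(1)/2).
  function lcm_step(m, k) result(m_next)
    integer, intent(in) :: m, k
    integer :: m_next
    m_next = int(mod(int(m, i8)*int(k, i8), int(huge(1)/2, i8)))
  end function lcm_step

  ! Milliseconds since midnight from DATE_AND_TIME:
  ! values(5:8) are hour, minute, second and millisecond.
  function clock_seed() result(m)
    integer :: m
    integer :: values(8)
    call date_and_time(values=values)
    m = ((values(5)*60 + values(6))*60 + values(7))*1000 + values(8)
  end function clock_seed
end module lcm_seed_sketch
```

Starting from clock_seed() gives a different sequence on each run, while passing a fixed m to lcm_step reproduces the same one, which mirrors the choice between the clock-based default and a user-set seed described above.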
79. Z H R A DELTA ONE A Z EXP DELTA ON D_MG_GAMMA D_PDE_1D_MG_DUDX ONE he problem data IOPT IOPT E Z ONE END FUNCTIO end program Example 7 Traveling Waves This example is presented more fully in Verwer et al 1989 The system is a normalized problem relating the interaction of two waves u x t and v x moving in opposite directions The waves meet and reduce in amplitude due to the non linear terms in the equation Then they separate and travel onward with reduced amplitude IMSL Fortran 90 MP Library 4 0 Chapter 8 Partial Differential Equations 291 u u 100uv v v 100uv 0 5 lt x lt 0 5 0 lt t lt 0 5 u x 0 0 5 1 cos 10nx x e 0 3 0 1 and 0 otherwise v x 0 0 5 1 cos 10nx x 0 1 0 3 and 0 otherwise u v Q at both ends t 2 0 Rationale This is a non linear system of first order equations program PDE_1D_MG_EX07 Traveling Waves USE pde_1d_mg_int USE error_option_packet IMPLICIT NONE CRI T INTEGER PARAMETER NPDE 2 N 50 INTEGER I IDO NFRAMES Define array space for the solution real kind 1d0 U NPDE 1 N TEMP N TO TOUT real kind 1d0 ZERO 0D0 HALF 5D 1 amp ONE 1D0 DELTA_T 5D 2 TEND 5D 1 PI
80. a rank-2 Euclidean length subroutine to compute the lengths of the nonzero columns, which are then normalized to have lengths of value one. The subroutine carefully avoids overflow or damaging underflow by rescaling the sums of squares as required. There are no reserved names.

Example: normalize a set of random vectors.
      A = UNIT(RAND(A))

Overloaded =, /=, etc., for Derived Types
To assist users in writing compact and readable code, the IMSL Fortran 90 MP Library provides overloaded assignment and logical operations for the derived types s_options, d_options, s_error, and d_error. Each of these derived types has an individual record consisting of an integer and a floating-point number. The components of the derived types, in all cases, are named idummy followed by rdummy. In many cases, the item referenced is the component idummy. This integer value can be used exactly as any integer by use of the component selector character (%). Thus, a program could assign a value and test after calling a routine:

      s_epack(1)%idummy = 0
      call lin_sol_gen(A, b, x, epack=s_epack)
      if (s_epack(1)%idummy > 0) call error_post(s_epack)

Using the overloaded assignment and logical operations, this code fragment can be written in the more readable form:

      s_epack(1) = 0
      call lin_sol_gen(A, b, x, epack=s_epack)
      if (s_epack(1) > 0) call error_post(s_e
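How an integer can be assigned to, and compared with, a derived-type record is standard Fortran 90 interface overloading. The sketch below shows the mechanism on a stand-in type; demo_options, assign_int and greater_than_int are hypothetical names chosen for illustration, and the library's own definitions may differ in detail.

```fortran
module options_overload_sketch
  implicit none
  ! Hypothetical stand-in for a record such as s_options:
  ! an integer component followed by a real component.
  type demo_options
    integer :: idummy
    real(kind(1e0)) :: rdummy
  end type demo_options

  ! "record = integer" stores the integer in idummy.
  interface assignment(=)
    module procedure assign_int
  end interface

  ! "record > integer" compares idummy with the integer.
  interface operator(>)
    module procedure greater_than_int
  end interface
contains
  subroutine assign_int(lhs, rhs)
    type(demo_options), intent(inout) :: lhs
    integer, intent(in) :: rhs
    lhs%idummy = rhs   ! rdummy is left untouched
  end subroutine assign_int

  function greater_than_int(lhs, rhs) result(flag)
    type(demo_options), intent(in) :: lhs
    integer, intent(in) :: rhs
    logical :: flag
    flag = lhs%idummy > rhs
  end function greater_than_int
end module options_overload_sketch
```

With this module in scope, a statement such as epack(1) = 0 followed by if (epack(1) > 0) compiles and behaves like the component-selector form, which is the readability gain the text describes.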
81. alc x 1 t 4 WA IMOLIN
        END IF
        MP_NPROCS = MP_SETUP('Final')
        end program
    Chapter 9: Error Handling and Messages - The Parallel Option
    Introduction (MPI required)
    This chapter describes the error-handling system used from within the IMSL Fortran 90 MP Library. Errors of differing types may need to be reported from several nodes. We have developed an error processor that uses MPI, when it is appropriate, for communication of error messages to the root node, which then does the printing to an open output unit. We encourage users to include this error processor in their own applications that use MPI for distributed computing.
    VNI started its development with the IMSL FORTRAN error processor (see Aird and Howell, 1992) in use with the Fortran Numerical Libraries. This was influenced by early work (see Fox, Hall, and Schryer, 1978) from Bell Laboratories' PORT Library. Linked data structures have replaced fixed-size tables within the routines. Now applications may avoid jumbling lines of error text output if different threads and nodes generate independent errors. Users are not required to be aware of any difference in the use of the two versions. Each version is packaged into a separate library file. A user can safely call or link with the newer version for all applications, even though their codes might not be using MPI code. A drawback is that the code is l
82. approximate integral for this t A second integral is needed at the edge V_1 HALF SUM U 1 1 N 1 U 1 2 N amp U 2 2 N U 2 1 N 1 MID HALF U 2 2 N U 2 1 N 1 V_2 HALF SUM MID EXP MID amp U 1 1 N 1 U 1 2 N U 2 2 N U 2 1 N 1 D_PDE_1D_MG_BETA ZERO D_PDE_1D_MG_GAMMA G ONE D_PDE_1D_MG_T V_1 V_2 V_1 ONE 2 amp r D_PDE_1D_MG_U ELSE D_PDE_1D_MG_BETA ZERO D_PDE_1D_MG_GAMMA D_PDE_1D_MG_DUDX 1 END IF END SELECT Reverse communication is used for the problem data CALL PDE_1D_MG TO TOUT IDO U IOPT IOPT END DO CONTAINS FUNCTION G z t IMPLICIT NONE REAL KIND 1d0 Z T G G FOUR Z TWO TWO EXP A EXP T 2 G G ONE EXP A ONE ONE TWO A amp EXP TWO A 1 EXP A EXP T END FUNCTION end program Example 4 A Model in Cylindrical Coordinates This example is from Blom and Zegeling 1994 The system models a reactor diffusion problem _ ABrT T T r yex ae ATT T 0 z 0 T 1 z 0 z gt 0 T r 0 0 0 lt r lt 1 B 10 y 1 0 1 The axial direction Z is treated as a time coordinate The radius F is treated as the single space variable IMSL Fortran 90 MP Library 4 0 Chapter 8 Partial Differential Equations 285 Rationale This is a non lin
83. covariance matrix associated with the coefficients of the spline 100 Chapter 4 Curve and Surface Fitting with Splines IMSL Fortran 90 MP Library 4 0 T 6 euwen The argument G is an optional output parameter from the function spline_fitting described below When the square root of the variance function is computed the arguments DERIVATIVE and C are not used iopt iopt Input This optional argument of derived type _options is not used in this release spline_fitting Weighted least squares fitting by B splines to discrete One Dimensional data is performed Constraints on the spline or its derivatives are optional The spline function its derivatives or the square root of its variance function are evaluated after the fitting Required Arguments data data 1 3 Input Output An assumed shape array with size data 1 3 The data are placed in the array data 1 i x data 2 i y anddata 3 i 0 i 1 ndata If the variances are not known but are proportional to an unknown value users may set data 3 i 1 i 1 ndata knots knots Input A derived type _spline_knots that defines the degree of the spline and the breakpoints for the data fitting interval Example 1 Natural Cubic Spline Interpolation to Data The function g x exp x 2 is interpolated by cubic splines on the grid of points x i 1 Ax i 1 ndata Those natural conditions are d g dx f x glxi
84. do i 0 ngrid 1l C ngridt i 1 surface_constraints point piti delta pi amp type periodic piti delta pi end do if mp_rank 0 then Send the data to a node Co JH 3 call mpi_send spline_data j 4 ndata amp MPI_DOUBLE_PRECISION j j MP_LIBRARY_WORLD ierror enddo do i 1 3 Receive the coefficients back call mpi_recv coeff i ngridtndegree 1 2 amp MPI_DOUBLE_PRECISION i i MP LIBRARY WORLD amp status ierror enddo else if mp_rank lt 4 then Receive the data from the root call mpi_recv data 4 ndata MPI_DOUBLE_PRECISION 0 amp mp_rank MP_LIBRARY_WORLD status ierror OPTIONS 1 d_options surface_fitting_thinness zero OPTIONS 2 d_options surface_fitting_flatness zero OPTIONS 3 d_options surface_fitting_smallness zero OPTIONS 4 surface_fitting_residuals Compute the coefficients at this node coeff mp_rank surface_fitting data knotsx knotsy amp CONSTRAINTS C IOPT OPTIONS Sends the Cock ilcrents back sto the root call mpi_send coeff mp_rank ngridtndegree 1 2 amp MPI_DOUBLE_PRECISION 0 mp_rank MP_LIBRARY_WORLD IERROR end if Evaluate the function at a grid of points inside the rectangle of latitude and longitude covering the sphere just once Add the sum of squares They should equal A 2 but will not due to truncation and rounding e
85. e D 9 index 2 2DFT Discrete Fourier Transform 86 3 3DFT Discrete Fourier Transform 91 A Aasen s method 11 12 accuracy estimates of eigenvalues example 47 69 Adams ii adjoint eigenvectors example 47 69 adjoint matrix iii ainv optional argument vi ANSI ii 164 165 argument v arguments optional subprogram vi array function one dimensional smoothing 97 two dimensional smoothing 98 bidiagonal matrix 50 BLACS 231 block cyclic decomposition reading writing utility 231 Blocking Output 165 boundary value problem 42 Brenan 43 B spline 95 Cc Campbell 43 IMSL Fortran 90 MP Library changing messages 125 Chebyshev polynomials 19 Cholesky algorithm 12 decomposition 9 61 74 factorization 154 method 13 combining Fortran 90 and FORTRAN 77 routines viii companion matrix 67 computing eigenvalues example 47 56 the rank of A 26 the SVD 47 48 computing eigenvalues example 47 63 condition number 70 convolutions real or complex periodic sequences 84 covariance matrix 13 18 21 cross validation with weighting example 47 54 cyclic reduction 1 34 35 37 cyclical 2D data linear trend 88 cyclical data linear trend 82 D DASPG routine 43 data fitting polynomial 18 two dimensional 24 data optional vi de Boor 95 decomposition singular value 26 derived type function one dimensional smoothing 96 two dimensional smoothing 98 derived types one dimensional smoothing 96 determ
86. end if end 118 Chapter 4 Curve and Surface Fitting with Splines nvalues is correct IMSL Fortran 90 MP Library 4 0 Example 3 Constraining Some Points using a Spline Surface This example illustrates the use of discrete constraints to shape the surface The data fitting problem of Example 1 is modified by requiring that the surface interpolate the value one at x y 0 The shape is constrained so first partial derivatives in both x and y are zero at x y 0 These constraints mimic some properties of the function g x y The size of the residuals at a grid of points and the residuals of the constraints are checked USE surface_fitting_int USE rand_int USE norm_int implicit none This is Example 3 for SURFACE_FITTING tensor product B splines approximation f x y Use the function exp x 2 y 2 on the square 0 2 x 0 2 for samples The spline order is nord and the number of cells is ngrid 1 2 There are ndata data values in the square Constraints are put on the surface at 0 0 Namely 0 0 1 _x 0 0 0 f_y 0 0 0 integer i integer parameter ngrid 9 nord 4 ndegree nord 1l amp nbkpt ngrid 2 ndegree ndata 2000 nvalues 100 NC 3 real kind 1d0 parameter zero 0d0 one 1d0 two 2d0 real kind 1d0 parameter TOLERANCE 1d 3 real kind 1d0 target spline_data 4 ndata bkpt nbkpt amp coeff ngridtndegree 1 ngridtnde
87. find the first value such that isNaN(a(i,j)) == .true. See the isNaN() function, Chapter 6. Default: The array is not scanned for NaNs.
    iopt(IO) = ?_options(?_lin_eig_use_QR, ?_dummy)
    Uses a rational QR algorithm to compute eigenvalues. Accumulate the eigenvectors using this algorithm. Default: the eigenvectors are computed using inverse iteration.
    iopt(IO) = ?_options(?_lin_eig_skip_Orth, ?_dummy)
    If the eigenvectors are computed using inverse iteration, skips the final orthogonalization of the vectors. This will result in a more efficient computation, but the eigenvectors, while a complete set, may be far from orthogonal. Default: the eigenvectors are normally orthogonalized if obtained using inverse iteration.
    iopt(IO) = ?_options(?_lin_eig_use_Gauss_elim, ?_dummy)
    If the eigenvectors are computed using inverse iteration, uses standard elimination with partial pivoting to solve the inverse iteration problems. Default: the eigenvectors are computed using cyclic reduction.
    iopt(IO) = ?_options(?_lin_eig_self_set_perf_ratio, perf_ratio)
    Uses residuals for approximate normalized eigenvectors if they have a performance index no larger than perf_ratio. Otherwise an alternate approach is taken and the eigenvectors are computed again: Standard elimination is used instead of cyclic reduction, or the standard QR algorithm is used as a backup proce
88. in a specified manner, as was required in FORTRAN 77. However, much software exists in FORTRAN 77 that relies on this previous memory model of computation.
    Example 4 in Chapter 1, "Linear Solvers," of lin_sol_gen illustrates how the various libraries work together. In this example, which evaluates the matrix exponential to solve a linear constant-matrix system of ordinary differential equations, routines from both libraries are used. The interfaces for EVCRG and other routines in the FORTRAN 77 IMSL MATH/LIBRARY and STAT/LIBRARY products are provided by use of the IMSL Fortran 90 MP Library module Numerical_Libraries. This module is invoked with the statement "Use Numerical_Libraries" near the first line of the program unit. Even for users who choose to continue with just the FORTRAN 77 IMSL routines, we strongly recommend the use of this module. It can show type mismatches, missing arguments, and other silly mistakes before they become dangerously hidden in an application. Interface blocks for the Fortran 90 codes are individually provided.
    The interface for this FORTRAN 77 routine shows that the arrays A, EVAL, and EVEC, containing input and output for EVCRG, are assumed-size. The alternate arrays in this example are assumed-shape.
    Chapter 1: Linear Solvers
    Introduction
    This chapter desc
89. information on changing the contents of the message file and information on how to create and access a message file for a private application.
    Changing Messages
    In order to change messages, two files are required:
    - An editable message glossary, messages.gls, supplied with this product.
    - A source program, prepmess.f, used to generate an executable which builds messages.daf from messages.gls.
    To change messages, first make a backup copy of messages.gls. Use a text editor to edit messages.gls. The format of this file is a series of pairs of statements:
        message_number = <nnnn>
        message = 'message string'
    Note that neither of these lines should begin with a tab. The variable <nnnn> is an integer message number (see below for ranges and reserved message numbers). The 'message string' is any valid message string not to exceed 255 characters. If a message line is too long for a screen, the standard Fortran 90 concatenation operator, //, with the line continuation character, &, may be used to wrap the text.
    Most strings have substitution parameters embedded within them. These may be in the following forms:
    - %(i<n>) for an integer substitution, where n is the nth integer output in this message.
    - %(r<n>) for single precision real number substitution, where n is the nth real number output in this message.
    - %(d<n>) for double precision real number substitution, where n is the nth double precision number output
90. interchange ip(i) and ip(k), i = 1, ..., n. The matrix defined by the array assignment that permutes the rows, A(1:n, 1:n) = A(ip(1:n), 1:n), requires no pivoting for maintaining numerical stability. Now, the optional argument "iopt=" and the packaged option number ?_lin_sol_gen_no_pivoting can be safely used for increased efficiency during the LU factorization of A.
    det = det(1:2) (Output)
    Array of size 2 of the same type and kind as A for representing the determinant of the input matrix. The determinant is represented by two numbers. The first is the base, with the sign or complex angle of the result. The second is the exponent. When det(2) is within exponent range, the value of this expression is given by abs(det(1))**det(2) * (det(1))/abs(det(1)). If the matrix is not singular, abs(det(1)) = radix(det); otherwise, det(1) = 0 and det(2) = -huge(abs(det(1))).
    ainv = ainv(:,:) (Output)
    Array of the same type and kind as A(1:n, 1:n). It contains the inverse matrix, A^{-1}, when the input matrix is nonsingular.
    iopt = iopt(:) (Input)
    Derived type array with the same precision as the input matrix; used for passing optional data to the routine. The options are as follows:
    Packaged Options for lin_sol_gen
        Option Prefix = ?    Option Name              Option Value
                             lin_sol_gen_set_small    1
                             lin_sol_gen_save_LU      2
                             lin_sol_gen_solve_A      3
        dC                   lin
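The two-number determinant representation in excerpt 90 can be turned back into an ordinary value with the expression quoted there. The following is a small standalone sketch for the real case; the pair stored in det is a made-up example rather than output from lin_sol_gen, and det_value is a hypothetical helper name.

```fortran
program det_from_pair
  implicit none
  integer, parameter :: dp = kind(1d0)
  real(dp) :: det(2)

  ! Hypothetical pair for a nonsingular matrix: the base has
  ! magnitude radix(det) and carries the sign; det(2) is the exponent.
  det(1) = -real(radix(det), dp)
  det(2) = 10.0_dp

  ! With radix 2 this pair stands for -(2**10).
  print *, 'determinant =', det_value(det)

contains

  function det_value(d) result(v)
    real(dp), intent(in) :: d(2)
    real(dp) :: v
    if (d(1) == 0.0_dp) then
      v = 0.0_dp          ! singular matrix: base is zero
    else
      ! abs(det(1))**det(2) * det(1)/abs(det(1))
      v = abs(d(1))**d(2) * (d(1)/abs(d(1)))
    end if
  end function det_value

end program det_from_pair
```

The guard for a zero base avoids raising zero to the power -huge in the singular case. The expression is only meaningful when det(2) is within exponent range, as the text notes.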
91. $= 0,\ |x| \ge 1$, is an unnormalized probability distribution. This function is similar to the standard Normal distribution, with specific choices for the mean and variance, except that it is truncated. Our algorithm interpolates $g(x)$ with a natural cubic spline, $f(x)$. The cumulative distribution is approximated by precise evaluation of the function $q(x) = \int_{-1}^{x} f(t)\,dt$. Gauss-Legendre quadrature formulas, IMSL (1994, pp. 621-626), of order two are used on each polynomial piece of $f(t)$ to evaluate $q(x)$ cheaply. After normalizing the cubic spline so that $q(1) = 1$, we may then generate random numbers according to the distribution $f(x) \cong g(x)$. The values of $x$ are evaluated by solving $q(x) = u$, $-1 < x < 1$. Here $u$ is a uniform random sample. Newton's method, for a vector of unknowns, is used for the solution algorithm. Recalling the relation $\frac{d}{dx}\left(q(x) - u\right) = f(x)$, $-1 < x < 1$, we believe this illustrates a method for generating a vector of random numbers according to a continuous distribution function having finite support.
        use spline_fitting_int
        use linear_operators
        use Numerical_Libraries
        implicit none
        ! This is Example 3 for SPLINE_FITTING. Use splines to
        ! generate random (almost normal) numbers. The normal distribution
        ! function has support (-1,+1), and is zero outside this interval.
        ! The variance is 0.5.
        integer i, niterat
        integer, parameter :: iweight=1, nfix=0, nord=4, ndata=50
        integer, parameter ::
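The Newton iteration for solving q(x) = u on a vector of samples can be sketched without the spline machinery. This sketch is an assumption-laden simplification: it replaces the spline f(x) and its quadrature by the exact truncated density exp(-x**2/2) and its closed-form integral through the ERF intrinsic (Fortran 2008), so it shows only the solution step, not the library example. All names are hypothetical.

```fortran
program newton_inverse_cdf
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n_samples = 5, max_iter = 25
  real(dp) :: u(n_samples), x(n_samples), step(n_samples)
  real(dp) :: z, e1
  integer :: it

  e1 = erf(1.0_dp/sqrt(2.0_dp))
  z  = sqrt(2.0_dp*acos(-1.0_dp)) * e1   ! integral of exp(-t**2/2) over [-1,1]

  call random_number(u)                  ! uniform samples
  x = 2.0_dp*u - 1.0_dp                  ! starting guess inside [-1,1]

  do it = 1, max_iter
    ! Newton step for q(x) - u = 0, using q'(x) = f(x) = exp(-x**2/2)/z.
    step = (q(x) - u) / (exp(-0.5_dp*x**2)/z)
    x = max(-1.0_dp, min(1.0_dp, x - step))   ! keep iterates in the support
    if (maxval(abs(step)) < 1.0e-12_dp) exit
  end do

  print *, 'samples     :', x
  print *, 'max residual:', maxval(abs(q(x) - u))

contains

  ! Normalized cumulative distribution on [-1,1], so that q(1) = 1.
  elemental function q(t) result(c)
    real(dp), intent(in) :: t
    real(dp) :: c
    c = (erf(t/sqrt(2.0_dp)) + e1) / (2.0_dp*e1)
  end function q

end program newton_inverse_cdf
```

The whole vector is updated in one array statement per iteration, which mirrors the "vector of unknowns" formulation in the text. Clamping to the interval is an added safeguard, not something the excerpt specifies.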
92. lin_sol_self, using the appropriate options to obtain the Cholesky factorization. The option and derived type names are given in the following tables:
        Option Names for CHOL     Option Value
        use_lin_sol_gen_only      4
        use_lin_sol_lsq_only      5

        Derived Type    Name of Unallocated Array
        s_options       s_chol_options
        s_options       s_chol_options_once
        d_options       d_chol_options
        d_options       d_chol_options_once
    Example
    Compute the Cholesky factor of a positive-definite symmetric matrix:
        B = A .tx. A
        R = CHOL(B)
        B = R .tx. R
    COND
    Compute the condition number of a rectangular matrix, A. The condition number is the ratio of the largest and the smallest positive singular values, $s_1 / s_{\mathrm{rank}(A)}$, or huge(A), whichever is smaller.
    Required Argument
    This function requires one argument. This argument must be a rank-2 or rank-3 array. For rank-3 arrays, each rank-2 array section (for fixed third subscript) is a separate problem. In this case, the output is a rank-1 array of condition numbers for each problem.
    Modules
    Use the appropriate one of the modules: cond_int or linear_operators.
    Optional Variables, Reserved Names
    This function uses lin_sol_svd (see Chapter 1, "Linear Solvers," lin_sol_svd) to compute the singular values of A. The option and derived type names a
93. linear algebraic equations in a least-squares sense. It computes the factorization of A known as the singular value decomposition. This decomposition has the following form: $A = USV^T$. The matrices U and V are orthogonal. The matrix S is diagonal, with the diagonal terms non-increasing. See Golub and Van Loan (1989, Chapters 5.4 and 5.5) for further details.
    Example 2: Polar Decomposition of a Square Matrix
    A polar decomposition of an n x n random matrix is obtained. This decomposition satisfies $A = PQ$, where P is orthogonal and Q is self-adjoint and positive definite. Given the singular value decomposition $A = USV^T$, the polar decomposition follows from the matrix products $P = UV^T$ and $Q = VSV^T$. This example uses the optional arguments "u=", "s=", and "v=", then array intrinsic functions to calculate P and Q. Also see operator_ex14, Chapter 6.
        use lin_sol_svd_int
        use rand_gen_int
        implicit none
        ! This is Example 2 for LIN_SOL_SVD.
        integer i
        integer, parameter :: n=32
        real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
        real(kind(1d0)) a(n,n), b(n,0), ident(n,n), p(n,n), q(n,n), &
            s_d(n), u_d(n,n), v_d(n,n), x(n,0), y(n*n)
        ! Generate a random matrix.
        call rand_gen(y)
        a = reshape(y,(/n,n/))
        ! Compute the singular value decomposition.
        call lin_sol_svd(a, b, x, nrhs=0, s=s_d, &
            u=u_d, v=v_d)
        ! Compute the (left) orthogonal fa
94. nquad nord 1 2 ndegree nord 1 integer parameter nbkpt ndatat t2 ndegree ncoeff nbkpt nord integer parameter last nbkpt ndegree n_samples 1000 integer parameter limit 10 real kind le0 dimension n_samples fn rn x alpha_x beta_x INTEGER LEFT_OF n_samples real kind le0 parameter one le0 half 5e 1 zero 0e0 two 2e0 real kind le0 parameter delta_x two ndata 1 real kind le0O parameter gqalpha zero qbeta zero domain two real kind le0O qx nquad qxi nquad qw nquad qxfix nquad real kind le0 alpha_ beta_ quad 0 ndata 1 106 Chapter 4 Curve and Surface Fitting with Splines IMSL Fortran 90 MP Library 4 0 real kind le0 target xdata ndata ydata ndata coeff ncoeff amp spline_data 3 ndata bkpt nbkpt real kind le0 pointer pointer_bkpt type s_spline_knots break_points type s_spline_constraints constraints 2 Approximate the probability density function by splines xdata onet i 1 delta_x i 1 ndata ydata exp half xdata 2 spline_data 1l xdata spline_data 2 ydata spline_data 3 one bkpt one i nord delta_x i 1 nbkpt Assign the degr of the polynomial and the knots pointer_bkpt gt bkpt break_points s_spline_knots ndegree pointer_bkpt Define the natural derivatives constraints constraints 1 spline_constraints amp derivative 2 point bkpt nord type amp value on
95. on the imaginary parts if count abs d abs e gt eta variation 0 then write Example 4 for LIN_EIG_GEN operators is correct end if end Operator_ex33 use linear_operators implicit none This is Example 1 using operators for LIN_GEIG_GEN integer parameter n 32 real kind 1d0 parameter one 1d0 real kind 1d0 A n n B n n beta n beta_t n err complex kind 1d0 alpha n alpha_t n V n n Generate random matrices for both A and B A rand A B rand B Compute the generalized eigenvalues alpha EIG A B B D beta Compute the full decomposition once again A V B V values and check for any error messages alpha_t EIG A B B D beta_t W V Use values from the first decomposition vectors from the second decomposition and check for small residuals err norm A x V x diag beta B x V x diag alpha 1 amp norm A 1 norm beta 1 norm B 1 norm alpha 1 if err lt sqrt epsilon one then write Example 1 for LIN_GEIG_GEN operators is correct end if end Operator_ex34 use linear_operators implicit none This is Example 2 using operators for LIN_GEIG_GEN integer parameter n 32 real kind 1d0 parameter one 1d0 zero 0d0 real kind 1d0 err alpha n complex kind 1d0 dimension n n A B C D V Generate random matrices for bot
96. optional items. Illustration: In Example 3 in Chapter 2, "Singular Value and Eigenvalue Decomposition," of lin_eig_self, a new definition for a small diagonal term is passed to lin_sol_self. There is one line of code required for the change and the new tolerance:
        iopt(1) = d_options(d_lin_sol_self_set_small, epsilon(one)*abs(d(i)))
    3. The internal processing of option numbers stops when Option_number == 0 or when IO > size(iopt). This sends a signal to each routine having this optional argument that all desired changes to default values of internal parameters have been made. This implies that the last option number is the value zero or the value of size(iopt) matches the last optional value changed.
    4. To add more options, replace IO with IO + n, where n is the number of items required for the previous option. Go to Step 2.
    Option numbers can be written in any order, and any selected set of options can be chosen to be changed from the defaults. They may be repeated. Example 3 in Chapter 1, "Linear Solvers," of lin_sol_self uses three and then four option numbers for purposes of computing an eigenvector associated with a known eigenvalue.
    Combining Fortran 90 and FORTRAN 77 Routines
    Users will often want to combine FORTRAN 77 application software with IMSL Fortran 90 MP Library routines. This section deals with the rules that a programmer must follow to accomplish this. Fortran 90 arrays are no longer required to be stored
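The stopping rule in Steps 3 and 4 of excerpt 96 can be illustrated with a small scanning loop. This is a sketch only: opt_t stands in for the library's options type (an integer idummy followed by a floating-point rdummy, as the guide describes), and the option numbers set_small and save_factors are hypothetical, not the library's values.

```fortran
program option_scan
  implicit none
  ! Hypothetical stand-in for the library's d_options type.
  type :: opt_t
    integer :: idummy
    real(kind(1d0)) :: rdummy
  end type opt_t

  ! Hypothetical option numbers, for illustration only.
  integer, parameter :: set_small = 1, save_factors = 2
  type(opt_t) :: iopt(4)
  real(kind(1d0)) :: small
  logical :: save_lu
  integer :: io

  small = epsilon(1d0)       ! default tolerance
  save_lu = .false.          ! default: factors are not kept

  iopt(1) = opt_t(set_small, 1d-10)    ! option with a value, one item
  iopt(2) = opt_t(save_factors, 0d0)   ! flag-only option, one item
  iopt(3) = opt_t(0, 0d0)              ! option number zero ends the scan
  iopt(4) = opt_t(set_small, 1d0)      ! never reached

  io = 1
  do while (io <= size(iopt))          ! stop when IO > size(iopt)
    if (iopt(io)%idummy == 0) exit     ! stop when Option_number == 0
    select case (iopt(io)%idummy)
    case (set_small)
      small = iopt(io)%rdummy
    case (save_factors)
      save_lu = .true.
    case default
      print *, 'unknown option number', iopt(io)%idummy
    end select
    io = io + 1                        ! IO + n, with n = 1 item per option here
  end do

  print *, 'small   =', small
  print *, 'save_lu =', save_lu
end program option_scan
```

Every option in this sketch occupies a single array entry, so the increment is always one. An option that carries several items would advance IO by that count instead.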
97. other problems in consecutive racks of the box. We use parallelism of an underlying network of processors when computing these disjoint problems.
    In addition to the operators .ix., .xi., .i., and .x., additional operators .t., .h., .tx., .hx., .xt., and .xh. are provided for complex matrices. Since the transpose matrix is defined for complex matrices, this meaning is kept for the defined operations. In order to write one defined operation for both real and complex matrices, use the conjugate-transpose in all cases. This will result in only real operations when the data arrays are real.
    For sums and differences of vectors and matrices, the intrinsic array operations "+" and "-" are available. It is not necessary to have separate defined operations. A parsing rule in Fortran 90 states that the result of a defined operation involving two quantities has a lower precedence than any intrinsic operation. This explains the parentheses in the next-to-last line of the example around the sub-expression "A .x. y". Users are advised to always include parentheses around array expressions that are mixed with defined operations, or whenever there is possible confusion without them. The next-to-last line of the example results in computing the residual associated with the solution, namely r = b - Ay. Ideally, this residual is zero when the system has a unique solution. It will be computed as a non-zero vector due to rounding errors and co
98. output matrix is transformed to upper Hessenberg form, H, which is block upper triangular. The dimensions of the blocks are either 2 x 2 or unit sized. Nonzero subdiagonal values of H determine the size of the blocks. If the optional argument "v=V" is passed by the calling program unit, then the array V contains an orthogonal matrix Q such that AQ - QH = 0. Requires the simultaneous use of option ?_lin_eig_no_balance. Default: The matrix is reduced to diagonal form.
    iopt(IO) = ?_options(?_lin_eig_gen_out_tri_form, ?_dummy)
    The output matrix is transformed to upper-triangular form, T. If the optional argument "v=V" is passed by the calling program unit, then the array V contains a unitary matrix W such that AW - WT = 0. The upper-triangular matrix T is returned in the optional argument "tri=T". The eigenvalues of A are the diagonal entries of the matrix T. They are in no particular order. The output array E is blocked with NaNs using this option. This option requires the simultaneous use of option ?_lin_eig_no_balance. Default: The matrix is reduced to diagonal form.
    iopt(IO) = ?_options(?_lin_eig_gen_continue_with_V, ?_dummy)
    As a convenience or for maintaining efficiency, the calling program unit sets the optional argument "v=V" to a matrix that has transformed a problem to th
99. product B splines to least squares fit a data set historically due to Ferguson Reset regularization and constrain the surface to be non negative Surface is fit twice Transpose a distributed matrix in place Compute product of distributed matrices Solve a distributed linear system with ScaLAPACK Solve a large system of linear inequalities Solve a large linear least squares system with non negativity constraints Solve a large system with linear equality and inequality constraints Solve a large non linear equation with bounded least squares as step control Solve an electrodynamics model PDE problem Solve for inviscid flow on a plate a model PDE problem Solve a population dynamics simulation an integro differential PDE problem Solve a model PDE problem in cylindrical coordinates Solve a flame propagation model PDE problem Solve a hot spot model PDE problem Solve for interacting waves a model PDE problem Solve the Black Scholes PDE for a European call option Study many values of a parameter found in example pde_ex1 Use several processes and MPI for communicating results N N NNN eao o v o amp v o IMSL Fortran 90 MP Library 4 0 Appendix C References References Adams et al Adams Jeanne C W S Brainerd J T Martin B T Smith and J L Wagener 1992 Fortran 90 Handbook Complete ANSI ISO Reference McGraw Hill Book Co New York Aird and Howell Aird Thomas J
100. rand_gen separately calculate rate size X ticks per sec average 104 x 50 3 6 138 889 numbers sec 0 139 million numbers sec Fortran 90 Codes FORTRAN 77 Codes Number Program Units Timed Timed 1 time_dft f 90 fast_dft EECeL Eiteb s_dft_bench f90 d_dft_bench f 90 dfftcf dfftcb 2 time_eig_gen f90 lin_eig_gen e8crg de8crg s_eig_gen_bench f90 d_eig_gen_bench f90 3 time_eig_self f90 lin_eig_self e5csf de5csf s_eig_self_bench f90 d_eig_self_bench f90 4 time_geig_gen f90 lin_geig_gen g8crg dg8crg s_geig_gen_bench f90 d_geig_gen_bench f90 5 time_inv_chol f90 lin_sol_self 1l2nds dl2nds s_inv_chol_bench f90 d_inv_chol_bench f90 6 time_inv_gen f90 lin_sol_gen l2nrg dl2nrg s_inv_gen_bench f90 d_inv_gen_bench f90 7 time_inv_lsq f90 lin_sol_lsq lsgrr dlsgrr s_inv_lsq_bench f90 d_inv_lsq_bench f90 8 time_inv_self f90 lin_sol_self lftsf lfssf s_inv_self_bench f90 dlftsf dlfssf d_inv_self_bench f90 9 time_rand_gen f90 rnun drnun s_inv_rand_bench f90 d_inv_rand_bench f90 Table B Fortran 90 and FORTRAN 77 Comparisons IMSL Fortran 90 MP Library 4 0 Appendix D Benchmarking or Timing Programs e D 3 Number 10 Program Units Paes sol chol fo s_inv_sol_chol f90 d_inv_sol_chol f90 time_sol_gen f90 s_sol_gen_bench f90 d_sol_gen_bench f90 sol_lsq f90 s_sol_lsq_bench f90 d_sol_lsq_bench 90 time_sol_self f90 s_sol_self_bench f 90 d_sol_self_bench f9
101. rank linear systems of equations
    - general and symmetric positive-definite banded linear systems of equations
    - general and symmetric positive-definite tri-diagonal linear systems of equations
    - condition number estimation and iterative refinement for LU and Cholesky factorization
    - matrix inversion
    - full-rank linear least-squares problems
    - orthogonal and generalized orthogonal factorizations, orthogonal transformation routines
    - reductions to upper Hessenberg, bidiagonal, and tridiagonal form
    - reduction of a symmetric-definite generalized eigenproblem to standard form
    - the self-adjoint or Hermitian eigenproblem
    - the generalized self-adjoint or Hermitian eigenproblem, and
    - the non-symmetric eigenproblem
    ScaLAPACK routines are available in four data types: single precision real, double precision real, single precision complex, and double precision complex. At present, the non-symmetric eigenproblem is only available in single and double precision. More background information and user documentation is available on the World Wide Web at location http://www.netlib.org/scalapack/slug/scalapack_slug.html
    For users with rank deficiency or simple constraints in their linear systems or least-squares problem, we have routines for:
    - full or deficient rank least-squares problems with non-negativity constraints
    - full or deficient rank least-squares problems with simple upper and lower bound constraints
    These
102.    real(kind(1e0)) A(n,n)
        real(kind(1d0)) B(n,n)
        real(kind(1e0)), external :: s_NaN
        real(kind(1d0)), external :: d_NaN
        ! Assign NaNs to both A and B:
        A = s_Nan(1e0)
        B = d_Nan(1d0)
        ! Check that NaNs are noted in both A and B:
        if (isNan(A) .and. isNan(B)) then
          write (*,*) 'Example 1 for NaN is correct.'
        end if
        end
    Optional Arguments
    There are no optional arguments for this routine.
    Description
    The bit pattern used for single precision is transfer(not(0),1). For double precision, the bit pattern for single precision is replicated by assigning the temporary integer array i(1:2) = not(0), and then using the double-precision bit pattern transfer(i,x) for the output value.
    Fatal and Terminal Error Messages
    This routine has no error messages.
    NORM
    Compute the norm of a rank-1 or rank-2 array. For rank-3 arrays, the norms of each rank-2 array, in dimension 3, are computed.
    Required Arguments
    The first argument must be an array of rank-1, rank-2, or rank-3. An optional, second position argument can be used that provides a choice between the norms $l_1$, $l_2$, and $l_\infty$. If this optional argument, with keyword "type=", is not present, the $l_2$ norm is computed. The $l_1$ and $l_\infty$ norms are likely to be less expensive to compute than the $l_2$ norm. Use of the option number ?_reset_default_norm will switch the de
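The bit-pattern construction in the Description of excerpt 102 can be tried in a standalone program. This sketch assumes that default INTEGER and default REAL are both 32 bits wide and that the processor uses IEEE arithmetic. It uses a real mold in TRANSFER so that the result has real type, which is an interpretation of the text rather than the library's source, and it checks the result with the Fortran 2003 intrinsic module ieee_arithmetic.

```fortran
program nan_bits
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan
  implicit none
  real(kind(1e0)) :: s
  real(kind(1d0)) :: d
  integer :: i(2)

  ! Single precision: reinterpret an all-ones integer as a real.
  s = transfer(not(0), s)

  ! Double precision: replicate the all-ones pattern in two
  ! integers, then reinterpret the pair as one double.
  i(1:2) = not(0)
  d = transfer(i, d)

  print *, 'single is NaN:', ieee_is_nan(s)
  print *, 'double is NaN:', ieee_is_nan(d)
end program nan_bits
```

An all-ones pattern has every exponent bit set and a nonzero fraction, which is what makes it a NaN in IEEE formats.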
103. sce c n 1 1 se c xX u_c lin n 1 1 u_c x x vic lin n 1 1 vic x The columns of v_c and v_s have the same span They are equivalent by taking the signs of the largest magnitude values positive do i l n sc_c i sign one v_c sum maxloc abs v_c 1l n i i sc_s i sign one v_s sum maxloc abs v_s 1l n i i end do vic vic x diag sc_c u_c u_c x diag sc_c v_s v_s x diag sc_s u_S u_s x diag sc_s 194 e Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 In this form of the GSVD the matrix X can be unstable if D is ill conditioned X v_d x diag one s_d x v_c Check residuals for GSVD A X u_c diag c_l c_n and l B X u_s diag s_l S_n erri norm D lin x X u_c x diag C s_d 1 err2 norm D nt1l x X u_s x diag S s_d 1 if erri lt sqrt epsilon one and amp rr2 lt sqrt epsilon one then write Example 3 for LIN_SVD operators is correct end if end Operator_ex24 use linear_operators implicit none This is Example 4 using operators for LIN_SVD integer i integer parameter m 32 n 16 p 10 k 4 real kind 1d0 parameter one 1d0 real kind 1d0 log_lamda log_lamda_t delta_log_lamda real kind 1d0 a m n b m k w m k g m k t n s n amp s_sq n u m m v n n c_lamda p k amp lamda k x n k res n k
104. show_significant_digits_is_4
    iopt(IO) = show_significant_digits_is_7
    iopt(IO) = show_significant_digits_is_16
    These options allow more precision to be displayed. The default is 4D for each value. The other possible choices display 7D or 16D.
    iopt(IO) = show_line_width_is_44
    iopt(IO) = show_line_width_is_72
    iopt(IO) = show_line_width_is_128
    These options allow varying the output line width. The default is 72 characters per line. This allows output on many work stations or terminals to be read without wrapping of lines.
    iopt(IO) = show_end_of_line_sequence_is
    The sequence of characters ending a line when it is placed into the internal character buffer corresponding to the optional argument "IMAGE = buffer". The value of iopt(IO+1)%idummy is the number of characters. These are followed, starting at iopt(IO+2)%idummy, by the ASCII codes of the characters themselves. The default is the single character, ASCII value 10, or New Line.
    iopt(IO) = show_starting_index_is
    This is used to reset the starting index for a rank-1 array to a value different from the default value, which is 1.
    iopt(IO) = show_starting_row_index_is
    iopt(IO) = show_starting_col_index_is
    These are used to reset the starting row and column indices to values different from their defaults, each 1.
    Description
    The show routine is a generic subroutine interface to separate low
105. spread(x, DIM=1, NCOPIES=size(A,1))*A and D = B*spread(x, DIM=2, NCOPIES=size(B,2)). These array products are not as easy to read as the defined operations using DIAG and matrix multiply, but their use results in a more efficient code.
    Modules
    Use the appropriate one of the modules: diag_int or linear_operators.
    Optional Variables, Reserved Names
    This function has neither packaged optional variables nor reserved names.
    Example
    Compute the singular value decomposition of a square matrix A:
        S = SVD(A, U=U, V=V)
    Then reconstruct $A = USV^T$:
        A = U .x. diag(S) .xt. V
    DIAGONALS
    Extract a rank-1 array whose values are the diagonal terms of a rank-2 array argument. The size of the array is the smaller of the two dimensions of the rank-2 array. When the argument is a rank-3 array, the result is a rank-2 array consisting of each separate set of diagonals.
    Required Argument
    This function requires one argument, and the argument must be a rank-2 or rank-3 array. The output is a rank-1 or rank-2 array, respectively.
    Modules
    Use the appropriate one of the modules: diagonals_int or linear_operators.
    Optional Variables, Reserved Names
    This function has neither packaged optional variables nor reserved names.
    Example
    Compute the diagonals of the matrix product $RR^T$:
        x = DIAGONALS(R .xt. R)
    EIG
    Compute the eigenvalue eigenv
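The SPREAD-based products at the start of excerpt 105 can be checked against explicit multiplication by a diagonal matrix using only standard intrinsics. In this sketch the matrix Dx is built by hand for the comparison and is not a library call; with standard SPREAD semantics, copying x along dimension 1 scales columns and copying along dimension 2 scales rows.

```fortran
program spread_scaling
  implicit none
  integer, parameter :: n = 3
  real(kind(1d0)) :: A(n,n), x(n), Dx(n,n), C(n,n), D(n,n)
  integer :: i

  call random_number(A)
  x = (/ (real(i, kind(1d0)), i = 1, n) /)

  ! Explicit diagonal matrix, used only to check the results.
  Dx = 0
  do i = 1, n
    Dx(i,i) = x(i)
  end do

  ! Copies of x stacked along dimension 1: element (i,j) is x(j),
  ! so the product scales column j of A by x(j), that is A*diag(x).
  C = spread(x, DIM=1, NCOPIES=size(A,1)) * A

  ! Copies of x stacked along dimension 2: element (i,j) is x(i),
  ! so the product scales row i of A by x(i), that is diag(x)*A.
  D = A * spread(x, DIM=2, NCOPIES=size(A,2))

  ! Both differences should be zero up to rounding.
  print *, 'max |C - A diag(x)| =', maxval(abs(C - matmul(A, Dx)))
  print *, 'max |D - diag(x) A| =', maxval(abs(D - matmul(Dx, A)))
end program spread_scaling
```

The elementwise products cost on the order of n**2 operations against n**3 for a full matrix multiply, which is the efficiency gain the text refers to.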
106. string is embedded in any message, a new line immediately starts.
    • The routines E1ST<L A I R D C Z> are called before calling E1MES to issue an error message. The values defined by these routines are discarded after the reference to E1MES.
    • The function reference N1RCD(i) returns the error code. If i = 0, the code for the current level is returned; if i = 1, the code for the most recently called routine (last pop) is returned.
    • Likewise, N1RTY(i) returns the error type.
    • The function reference IERCD() returns N1RCD(1) if N1RTY(1) is 1 to 4, and 0 otherwise.
    • The INTEGER functions IERCD, N1RCD, and N1RTY return current information about the status of an error if the stack is not empty. In the scalar version of the error message code, this stack was always kept with at least one name pushed on it. In the parallel version of the error message library, this is not so, due to the need for synchronization of error printing. If a call to IERCD, N1RCD, or N1RTY is being made to handle the occurrence of an error in a top-level routine, then the programmer should first call E1PSH('ROUTINE_NAME') before the call to the subprogram in question. Here ROUTINE_NAME can be any name. After the call to IERCD, N1RCD, or N1RTY, the programmer should make a call to E1POP('ROUTINE_NAME'). This is not an issue for code bracketed between calls to MP_SETUP() and MP_SETUP('Final').
107. the box SVD function, a code is given that computes the singular value decomposition and the reconstruction of the random matrix box, A. Using the computed factors, R = U S V^T. Mathematically R = A, but this will be true only approximately due to rounding errors. The value units_of_error = ||A - R|| / (||A|| ε) shows the merit of this approximation.
        USE linear_operators
        USE mpi_setup_int
        integer, parameter :: n=3, k=16
        real, dimension(n,n,k) :: A, U, V, R, S(n,k), units_of_error(k)
        MP_NPROCS = MP_SETUP()          ! Set up MPI.
        A = rand(A); S = SVD(A, U=U, V=V)
        R = U .x. diag(S) .xt. V; units_of_error = norm(A-R)/S(1,1:k)/epsilon(A)
        MP_NPROCS = MP_SETUP('Final')   ! Shut down MPI.
        end
    Parallelism Using MPI
    General Remarks (MPI REQUIRED)
    The central theme we use for the computing functions of the box data type is that of delivering results to a distinguished node of the machine. One of the design goals was to shield much of the complexity of distributed computing from the user. The nodes are numbered by their "ranks." Each node has rank value MP_RANK. There are MP_NPROCS nodes, so MP_RANK = 0, 1, ..., MP_NPROCS-1. The root node has MP_RANK = 0. Most of the elementary MPI material is found in Gropp, Lusk, and Skjellum (1994) and Snir, Otto, Huss-Lederman, Walker, and Dongarra (1996). Although
108. these two lines WRITE 7 F10 5 TO DO I 1 NPDE 1 WRITE 7 4E15 5 U I END DO Define differential equations CASE 6 XVAL D_PDE_1D_MG_X D_PDE_1D_MG_C ONE D_PDE_1D_MG_R D_PDE_1D_MG_DUDX XVAL 2 SIGSQ HALF D_PDE_1D_MG_Q R SIGSQ XVAL D_PDE_1D_MG_DUDX R D_PDE_1D_MG_U Define boundary conditions CASE 7 IF PDE_1D_MG_LEFT THEN D_PDE_1D_MG_BETA ZERO D_PDE_1D_MG_GAMMA D_PDE_1D_MG_U ELSE D_PDE_1D_MG_BETA ZERO D_PDE_1D_MG_GAMMA D_PDE_1D_MG_DUDX 1 ONE END IF END SELECT Reverse communication is used for the problem data CALL PDE_1D_MG TO TOUT IDO U IOPT IOPT END DO end program
    Example 9: Electrodynamics, Parameters Studied with MPI
    This example, described above in Example 1, is from Blom and Zegeling (1994). The system parameters ε, p, and η are varied, using uniform random numbers. The intervals studied are 0.1 < ε < 0.2, 0.1 < p < 0.2, and 10 < η < 20. Using N = 21 grid values and other program options, the elapsed time, parameter values, and the value v(x,t) at x = 1, t = 4 are sent to the root node. This information is written on a file. The final summary includes the minimum value of v(x,t) at x = 1, t = 4, and the maximum and average time per integration, per node.
    Rationale
    This is a non-linear simulation problem. Using at least two integrating processors and MPI a
109. users should follow a check list:
    1. Choose the degree of the piece-wise polynomials (spline function) and their knots. Use the IMSL DNFL derived type s_spline_knots or d_spline_knots to define this data for use as an argument to the fitting routine. These derived types are discussed below.
    2. Choose the constraints that the spline function must satisfy. Use the generic derived type function spline_constraints for defining this optional information to be passed to the fitting routine. This derived type is discussed below.
    3. Define the data values to be fit. These are triples of independent and dependent variable values, (x_i, y_i), i = 1, ..., ndata, and uncertainty. Each dependent variable value requires an estimate of its uncertainty, σ_i.
    4. Use the array function spline_fitting to compute the coefficients of the B-spline.
    5. With the coefficients obtained in the previous step, the array function spline_values evaluates the spline, its derivatives, or the square root of its variance.
    The Derived Types s_knots and d_knots
    The user defines the polynomial degree of the B-spline, which is one less than its order, and the knots or breakpoints for this set of data. We have packaged the derived types
        type ?_spline_knots
            integer spline_degree
            real(kind_?), pointer :: ?_knots(:)
        end type
    Here the ?_ is either s_ or d_ for single or double precision, respectively. The definitions of these derived types are in the module M
110. x
    Computes the Discrete Fourier Transform (DFT) of a rank-3 complex array, x. See Chapter 3. GAMS: J1b.
    Detect an IEEE NaN (not-a-number). See Chapter 6. GAMS: R1.
    Computes the eigenvalues of an n × n matrix, A. Optionally, the eigenvectors of A or A^T are computed. Using the eigenvectors of A gives the decomposition AV = VE, where V is an n × n complex matrix of eigenvectors, and E is the complex diagonal matrix of eigenvalues. Other options include the reduction of A to upper triangular or Schur form, reduction to block upper triangular form with 2 × 2 or unit sized diagonal block matrices, and reduction to upper Hessenberg form. See Chapter 2. GAMS: D4a2, D4a4.
    lin_eig_self: Computes the eigenvalues of a self-adjoint matrix, A. Optionally, the eigenvectors can be computed. This gives the decomposition A = V D V^T, where V is an n × n orthogonal matrix and D is a real diagonal matrix. See Chapter 2. GAMS: D4a1, D4a3.
    lin_geig_gen: Computes the generalized eigenvalues of an n × n matrix pencil, Av = λBv. Optionally, the generalized eigenvectors are computed. If either of A or B is nonsingular, there are diagonal matrices α and β, and a complex matrix V, computed such that AVβ = BVα. See Chapter 2. GAMS: D4b1, D4b2, D4b4.
    lin_sol_gen: Solves a general system of linear equations Ax = b. Using optional arguments, any of several related computations can be performed. These extra tasks inclu See Chapter 1. GAMS: D2a1, D2c1.
111. xt Py amp lt sqrt epsilon one then if norm A P x Q norm A amp IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 183 lt sqrt epsilon one then write Example 2 for LIN_SOL_SVD operators is correct end if end if end Operator_ex15 use linear_operators implicit none This is Example 3 for LIN_SOL_SVD integer i j k integer parameter n 32 real kind le0 parameter half 0 5e0 one le0 zero 0e0 real kind le0 dimension n n A S n U V C Fill in value one for points inside the circle zero on the outside A zero DO i l n DO j l n if i n 2 2 j n 2 2 lt n 4 2 A i j one END DO END DO Compute the singular value decomposition S SVD A U U V V How many terms to the nearest integer match the circle k count S gt half C U 1 k x diag S 1 k xt V 1 k if count int C A 0 0 then write Example 3 for LIN_SOL_SVD operators is correct end if end Operator_ex16 use linear_operators implicit none This is Example 4 operators for LIN_SOL_SVD integer i j k integer parameter m 64 n 16 real kind le0 parameter one le0 zero 0e0 real kind le0 g m s m t n 1 a m n f n U_S m m amp V_S n n S_S n real kind le0 delta_g delta_t rms oldrms Compute collocation equation
112. 0, A(m,n), b(m,1), x(n,1), y(m*n), err
    ! Generate a random matrix and right-hand side.
        call rand_gen(y)
        A = reshape(y, (/m,n/))
        call rand_gen(b(1:m,1))
    ! Compute the least-squares solution matrix of Ax = b.
        call lin_sol_svd(A, b, x)
    ! Check that the residuals are orthogonal to the
    ! column vectors of A.
        err = sum(abs(matmul(transpose(A), b - matmul(A,x))))/sum(abs(A))
        if (err <= sqrt(epsilon(one))) then
            write (*,*) 'Example 1 for LIN_SOL_SVD is correct.'
        end if
        end
    Optional Arguments
    MROWS = m (Input): Uses array A(1:m, 1:n) for the input matrix. Default: m = size(A,1)
    NCOLS = n (Input): Uses array A(1:m, 1:n) for the input matrix. Default: n = size(A,2)
    NRHS = nb (Input): Uses the array b(1:, 1:nb) for the input right-hand side matrix. Default: nb = size(b,2). Note that b must be a rank-2 array.
    RANK = k (Output): Number of singular values that are at least as large as the value Small. It will satisfy k <= min(m, n).
    u = u(:,:) (Output): Array of the same type and kind as A(1:m, 1:n). It contains the m × m orthogonal matrix U of the singular value decomposition.
    s = s(:) (Output): Array of the same precision as A(1:m, 1:n). This array is real even when the matrix data is complex. It contains the m × n diagonal matrix S in a rank-1 array. The singular values are nonnegative and ordered non-increasing.
    v = v(:,:) Ou
113. 0 iopt 6 ival 7 amp iwk 35 n real kind le0 hx pi_value t u_0 u_l atol rtol sval 2 amp tend wk 41 11 n y n ypr n a_diag n amp a_off n r_diag n r_off n t_y n t_ypr n amp t_g n t_diag 2 n 1 t_upper 2 n 1 amp t_lower 2 n 1 t_sol 2 n 1 type s_options iopti 2 s_options 0 zero character 2 pi 1l pi Define initial data t 0 0e0 u_0 1 u_l 0 5 tend one Initial values for the variational equation y one ypr zero pi_value const pi hx pi_value n 1 a_diag 2 hx 3 a_off hx 6 r_diag 2 hx r_off 1 hx Get integer option numbers iopt 1 inum call iumag math ichap iget 1 iopt in Get floating point option numbers iopt 1 irnum call iumag math ichap iget 1 iopt inr jra Set for reverse communication evaluation of the DA iopt 1 in 26 ival 1 0 Set for use of explicit partial derivatives iopt 2 in 5 ival 2 1 Set for reverse communication evaluation of partials iopt 3 in 29 ival 3 0 Set for reverse communication solution of linear equations iopt 4 in 31 ival 4 0 Storage for the partial derivative array are not allocated or required in the integrator IMSL Fortran 90 MP Library 4 0 Chapter 1 Linear Solvers 43 iopt 5 in 34 ival 5 1 Set the sizes of iwk wk for internal checking iopt 6 in 35 ival 6
114. 0, time_svd.f90, s_svd_bench.f90, d_svd_bench.f90, time_tri.f90, s_tri_bench.f90, d_tri_bench.f90, time_mult.f90, s_mult_bench.f90, d_mult_bench.f90, time_
    Fortran 90 Codes Timed: FORTRAN 77 Codes Timed
    lin_sol_self: lftds, lfsds, dlftds, dlfsds
    lin_sol_gen: lftrg, lfsrg, dlftrg, dlfsrg
    lin_sol_lsq: l2rrv, dl2rrv
    lin_sol_self: lftsf, lfssf, dlftsf, dlfssf
    lin_svd: lsvrr, dlsvrr
    lin_sol_tri: lslcr, dlslcr
    matmul D E
    Table B (continued): Fortran 90 and FORTRAN 77 Comparisons
    Notes on the comparable problems:
    1. Perform forward and backward DFT of a random complex sequence of size NSIZE.
    2. Compute eigenexpansion of a random real matrix of dimension NSIZE × NSIZE.
    3. Compute eigenexpansion of a random symmetric real matrix of dimension NSIZE × NSIZE.
    4. Compute generalized eigenexpansion of a random matrix pencil of dimension NSIZE × NSIZE.
    5. Compute the inverse of a positive definite real matrix of dimension NSIZE × NSIZE. Uses Cholesky method.
    6. Compute the inverse of a general real random matrix of dimension NSIZE × NSIZE. Uses LU factorization.
    7. Compute the generalized inverse of a general real random matrix of dimension (2 × NSIZE) × NSIZE. Uses QR factorization for Fortran 90 and SVD for FORTRAN 77.
115. 0 MP Library 4 0 r CALL PD __1D_MG TO TOUT IDO U amp initial_conditions IC_01 amp PD E_system_definition PDE _01 amp ry boundary_conditions BC_01 END DO ry ROUTIN IMPLICIT NO re 01 NPD ay r This is the initial data for N ai INTEGER NPD F NPTS REAL KIND 1 U 1 1D0 SUBRO UTINI ry 0 U NPD U 2 0D0 D ROUTINE PD s the diff 01 T ntial IMPLICIT NO r N ah INTEGER NPD IR ES REA KIND NPDE NP r 0 T X Q NPD EAL KIND 1 TA 17 19 za OD0 C 1 1 P DUDX R U 1 OpZ 1 R 1 NPTS X NPD E EPS 0 143D0 TWO EPS U Example E 1 NPTS E U quation for U NPDI E R NPD DUDX C 1 IRI Q Example 1 R DUDX NPDE E P 0 1743D0 amp r 2D0 THR E 3D0 L 1D0 C 2 2 1D0 U 2 THR EXP Z Q 2 Q 1 SUBROUTIN i ry ROUTIN IRES IMPLICIT BC_ NONI O1 T B ETA ry TEGER NPD E IR OGICAL L 1 U N EF I L REAL KIND 1 E PD r T EXP TWO Z GAMMA These are the boundary conditions for DO T B
116. 1 D tx y l m 1 call lin_sol_self F g h amp pivots ipivots iopt iopti y hot y change_new norm h Exit when changes are no longer decreasing if change_new gt change_old amp xit iterative_refinement change_old change_new Use option to re enter code with factorization saved solve only iopti 2 s_lin_sol_self_solve_A end do iterative_refinement write Example 4 for LIN_SOL_SELF operators is correct end Operator_ex09 use linear_operators use Numerical_Libraries implicit none This is Example 1 for LIN_SOL_LSQ using operators and functions integer i integer parameter m 128 n 8 real kind 1d0 parameter one 1d0 zero 0d0 real kind 1d0 A m O n c O n pi_over_2 x m y m amp u m v m w m delta_x CHARACTER 2 PI 1 Generate a random grid of points and transform to the interval 1 1 x rand x x x 2 one Get the constant PI 2 from IMSL Numerical Libraries PI pi pi_over_2 DCONST PI 2 Generate function data on the grid y exp x cos pi_over_2 x Fill in the least squares matrix for the Chebyshev polynomials A 0 one A 1 x IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 179 do i 2 n A i 2 x A i 1 A i 2 end do Solve for the series coefficients c A ix y Generate an equally spaced grid on the
117. 10d0 K END DO D DO ti OPEN UNIT NIN FILE test dat STATUS UNKNOWN Write the data by columns DO J 1 N NB WRITE NIN A I L I 1 M L J min N J NB 1 END DO CLOSE NIN END cle IF MP_RANK 0 THEN DEALLOCATE A ALLOCATE A N M END IF Define the descriptor for the global matrix ESC_A 1 CONTXT M 7 WIR INE O apy O E4 Read the matrix into the local arrays CALL ScaLAPACK_READ test dat DESC_A d_A To transpose write the matrix by rows as the first step This requires an option since the default is to write by columns IOPT 1 ScaLAPACK_WRITE_BY_ROWS CALLY ScalAPACKAWRETE G2 LE Sle DAC a a DES CEAS CA LOLs ROEA Resize the local storage and read the transpose matrix DEALLOCATE d_A LDA NUMROC N MB YROW 0 NPROW TDA NUMROC M NB ViC Oly anO aN SOl ALLOCATE d_A LDA TDA Reshape the descriptor for the transpose of the matrix The number of rows and columns are swapped ESCPAS VAT ae NIL XSI ENEN ay ENEO Orpen AW 2 m CALL ScaLAPACK_READ TEST DAT DESC_A d_A IF MP_RANK 0 HEN Open the used files and delete when closed OPEN UNIT NIN FILE test dat STATUS OLD CLOSE NIN S
118. 2 IMSL Fortran 90 MP Library 4 0 trend b_trend x_ 2 ncopies n n n 1 ncopies n n n trend tmul a_trend x_trend n n forward_out f Chapter 3 Fourier Transforms 89 Sort the magnitude of the transform call s_sort_real abs reshape f n n amp temp iperm ip The dominant frequencies are output in ip 1 k Sort these values to compare with the original frequency order call s_sort_real real ip 1 k new_order order 1 n i i 1 n order n 1 k n 1 i nt 1 k Check the results if count order int new_order 0 then write Example 2 for FAST_2DFT is correct end if end Example 3 Several 2D Transforms with Initialization In this example the optional arguments ido and work_array are used to save working variables in the calling program unit This results in maximum efficiency of the transform and its inverse since the working variables do not have to be precomputed following each entry to routine fast_2dft use fast_2dft_int implicit none This is Example 3 for FAST_2DFT integer i j integer parameter n 256 real kind le0 parameter one le0 zero 0e0 real kind le0 r n n err complex kind le0 a n n b n n c n n The value of the array size for work is computed in the routine fast_dft as a first step integer ido_value complex kind le0 allocatable work Fill
119. 2 * min(m, n) that contains data for the construction of the orthogonal decomposition.
    det = det(1:2) (Output): Array of size 2 of the same type and kind as A for representing the products of the determinants of the matrices Q, P, and R. The determinant is represented by two numbers. The first is the base with the sign or complex angle of the result. The second is the exponent. When det(2) is within exponent range, the value of this expression is given by abs(det(1))**det(2) * (det(1))/abs(det(1)). If the matrix is not singular, abs(det(1)) = radix(det); otherwise, det(1) = 0, and det(2) = -huge(abs(det(1))).
    ainv = ainv(:,:) (Output): Array with size n × m of the same type and kind as A(1:m, 1:n). It contains the generalized inverse matrix, A†.
    cov = cov(:,:) (Output): Array with size n × n of the same type and kind as A(1:m, 1:n). It contains the unscaled covariance matrix, C = (A^T A)^{-1}.
    iopt = iopt(:) (Input): Derived type array with the same precision as the input matrix; used for passing optional data to the routine. The options are as follows:
    Packaged Options for lin_sol_lsq
    Option Prefix, Option Name, Option Value
    z lin_sol_lsq_set_small
    z lin_sol_lsq_save_QR
    z lin_sol_lsq_solve_A
    z lin_sol_lsq_solve_ADJ
    lin_sol_lsq_no_row_pivoting
    lin_sol_lsq_no_col_pivoting
    z lin_sol_lsq_scan_for_NaN
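The two-number determinant representation described for det can be turned back into an ordinary value with the expression given in the text. The sketch below is only an illustration under assumptions: the function name det_value and the sample pairs are hypothetical, real (not complex) data is assumed, and the zero-base branch is added here so that the singular case does not divide by zero.

```fortran
program det_pair_demo
  ! Illustrative sketch (hypothetical names): evaluates a determinant stored
  ! as (base, exponent), using abs(det(1))**det(2) * det(1)/abs(det(1)).
  implicit none
  real :: pair(2)

  ! Base -2 carries the sign; exponent 10 gives -(2**10) = -1024.
  pair = [-2.0, 10.0]
  if (abs(det_value(pair) + 1024.0) > 1.0e-2) error stop 'unexpected value'

  ! Singular case: base 0 and exponent -huge, value taken as 0.
  pair = [0.0, -huge(1.0)]
  if (det_value(pair) /= 0.0) error stop 'singular case should give 0'

  print *, 'determinant pairs evaluated as expected'

contains

  function det_value(det) result(v)
    real, intent(in) :: det(2)
    real :: v
    if (det(1) == 0.0) then
      v = 0.0                                   ! singular matrix
    else
      v = abs(det(1))**det(2) * (det(1)/abs(det(1)))
    end if
  end function det_value

end program det_pair_demo
```

Keeping the exponent separate avoids overflow or underflow for large matrices, so the conversion above is only safe when det(2) is within exponent range, as the text notes.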
120. 2) a(i,j) = one
            end do
        end do
    ! Compute the singular value decomposition.
        call lin_sol_svd(a, b, x, nrhs=0, &
            s=S, u=U, v=V)
    ! How many terms, to the nearest integer, exactly
    ! match the circle?
        c = zero; k = count(s > half)
        do i=1, k
            c = c + spread(u(1:n,i),2,n)*spread(v(1:n,i),1,n)*s(i)
            if (count(int(c-a) /= 0) == 0) exit
        end do
        if (i < k) then
            write (*,*) 'Example 3 for LIN_SOL_SVD is correct.'
        end if
        end
    Example 4: Laplace Transform Solution
    This example illustrates the solution of a linear least-squares system where the matrix is poorly conditioned. The problem comes from solving the integral equation
        $\int_0^1 e^{-st} f(t)\,dt = s^{-1}(1 - e^{-s}) = g(s)$
    The unknown function f(t) = 1 is computed. This problem is equivalent to the numerical inversion of the Laplace Transform of the function g(s) using real values of t and s, solving for a function that is nonzero only on the unit interval. The evaluation of the integral uses the following approximate integration rule:
        $\int_0^1 f(t) e^{-st}\,dt = \sum_{j=1}^{n} f(t_j) \int_{t_j}^{t_{j+1}} e^{-st}\,dt$
    The points $\{t_j\}$ are chosen equally spaced by using the following:
        $t_j = (j-1)/n$
    The points $\{s_i\}$ are computed so that the range of g(s) is uniformly sampled. This requires the solution of m equations
        $g(s_i) = g_i = i/(m+1)$
    for j = 1, ..., n and i = 1, ..., m. Fortran 90 array operations are used to solve for the collocation points $\{s_i\}$ as a single series of ste
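The integration rule of Example 4 can be made concrete with standard Fortran. The sketch below builds the collocation matrix whose entries are the integrals of e^{-st} over the subintervals [(j-1)/n, j/n] of the unit interval, and confirms that applying it to f = 1 reproduces g(s) = (1 - e^{-s})/s. It is an illustration under assumptions: the sample points s_i = i are arbitrary stand-ins (the example itself solves for points that sample g uniformly), and all names are hypothetical.

```fortran
program laplace_collocation_demo
  ! Illustrative sketch (hypothetical names and sample points).
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: m = 4, n = 8
  real(dp) :: t(n+1), s(m), a(m,n), g(m), f(n)
  integer :: i, j

  ! Equally spaced points t_j = (j-1)/n on the unit interval.
  do j = 1, n+1
    t(j) = real(j-1, dp) / real(n, dp)
  end do

  ! Assumed sample points; NOT the uniformly sampling points of the example.
  do i = 1, m
    s(i) = real(i, dp)
  end do

  ! a(i,j) = integral of exp(-s_i*t) over [t_j, t_{j+1}], in closed form.
  do j = 1, n
    a(:,j) = (exp(-s*t(j)) - exp(-s*t(j+1))) / s
  end do

  ! With f = 1 the rule is exact: the row sums telescope to g(s_i).
  f = 1.0_dp
  g = (1.0_dp - exp(-s)) / s
  if (maxval(abs(matmul(a, f) - g)) > 1.0e-12_dp) error stop 'rule mismatch'
  print *, 'collocation rule reproduces g(s) for f = 1'
end program laplace_collocation_demo
```

The least-squares problem of the example then amounts to recovering f from a and g, which is poorly conditioned because the columns of a become nearly dependent as n grows.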
121. 3 for BLACS and ScaLAPACK solver is correct.'
        END IF
    ! Exit from using this process grid.
        CALL BLACS_GRIDEXIT(CONTXT)
        CALL BLACS_EXIT(0)
        END
    Parallel Constrained Least-Squares Solvers
    Usage Notes
    Solving Constrained Least-Squares Systems
    The routine PARALLEL_NONNEGATIVE_LSQ is used to solve dense least-squares systems. These are represented by Ax ≅ b, where A is an m × n coefficient data matrix, b is a given right-hand side m-vector, and x is the solution n-vector being computed. Further, there is a constraint requirement, x ≥ 0. The routine PARALLEL_BOUNDED_LSQ is used when the problem has lower and upper bounds for the solution, α ≤ x ≤ β. By making the bounds large, individual constraints can be eliminated. There are no restrictions on the relative sizes of m and n. When m is large, these codes can substantially reduce computer time and storage requirements, compared with using a routine for solving a constrained system and a single processor.
    The user provides the matrix partitioned by blocks of columns: A = [A_1 | A_2 | ... | A_k]. An individual block of the partitioned matrix, say A_p, is located entirely on the processor with rank MP_RANK = p - 1, where MP_RANK is packaged in the module MPI_SETUP_INT. This module, and the function MP_SETUP(), defines the Fortran 90 MP Li
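The column-block layout described above, with block p held by the processor of rank p - 1, can be sketched with a small index table. The following is a minimal illustration under assumptions: the array name ipart, the near-even splitting rule, and the sizes are hypothetical choices, and no MPI or library routine is called.

```fortran
program column_partition_demo
  ! Illustrative sketch (hypothetical names): assigns contiguous blocks of
  ! columns 1..n to nprocs processors as evenly as possible.
  implicit none
  integer, parameter :: n = 10, nprocs = 3
  integer :: ipart(2, nprocs)   ! ipart(1,p) = first column, ipart(2,p) = last
  integer :: p, base, extra

  base  = n / nprocs            ! minimum number of columns per block
  extra = mod(n, nprocs)        ! the first 'extra' blocks get one more column

  ipart(1,1) = 1
  do p = 1, nprocs
    if (p > 1) ipart(1,p) = ipart(2,p-1) + 1
    ipart(2,p) = ipart(1,p) + base - 1
    if (p <= extra) ipart(2,p) = ipart(2,p) + 1
  end do

  ! The blocks must tile 1..n with no gaps or overlaps.
  if (ipart(2,nprocs) /= n) error stop 'partition does not end at n'
  if (sum(ipart(2,:) - ipart(1,:) + 1) /= n) error stop 'block sizes do not sum to n'

  do p = 1, nprocs
    print '(a,i0,a,i0,a,i0)', 'rank ', p-1, ' holds columns ', ipart(1,p), ' to ', ipart(2,p)
  end do
end program column_partition_demo
```

Any contiguous split would fit the description; an even split is chosen here only because it balances the work per processor.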
122. 37
    Table A: Examples and Corresponding Operators
    Operator_ex01
        use linear_operators
        implicit none
    ! This is Example 1 for LIN_SOL_GEN, with operators and functions.
        integer, parameter :: n=32
        real(kind(1e0)) :: one=1.0e0, err
        real(kind(1e0)), dimension(n,n) :: A, b, x
    ! Generate random matrices for A and b:
        A = rand(A); b = rand(b)
    ! Compute the solution matrix of Ax = b.
        x = A .ix. b
    ! Check the results.
        err = norm(b - (A .x. x))/(norm(A)*norm(x) + norm(b))
        if (err <= sqrt(epsilon(one))) &
            write (*,*) 'Example 1 for LIN_SOL_GEN (operators) is correct.'
        end
    Operator_ex02
        use linear_operators
        implicit none
    ! This is Example 2 for LIN_SOL_GEN using operators and functions.
        integer, parameter :: n=32
        real(kind(1e0)) :: one=1e0, err, det_A, det_i
        real(kind(1e0)), dimension(n,n) :: A, inv
    ! Generate a random matrix.
        A = rand(A)
    ! Compute the matrix inverse and its determinant.
        inv = .i. A; det_A = det(A)
    ! Compute the determinant for the inverse matrix.
        det_i = det(inv)
    ! Check the quality of both left and right inverses.
        err = (norm(EYE(n) - (A .x. inv)) + norm(EYE(n) - (inv .x. A)))/cond(A)
        if (err <= sqrt(epsilon(one)) .and. abs(det_A*det_i - one) <= &
            sqrt(epsilon(one))) &
            write (*,*) 'Example 2 for LIN_SOL_GE ope
123. 4: Convolutions using Fourier Transforms
    In this example we compute sums
        $c_k = \sum_{j=0}^{n-1} a_j b_{k-j}, \quad k = 0, \ldots, n-1$
    The definition implies a matrix-vector product. A direct approach requires about $n^2$ operations consisting of an add and multiply. An efficient method consisting of computing the products of the transforms of the $\{a_j\}$ and $\{b_j\}$, then inverting this product, is preferable to the matrix-vector approach for large problems. The example is also illustrated in operator_ex37, Chapter 6, using the generic function interface FFT and IFFT.
        use fast_dft_int
        use rand_gen_int
        implicit none
    ! This is Example 4 for FAST_DFT.
        integer j
        integer, parameter :: n=40
        real(kind(1e0)) :: one=1e0
        real(kind(1e0)) err
        real(kind(1e0)), dimension(n) :: x, y, yy(n,n)
        complex(kind(1e0)), dimension(n) :: a, b, c, d, e, f
    ! Generate two random complex sequence 'a' and 'b'.
        call rand_gen(x)
        call rand_gen(y)
        a = x; b = y
    ! Compute the convolution 'c' of 'a' and 'b'.
    ! Use matrix times vector for test results.
        yy(1:,1) = y
        do j=2, n
            yy(2:,j) = yy(1:n-1,j-1)
            yy(1,j) = yy(n,j-1)
        end do
        c = matmul(yy,x)
    ! Transform the 'a' and 'b' sequences into 'd' and 'e'.
        call c_fast_dft(forward_in=a, &
            forward_out=d)
        call c_fast_dft(forward_in=b, &
            forward_out=e)
    ! Invert the product d*e.
        call c_fast_dft(inverse_in
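The transform-based approach to the convolution in Example 4 can be verified on a small case without the library. The sketch below is an illustration under assumptions: it uses a hypothetical O(n^2) routine naive_dft in place of the library's fast transform, takes the indices of b modulo n (a circular convolution, matching the cyclic shifts used to build the test matrix), and uses an unscaled forward transform with a 1/n factor on the inverse, which may differ from the library's scaling convention.

```fortran
program circular_convolution_demo
  ! Illustrative sketch (hypothetical names): checks that the inverse
  ! transform of the product of two transforms equals the direct
  ! circular convolution.
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 8
  real(dp), parameter :: two_pi = 6.283185307179586_dp
  real(dp) :: x(n), y(n)
  complex(dp) :: a(n), b(n), c_direct(n), c_dft(n)
  integer :: j, k

  call random_number(x)
  call random_number(y)
  a = cmplx(x, 0.0_dp, kind=dp)
  b = cmplx(y, 0.0_dp, kind=dp)

  ! Direct sum: c_k = sum_j a_j * b_{(k-j) mod n}, about n**2 operations.
  do k = 0, n-1
    c_direct(k+1) = (0.0_dp, 0.0_dp)
    do j = 0, n-1
      c_direct(k+1) = c_direct(k+1) + a(j+1)*b(modulo(k-j, n) + 1)
    end do
  end do

  ! Transform approach: forward transforms, pointwise product, inverse.
  c_dft = naive_dft(naive_dft(a, -1)*naive_dft(b, -1), 1) / n

  if (maxval(abs(c_direct - c_dft)) > 1.0e-10_dp) error stop 'convolution mismatch'
  print *, 'max difference:', maxval(abs(c_direct - c_dft))

contains

  ! Slow reference transform: w_p = sum_q z_q * exp(sgn*2*pi*i*p*q/m).
  function naive_dft(z, sgn) result(w)
    complex(dp), intent(in) :: z(:)
    integer, intent(in) :: sgn
    complex(dp) :: w(size(z))
    integer :: p, q, m
    real(dp) :: theta
    m = size(z)
    do p = 0, m-1
      w(p+1) = (0.0_dp, 0.0_dp)
      do q = 0, m-1
        theta = sgn*two_pi*real(modulo(p*q, m), dp)/real(m, dp)
        w(p+1) = w(p+1) + z(q+1)*cmplx(cos(theta), sin(theta), kind=dp)
      end do
    end do
  end function naive_dft

end program circular_convolution_demo
```

The reference transform here is itself quadratic in n, so the sketch demonstrates only the identity; the speed advantage for large problems comes from replacing naive_dft with a fast transform.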
124. Chapter 3: Fourier Transforms
    Chapter 4: Curve and Surface Fitting with Splines
    Chapter 5: Utilities
    Chapter 6: Operators and Generic Functions - The Parallel Option
    Chapter 7: ScaLAPACK Utilities and Large-Scale Parallel Solvers
    Chapter 8: Partial Differential Equations
    Chapter 9: Error Handling and Messages - The Parallel Option
    Appendix A: List of Subprograms and GAMS Classification
    Appendix B: List of Examples
    Appendix C: References
    Appendix D: Benchmarking or Timing Programs
    Introduction
    The IMSL Fortran 90 MP Library
    The IMSL Fortran 90 MP Library consists of numerical algorithms using Fortran 90 language constructs, including Fortran 90 array data types. One feature of the design is that the default use is as simple as the problem statement. Complicated, professional-quality mathematical software is hidden from the casual or beginning user.
    The IMSL Fortran 90 MP Library draws upon subroutines in the IMSL FORTRAN 77 Numerical Libraries products for software activities such as error processing and additional functionality. We emphasize that users who have calls to IMSL FORTRAN 7
125. 5 T I END DO ZOUT MIN ZOUT DELTA_Z ZEND IF ZO ZEND IDO 3 END IF All completed Solver is shut down CASE 3 CLOSE UNIT 7 EXI Define initial data values CASE 5 T 1 ZERO WRITE 7 F10 5 ZO DO I 1 NPDE 1 WRITE 7 4E15 5 T I END DO Define differential equations CASE 6 D_PDE_1D_MG_C 1 1 ONE 286 Chapter 8 Partial Differential Equations IMSL Fortran 90 MP Library 4 0 D_PDE_1D_MG_R 1 BETA D_PDE_1D_MG_DUDX 1 D_PDE_1D_MG_Q 1 GAMMA EXP D_PDE_1D_MG_U 1 amp ONE EPS D_PDE_1D_MG_U 1 Define boundary conditions CASE 7 IF PDE_1D_MG_LEFT THEN D_PDE_1D_MG_BETA ONE D_PDE_1D_MG_GAMMA ZERO ELSE D_PDE_1D_MG_ BETA ZERO D_PDE_1D_MG _GAMMA D_PDE_1D_MG_U 1 END IF END SELECT Reverse communication is used for the problem data The optional derived type changes the internal model to use cylindrical coordinates CALL PDE_1D_MG Z0 ZOUT IDO T IOPT IOPT END DO end program Example 5 A Flame Propagation Model This example is presented more fully in Verwer et al 1989 The system is a normalized problem relating mass density u x t and temperature v 2t u Uy uf v v v uf v where f z yexp B z B 4 y 3 52 x 10 O0 lt x lt 1 0 lt t lt 0 006 u x 0 1 v x 0 0 2 u v 0 x 0 u 0 v D t
126. 5a INCLUDE mpif h INTEGER PARAMETER MP 500 NP 400 NP 1 N MP REAL KIND 1D0 PARAMETER ZERO 0D0 ONE 1D0 REAL KIND 1D0 ALLOCATABLE amp ING Spey Sp Wein SECS Mey ASAD 718 REAL KIND 1D0 RNORM INTEGER ALLOCATABLE INDEX IPART INTEGER K L DN J JSHIFT IERROR LOGICAL PRINT false l Srta zor Meg MP_NPROCS MP_SETUP DN N max 1 max 1 MP_NPROCS 1 ALLOCATE IPART 2 max 1 MP_NPROCS Spread constraint rows evenly to the processors PART 1 1 1 DO L 2 MP_NPROCS IPART 2 L 1 IPART 1 L 1 DN IPART 1 L IPART 2 1L 1 1 D DO PART 2 MP_NPROCS N H m Define the constraint data using random values K max 0 IPART 2 MP_RANK 1 IPART 1 MP_RANK 1 1 ALLOCATE A M K ASAVE M K X N W N amp B M Y M INDEX N The use of ASAVE can be removed by regenerating the data for A after the return from Parallel_nonnegative_LSQ A rand A ASAVE A IF MP_RANK and PRINT amp CALL SHOW IPART amp Partition of the constraints to be solved Set the right hand side to be one in the last component zero elsewher B ZERO B M ONE Solve the dual problem CALL Parallel_nonnegative_LSQ amp A B X RNORM W INDEX IPART Each processor multiplies its
127. 7 Libraries routines will continue to have their codes function as they did using earlier FORTRAN 77 compilers.
    Users of the IMSL Fortran 90 MP Library benefit by a standard MPI (Message Passing Interface) environment. This is needed to accomplish parallel computing within parts of Chapters 6-9. Gray shading in the documentation cues the reader when this is an issue. If parallel computing is not required, then the MP Library suite of dummy MPI routines can be substituted for standard MPI routines. All requested MPI routines called by the MP Library are in this dummy suite. Warning messages will appear if a code or example requires more than one process to execute. Typically, users need not be aware of the parallel codes.
    Note that a standard MPI environment is not part of the IMSL Fortran 90 MP Library. The standard includes a library of MPI Fortran and C routines, MPI "include" files, usage documentation, and other run-time utilities.
    The library routines, which begin on page 1, outline usage instructions for a suite of mathematical software written in Fortran 90. These routines are used with computer systems that support a standard Fortran 90 compiler. A basic library of numerical routines is provided for common applications. Users with linear solver application can turn directly to page 1. In addition, high-level operators and functions are described in Chapter 6, "Operators and Generic Functions - The Parallel Option." For informati
128. Fatal, Terminal, and Warning Error Messages
    See the messages.gls file for error messages for lin_eig_gen. These error messages are numbered 841-858; 861-878; 881-898; 901-918.
    lin_geig_gen
    Computes the generalized eigenvalues of an n × n matrix pencil, Av = λBv. Optionally, the generalized eigenvectors are computed. If either of A or B is nonsingular, there are diagonal matrices α and β, and a complex matrix V, all computed such that AVβ = BVα.
    Required Arguments
    A (Input [/Output]): Array of size n × n containing the matrix A.
    B (Input [/Output]): Array of size n × n containing the matrix B.
    alpha (Output): Array of size n containing diagonal matrix factors of the generalized eigenvalues. These complex values are in order of decreasing absolute value.
    beta (Output): Array of size n containing diagonal matrix factors of the generalized eigenvalues. These real values are in order of decreasing value.
    Example 1: Computing Generalized Eigenvalues
    The generalized eigenvalues of a random real matrix pencil are computed. These values are checked by obtaining the generalized eigenvectors and then showing that the residuals AV - BVαβ^{-1} are small. Note that when the matrix B is nonsingular, β = I, the identity matrix. When B is singular and A is nonsingular, some diagonal entries of β are essentially zero. This corresponds to "infinite eig
129. 95 redesigned for distributed memory parallel computers. It is written in a Single Program, Multiple Data (SPMD) style using explicit message passing for communication. Matrices are laid out in a two-dimensional block-cyclic decomposition. Using High Performance Fortran (HPF) directives, Koelbel et al. (1994), and a static p × q processor array, and following declaration of the array, A(*,*), this is illustrated by:
        INTEGER, PARAMETER :: N=500, P=2, Q=3, MB=32, NB=32
        !HPF$ PROCESSORS PROC(P,Q)
        !HPF$ DISTRIBUTE A(cyclic(MB), cyclic(NB)) ONTO PROC
    Our integration work provides modules that describe the interface to the ScaLAPACK library. We recommend that users include these modules when using ScaLAPACK or ancillary packages, including BLACS and PBLAS. For the job of distributing data within a user's application to the block-cyclic decomposition required by ScaLAPACK solvers, we provide a utility that reads data from an external file and arranges the data within the distributed machines for a computational step. Another utility writes the results into an external file.
    The data types supported for these utilities are integer; single precision, real; double precision, real; single precision, complex; and double precision, complex.
    A ScaLAPACK library normally includes routines for:
    • the solution of full
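The block-cyclic decomposition mentioned above can be illustrated in one dimension; the two-dimensional layout applies the same rule independently to rows (block size MB over P process rows) and columns (block size NB over Q process columns). The sketch below is a minimal illustration under assumptions: the names, the small sizes, and the choice of process 0 as the first owner are hypothetical, and no ScaLAPACK or BLACS routine is called.

```fortran
program block_cyclic_demo
  ! Illustrative sketch (hypothetical names): maps global indices 1..n to
  ! (owning process, local index) under a 1-D block-cyclic distribution
  ! with block size nb over p processes, starting at process 0.
  implicit none
  integer, parameter :: n = 10, nb = 2, p = 3
  integer :: g, owner, lidx, counts(0:p-1)

  counts = 0
  do g = 1, n
    ! Blocks of nb consecutive indices are dealt out to processes in turn.
    owner = mod((g - 1)/nb, p)
    ! Complete rounds already dealt to this process, plus offset in block.
    lidx = ((g - 1)/(nb*p))*nb + mod(g - 1, nb) + 1

    counts(owner) = counts(owner) + 1
    ! Local indices on each process are filled consecutively.
    if (lidx /= counts(owner)) error stop 'local index is not consecutive'
    print '(a,i0,a,i0,a,i0)', 'global ', g, ' -> process ', owner, ', local ', lidx
  end do

  ! Local extents for this case: processes 0, 1, 2 hold 4, 4, 2 entries.
  if (any(counts /= [4, 4, 2])) error stop 'unexpected local counts'
end program block_cyclic_demo
```

The per-process counts computed here are the local storage extents a program needs before allocating its piece of the distributed array.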
130. A b rand b endif Compute the least squares solution matrix of Ax b S SVD A U U V V Gf UW vite AO x SW ote hagon o 6 Ulsan sy 8 Check the results err norm A tx b A x x norm A norm x if ALL err lt sqrt epsilon one then if mp_rank 0 amp write Parallel Example 14 is correct end if See to any error messages and quit MPI mp_nprocs mp_setup Final end Parallel Example 15 A Polar Decomposition of several matrices are computed The box data type and the SVD function are used Orthogonality and small residuals are checked to verify that the results are correct use linear_operators use mpi_setup_int implicit none This is Parallel Example 15 using operators and 222 e Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 functions for a polar decomposition integer parameter n 33 nr 3 real kind 1d0 one 1d0 zero 0d0 Teeyeill Chealiovel ALClO telaineorssstoun inj ine 8S WN B O amp S DEapa p U Dy Bey real kind 1d0 TEMP1 nr TEMP2 nr UES Sie Um Ore Males mp_nprocs mp_setup Generate a random matrix if mp_rank 0 A rand A Compute the singular value decomposition S_D SVD A U U_D V V_D Compute the left orthogonal factor PS WD eho WD l Compute the right self adjoint factor O Wo ox CESID skde5 WD
131. BLACS_GRIDINIT CONTXT Rows NPROW NPCOL Get this processor s role in the process grid CALL BLACS_GRIDINFO CONTXT NPROW NPCOL MYROW MYCOL Associate context BLACS with IMSL communicator CALM BLACS GET CONTXT 10 P_LIBRARY_WORLD BLOCK DO Allocate local space for each array LDA_A NUMROC M MB MYROW 0 NPROW TDA_A NUMROC K NB MYCOL 0 NPCOL LDA_B NUMROC K NB MYROW 0 NPROW TDA_B NUMROC N NB MYCOL 0 NPCOL LDA_C NUMROC M MB MYROW 0 NPROW TDA_C NUMROC N NB MYCOL 0 NPCOL ALLOCATE d_A LDA_A TDA_A d_B LDA_B TDA_B amp GECAD aD AmC s A root process is used to create the matrix data for the test IF MP_RANK 0 THEN ALLOCATE A M K B K N C M N X M CALL RANDOM_NUMBER A CALL RANDOM _NUMBER B OPEN UNIT NIN FILE Atest dat STATUS UNKNOWN Write the data by columns DO J 1 K NB WRITE NIN A I L 1I 1 M L J min K J NB 1 END DO CLOSE NIN OPEN UNIT NIN FILE Btest dat STATUS UNKNOWN Write the data by columns DO J 1 N NB WRITE NIN B I L 1I 1 K L J min N J NB 1 END DO CLOSE NIN END IF Define the descriptor for the global matrices we Ba il COMB Wi K Ms IN inipyA7 HSC EES VAITE CONTEE aN Pa Saw ra Oa OFme ln Aum Py mC _C il CONMa Mi IN ME INE 0 O mDpAUC
132. Compute the inverse of a real symmetric random matrix of dimension NSIZE x NSIZE. Uses Aasen's decomposition for Fortran 90 and Bunch-Kaufman decomposition for FORTRAN 77.
Generate NSIZE random numbers.
Solve a single system of linear equations with a positive definite real random matrix of dimension NSIZE x NSIZE.
Solve a single system of linear equations with a general real random matrix of dimension NSIZE x NSIZE.
Solve a single least-squares system of linear equations with a real random matrix of dimension 2*NSIZE x NSIZE.
Solve a single system of linear equations with a symmetric real random matrix of dimension NSIZE x NSIZE.
Compute the full singular value decomposition of a general real random matrix of dimension NSIZE x NSIZE.
Solve NSIZE systems of linear equations of a nonsymmetric NSIZE x NSIZE tridiagonal matrix. Uses cyclic reduction for both Fortran 90 and FORTRAN 77 versions.
Compute products of square matrices of size NSIZE x NSIZE. The Fortran 90 version uses the IMSL-defined operation C = A .x. B. The arrays are assumed-shape. The FORTRAN 77 version uses F = matmul(D, E), where the arrays are assumed-size. Identical problems (A = D and B = E) are timed.
Compare times to use SHOW() for writing a random array of size NSIZE to a CHARACTER buffer vs. writing the same array to a scratch file.

Parallel Program Descriptions
133. ... DOUBLE_PRECISION, MPI_ANY_SOURCE, MPI_ANY_TAG, MP_LIBRARY_WORLD, STATUS, IERROR)
WRITE(7,*) STATUS(MPI_SOURCE), DATA
! If time at the root has elapsed, nodes receive signal to stop.
! Send the reporting node the "go/stop" flag.
! Mark if a node has been stopped.
CALL MPI_SEND(CONTINUE, 1, MPI_INTEGER, &
   STATUS(MPI_SOURCE), 0, MP_LIBRARY_WORLD, IERROR)
IF (CONTINUE == 0) MPI_NODE_PRIORITY(STATUS(MPI_SOURCE)+1) &
   = -MPI_NODE_PRIORITY(STATUS(MPI_SOURCE)+1)-1
END IF
IF (CONTINUE == 0) MPI_NODE_PRIORITY(1) = -1
END IF
END DO SIMULATE
IF (MP_RANK == 0) THEN
   ENDFILE(UNIT=7); REWIND(UNIT=7)
! Read the data. Find extremes and averages.
   MAX_TIME = ZERO; AV_TIME = ZERO; COUNTS = 0; V_MIN = HUGE(ONE)
   DO
      READ(7,*, END=10) I, DATA
      COUNTS(I+1) = COUNTS(I+1)+1
      AV_TIME(I+1) = AV_TIME(I+1)+DATA(5)
      IF (MAX_TIME(I+1) < DATA(5)) MAX_TIME(I+1) = DATA(5)
      V_MIN = MIN(V_MIN, DATA(4))
   END DO
10 CONTINUE
   CLOSE(UNIT=7)
! Set printing Index to match node numbering.
   SHOW_IOPT(1) = SHOW_STARTING_INDEX_IS
   SHOW_IOPT(2) = 0
   SHOW_INTOPT(1) = SHOW_STARTING_INDEX_IS
   SHOW_INTOPT(2) = 0
   CALL SHOW(MAX_TIME, "Maximum Integration Time, per process:", IOPT=SHOW_IOPT)
   AV_TIME = AV_TIME/MAX(1,COUNTS)
   CALL SHOW(AV_TIME, "Average Integration Time, per process:", IOPT=SHOW_IOPT)
   CALL SHOW(COUNTS, "Number of Integrations", IOPT=SHOW_INTOPT)
! ... (remainder illegible in the source)
134. EFENSE Cedex FRANCE PHONE 33 1 46 93 94 20 FAX 33 1 46 93 94 39 e mail info vni paris fr Visual Numerics Japan Inc GOBANCHO HIKARI BLDG 4 Floor 14 GOBAN CHO CHIYODA KU TOKYO JAPAN 113 PHONE 81 3 5211 7760 FAX 81 3 5211 7769 e mail vnijapan vnij co jp COPYRIGHT NOTICE Copyright 1990 1998 an unpublished work by Visual Numerics Inc All rights reserved VISUAL NUMERICS INC MAKES NO WARRANTY OF ANY KIND WITH REGARD TO THIS MATERIAL INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE Visual Numerics Inc shall not be liable for errors contained herein or for incidental consequential or other indirect damages in connection with the furnishing performance or use of this material TRADEMARK NOTICE IMSL and Visual Numerics are registered trademarks or trademarks of Visual Numerics Inc in the U S and other countries All other trademarks are the property of their respective owners Use of this document is governed by a Visual Numerics Software License Agreement This document contains confidential and proprietary information constituting valuable trade secrets No part of this document may be reproduced or transmitted in any form without the prior written consent of Visual Numerics RESTRICTED RIGHTS LEGEND This documentation is provided with RESTRICTED RIGHTS Use duplication or disclosure by the U S Government is subject to the restrictions set
135. ...
xcheck(i) = xdata(j) + mod(i+ngrid-1,ngrid)*delta_x
ycheck(i) = ydata(j) + mod(i+ngrid-1,ngrid)*delta_y
end do
diff = norm(xvalues-xcheck,1)/norm(xcheck,1) + &
   norm(yvalues-ycheck,1)/norm(ycheck,1)
if (diff <= sqrt(epsilon(one))) then
   write(*,*) 'Example 4 for SPLINE_FITTING is correct.'
end if
end

Fatal and Terminal Error Messages
See the messages.gls file for error messages for spline_fitting. These error messages are numbered 1340-1367.

surface_constraints
This function returns the derived type array result, ?_surface_constraints, given optional input. There are optional arguments for the partial derivative indices, the value applied to the spline, and the periodic point for any periodic constraint. The function is used, for entry number j,
?_surface_constraints(j) = &
   surface_constraints &
   ([derivative=derivative_index(1:2),] &
   point = where_applied(1:2), [value=value_applied,] &
   type = constraint_indicator, &
   [periodic_point = periodic_point(1:2)])
The square brackets enclose optional arguments. For each constraint the arguments 'value =' and 'periodic_point =' are not used at the same time.

Required Arguments
point = where_applied (Input)
The point in the data domain where a constraint is to be applied. Each point has an x and y coordinate, in that order.
type = constraint_indicator (Input)
The indicator for the ty
136. ...
! Solve complex data system that transforms the initial values, Xz_0 = y_0.
call lin_sol_gen(x, y_0, z_0)
t = (/(i*delta_t, i=0,k-1)/)
! Compute y and y' at the values t(1:k).
y = matmul(x, exp(spread(d,2,k)*spread(t,1,n))* &
   spread(z_0(1:n,1),2,k))
y_prime = matmul(x, spread(d,2,k)* &
   exp(spread(d,2,k)*spread(t,1,n))* &
   spread(z_0(1:n,1),2,k))
! Check results. Is y' - Ay = 0?
err = sum(abs(y_prime-matmul(atemp,y)))/ &
   (sum(abs(atemp))*sum(abs(y)))
if (err <= sqrt(epsilon(one))) then
   write (*,*) 'Example 4 for LIN_SOL_GEN is correct.'
end if
end

Fatal and Terminal Error Messages
See the messages.gls file for error messages for lin_sol_gen. The messages are numbered 161-175; 181-195; 201-215; 221-235.

lin_sol_self
Solves a system of linear equations $Ax = b$, where $A$ is a self-adjoint matrix. Using optional arguments, any of several related computations can be performed. These extra tasks include computing and saving the factorization of $A$ using symmetric pivoting, representing the determinant of $A$, computing the inverse matrix $A^{-1}$, or computing the solution of $Ax = b$ given the factorization of $A$. An optional argument is provided indicating that $A$ is positive definite so that the Cholesky decomposition can be used.

Required Arguments
A (Input/Output)
Array of size n x n containing the self-adjoint matrix.
b (Input/Output)
Array of size n x nb con
137. ... CASE (IDO == 4) EXIT   ! Due to errors
CASE (IDO == 5)   ! Evaluate initial data
CASE (IDO == 6)   ! Evaluate differential equations
CASE (IDO == 7)   ! Evaluate boundary conditions
CASE (IDO == 8)   ! Prepare to solve banded system
CASE (IDO == 9)   ! Solve banded system
CALL PDE_1D_MG (T0, TOUT, IDO, U, &
   initial_conditions, &
   pde_system_definition, &
   boundary_conditions, IOPT)
END DO

The arguments to PDE_1D_MG are required or optional.

Required Arguments
T0 (Input/Output)
This is the value of the independent variable where the integration begins. It is set to the value TOUT on return.
TOUT (Input)
This is the value of the independent variable where the integration ends. Note: Values of T0 < TOUT imply integration in the forward direction, while values of T0 > TOUT imply integration in the backward direction. Either direction is permitted.
IDO (Input/Output)
This is an integer flag that directs program control and user action. Its value is used for initialization, termination, and for directing user response during reverse communication.
IDO = 1 This value is assigned by the user for the start of a new problem. Internally it causes allocated storage to be reallocated, conforming to the problem size. Various initialization steps are performed.
IDO = 2 This value is assigned by the routine when the integrator has s
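The reverse-communication pattern described above, in which the routine returns with a value of IDO and the caller responds before calling again, can be tried out on a toy problem. The sketch below is not PDE_1D_MG: toy_solver is a hypothetical forward Euler stepper for the scalar equation u' = -u, and only the IDO values 1, 2, 3, 5 and 6 are imitated, with the meanings given in the list above. The step count and output times are likewise illustrative assumptions.

```fortran
module toy_reverse_comm
  implicit none
  private
  public :: toy_solver
  integer, parameter :: dp = kind(1d0)
  integer, save :: steps_left = 0
  real(dp), save :: h = 0d0
contains
  ! Hypothetical stand-in for a reverse-communication integrator.
  ! It never evaluates the equation itself; it asks the caller to do so.
  subroutine toy_solver(t0, tout, ido, u, f)
    real(dp), intent(inout) :: t0, u
    real(dp), intent(in) :: tout, f
    integer, intent(inout) :: ido
    integer, parameter :: nsteps = 100
    select case (ido)
    case (1)          ! New problem: request the initial data.
       ido = 5
    case (2, 5)       ! Start integrating toward the current TOUT.
       h = (tout - t0) / nsteps
       steps_left = nsteps
       ido = 6        ! Request an evaluation of u'.
    case (6)          ! The caller has supplied f = u'; take one Euler step.
       u = u + h * f
       t0 = t0 + h
       steps_left = steps_left - 1
       if (steps_left == 0) then
          t0 = tout   ! T0 is set to TOUT on return.
          ido = 2     ! Output point reached.
       end if
    case (3)          ! Shut down: nothing to release in this toy.
    end select
  end subroutine toy_solver
end module toy_reverse_comm

program reverse_comm_sketch
  use toy_reverse_comm
  implicit none
  integer, parameter :: dp = kind(1d0)
  real(dp), parameter :: tend = 1d0
  real(dp) :: t0, tout, u, f
  integer :: ido

  ido = 1; t0 = 0d0; tout = 0d0; u = 0d0; f = 0d0
  do
     select case (ido)
     case (1)   ! Define values that determine limits.
        t0 = 0d0; tout = 5d-1
     case (2)   ! Write solution and check for final point.
        write (*,*) 't =', t0, ' u =', u, ' exp(-t) =', exp(-t0)
        if (t0 >= tend) then
           ido = 3
        else
           tout = min(tout + 5d-1, tend)
        end if
     case (3)   ! All completed.
        exit
     case (5)   ! Define initial data values.
        u = 1d0
     case (6)   ! Define the differential equation u' = -u.
        f = -u
     end select
     call toy_solver(t0, tout, ido, u, f)
  end do
end program reverse_comm_sketch
```

The driver has the same shape as the loop in the manual: one SELECT CASE on IDO, followed by a single call to the routine at the bottom of the loop. The solver keeps its own state between calls, which is why the caller only ever changes the quantities it is asked for.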
138. ! ... (start of the DISPLAY loop illegible in the source)
p = p+1
IF (ICHAR(BUFFER(p:p)) > ICHAR(' ')) EXIT
END DO
q = p-1
DO
   q = q+1
   IF (ICHAR(BUFFER(q:q)) <= ICHAR(' ')) EXIT
END DO
WRITE(*,'(1x,A)') BUFFER(p:q-1)
p = q
END DO DISPLAY
END DO
end if
IF (MP_RANK == 0) &
   write(*,*) 'Parallel Example 17 is finished.'
! See to any error messages and quit MPI.
mp_nprocs = mp_setup('Final')
end

Parallel Example 18
Here we illustrate a surface fitting problem implemented using tensor product B-splines with constraints. There are three functions, each depending on two parametric variables, for the spatial coordinates. Fitting each coordinate function to the data is a natural example of parallel computing in the sense that there are three separate problems of the same type. The approach is to break the problem into three data fitting computations. Each of these computations is allocated to a node. Note that the data is sent from the root to the nodes. Every node completes the least-squares fitting and sends the spline coefficients back to the root node. This example requires four nodes to execute.

! ... (one USE statement illegible in the source)
USE rand_int
USE norm_int
USE Numerical_Libraries, only : DCONST
USE mpi_setup_int
implicit none
INCLUDE 'mpif.h'
139. ... KIND(1D0)), PARAMETER :: ZERO=0D0, ONE=1D0
REAL(KIND(1D0)), ALLOCATABLE :: &
   ! ... (list of arrays illegible in the source)
REAL(KIND(1D0)) RNORM
INTEGER, ALLOCATABLE :: INDEX(:), IPART(:,:)
INTEGER K, L, DN, J, JSHIFT, IERROR, NSETP, NSETZ
LOGICAL :: PRINT=.false.
MP_NPROCS = MP_SETUP()
DN = N/max(1,max(1,MP_NPROCS))-1
ALLOCATE(IPART(2,max(1,MP_NPROCS)))
! Spread constraint rows evenly to the processors.
IPART(1,1) = 1
DO L=2,MP_NPROCS
   IPART(2,L-1) = IPART(1,L-1)+DN
   IPART(1,L) = IPART(2,L-1)+1
END DO
IPART(2,MP_NPROCS) = N
! Define the constraints using random data.
K = max(0, IPART(2,MP_RANK+1)-IPART(1,MP_RANK+1)+1)
ALLOCATE(A(M,K), ASAVE(M,K), BND(2,N), &
   X(N), W(N), B(M), Y(M), INDEX(N))
! The use of ASAVE can be replaced by regenerating the
! data for A(:,:) after the return from
! Parallel_bounded_LSQ.
A = rand(A); ASAVE = A
IF (MP_RANK == 0 .and. PRINT) &
   call show(IPART, ...)   ! title string illegible in the source
! Set the right-hand side to be one in the last
! component, zero elsewhere.
B = ZERO; B(M) = ONE
! Solve the dual problem. Letting the dual variable
! have no constraint ... for the primal problem ...
140. ... ScaLAPACK_READ and ScaLAPACK_WRITE.
! A linear system is solved with ScaLAPACK and checked.
USE ScaLAPACK_SUPPORT
USE ERROR_OPTION_PACKET
USE MPI_SETUP_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: N=9, MB=3, NB=3, NIN=10
INTEGER CONTXT, NPROW, NPCOL, MYROW, MYCOL, &
   ! ... (part of this declaration illegible in the source)
   DESC_B(9), DESC_X(9), BUFF(3), RBUF(3)
LOGICAL :: COMMUTE = .true.
INTEGER, ALLOCATABLE :: ...   ! names illegible in the source
real(kind(1d0)) :: ERROR=0d0, SIZE_X
real(kind(1d0)), allocatable, dimension(:,:) :: ...   ! names illegible in the source
MP_NPROCS = MP_SETUP()
! Routines with the "BLACS_" prefix are from the BLACS library.
CALL BLACS_PINFO(MP_RANK, MP_NPROCS)
! Make initialization for BLACS.
CALL BLACS_GET(0, 0, CONTXT)
! Approximate processor grid to be nearly square.
NPROW = sqrt(real(MP_NPROCS)); NPCOL = MP_NPROCS/NPROW
IF (NPROW*NPCOL < MP_NPROCS) THEN
   NPROW = 1; NPCOL = MP_NPROCS
END IF
CALL BLACS_GRIDINIT(CONTXT, 'Rows', NPROW, NPCOL)
! Get this processor's role in the process grid.
CALL BLACS_GRIDINFO(CONTXT, NPROW, NPCOL, MYROW, MYCOL)
! Associate context (BLACS) with DNFL communicator.
CALL BLACS_GET(CONTXT, 10, MP_LIBRARY_WORLD)
BLOCK: DO
! Allocate local space for each array.
   LDA_A = NUMROC(N, MB, MYROW, 0, NPROW)
   TDA_A
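The step labeled "Approximate processor grid to be nearly square" is plain integer arithmetic and can be checked without MPI or the BLACS. The sketch below repeats that rule for a range of process counts. The loop bounds and the program name are illustrative assumptions, and no library routine is called.

```fortran
program grid_shape_sketch
  implicit none
  integer :: nprocs, nprow, npcol

  do nprocs = 1, 8
     ! Start from the integer part of the square root.
     nprow = int(sqrt(real(nprocs)))
     npcol = nprocs / nprow
     ! If the rectangle would leave processes unused,
     ! fall back to a single row of processes.
     if (nprow * npcol < nprocs) then
        nprow = 1
        npcol = nprocs
     end if
     write (*,*) nprocs, 'processes -> grid', nprow, 'by', npcol
  end do
end program grid_shape_sketch
```

For example, 6 processes give a 2 by 3 grid, while 5 or 7 processes do not fill a 2-row rectangle and are arranged as a single row.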
141. ... LIN_SOL_LSQ.
integer i, j
integer, parameter :: m=128, n=32, k=2, n_eval=16
real(kind(1d0)), parameter :: one=1.0d0, delta_sqr=1.0d0
real(kind(1d0)) a(m,n), b(m,1), c(n,1), p(k,m), q(k,n), &
   x(k*m), y(k*n), t(k,m,n), res(n_eval,n_eval), &
   w(n_eval), delta
! Generate a random set of data points in k=2 space.
call rand_gen(x); p = reshape(x,(/k,m/))
! Generate a random set of center points in k-space.
call rand_gen(y); q = reshape(y,(/k,n/))
! Compute the coefficient matrix for the least-squares system.
t = spread(p,3,n)
do j=1, n
   t(1:,:,j) = t(1:,:,j) - spread(q(1:,j),2,m)
end do
a = sqrt(sum(t**2,dim=1) + delta_sqr)
! Compute the right-hand side of data values.
b(1:,1) = exp(-sum(p**2,dim=1))
! Compute the solution. The result is returned in c.
call lin_sol_lsq(a, b, c)
! Check the results.
if (sum(abs(matmul(transpose(a),b-matmul(a,c))))/sum(abs(a)) &
   <= sqrt(epsilon(one))) then
   write (*,*) 'Example 3 for LIN_SOL_LSQ is correct.'
end if
! Evaluate residuals, known function - approximation, at a square
! grid of points. (This evaluation is only for k=2.)
delta = one/real(n_eval-1,kind(one))
do i=1, n_eval
   w(i) = (i-1)*delta
end do
res = exp(-(spread(w,1,n_eval)**2 + spread(w,2,n_eval)**2))
do j=1, n
   res = res - c(j,1)*sqrt((spread(w,1,n_eval) - q(1,j))**2 + &
      (spread(w,2,n_eval) - q(2,j))**2 + delta_sqr)
end do
end
Ex
142. ... MG_RHS
Array of size NEQ holding the linear system problem right-hand side.
PDE_1D_MG_PANIC_FLAG
Integer set to a non-zero value only if the linear system is singular.
?_PDE_1D_MG_SOL
Array of size NEQ to receive the solution, after the solving step.

U(1:NPDE+1,1:N) (Input/Output)
This assumed-shape array specifies Input information about the problem size and boundaries. The dimension of the problem is obtained from NPDE + 1 = size(U,1). The number of grid points is obtained by N = size(U,2). Limits for the variable $x$ are assigned as input in array locations U(NPDE+1,1) = $x_L$ and U(NPDE+1,N) = $x_R$. It is not required to define U(NPDE+1,j), j = 2, ..., N-1. At completion, the array U(1:NPDE,1:N) contains the approximate solution value $U_i(x_j(\mathrm{TOUT}), \mathrm{TOUT})$ in location U(I,J). The grid value $x_j(\mathrm{TOUT})$ is in location U(NPDE+1,J). Normally the grid values are equally spaced as the integration starts. Variable-spaced grid values can be provided by defining them as Output from the subroutine initial_conditions or during reverse communication, IDO = 5.

Optional Arguments
initial_conditions (Input)
The name of an external subroutine, written by the user, when using forward communication. If this argument is not used, then reverse communication is used to provide the problem information. The routine gives the initial values for the system at t
143. MSL code. PRINT=NO, STOP=NO

Terminal Class
5 Terminal ("terminal"). This error type indicates the existence of a serious error condition. In normal use, execution is terminated. Default attributes: PRINT=YES, STOP=YES

Global Class
6 Global warning. This error type indicates the existence of a condition that may require corrective action by the user or calling routine. Usually, the condition can be ignored. The stop or continue decision is made at the end of the processing step by calling N1RGB (see "Error Types and Attributes"). Default attributes: PRINT=YES, STOP=NO
7 Global fatal. This error type indicates a condition that may be a serious error. In most cases, the user or calling routine must take corrective action to recover. The stop or continue decision is made at the end of the processing step by calling N1RGB (see "Error Types and Attributes"). Default attributes: PRINT=YES, STOP=YES

PRINT and STOP attributes
The programmer or user can set PRINT and STOP attributes by calling E1POS as follows:
CALL E1POS(i, pattr, sattr)
where the change only applies to a single type i error, $1 \le i \le 7$.
If $i = 0$, the change applies to all error types.
If $-7 \le i \le -1$, the current attribute settings for the error type $|i|$ are returned in pattr and sattr.
As input values, pattr or satt
144. More mathematical details for real matrices are found in Golub and Van Loan (1989, Chapter 4). When the optional Cholesky algorithm is used with a positive definite, self-adjoint matrix, the factorization has the alternate form
$PAP^T = R^T R$
where $P$ is a permutation matrix and $R$ is an upper-triangular matrix. The solution of the linear system $Ax = b$ is computed by solving the systems $u = R^{-T} P b$ and $x = P^T R^{-1} u$. The permutation is chosen so that the diagonal term is maximized at each step of the decomposition. The individual interchanges are optionally available in the argument "pivots".

Example 2: System Solving with Cholesky Method
This example solves the same form of the system as Example 1. The optional argument "iopt=" is used to note that the Cholesky algorithm is used since the matrix $A$ is positive definite and self-adjoint. In addition, the sample covariance matrix $\Gamma = \sigma^2 A^{-1}$ is computed, where
$\sigma^2 = \dfrac{\|d - Ce\|^2}{m - n}$
The inverse matrix is returned as the "ainv=" optional argument. The scale factor $\sigma^2$ and $\Gamma$ are computed after returning from the routine. Also, see operator_ex06, Chapter 6.

use lin_sol_self_int
use rand_gen_int
use error_option_packet
implicit none
! This is Example 2 for LIN_SOL_SELF.
integer, parameter :: m=64, n=32
real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
real(kind(1e0)) err
real(kind(1e0)) a(n,n), b(n,1)
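The two triangular solves that follow a Cholesky factorization can be written out in a few lines. The sketch below is a simplified illustration, not the algorithm inside lin_sol_self: it assumes no pivoting (so $P = I$), builds its own small positive definite test matrix with the standard RANDOM_NUMBER intrinsic, and uses names chosen only for this example.

```fortran
program cholesky_solve_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 4
  real(dp) :: c(2*n, n), a(n, n), r(n, n), b(n), u(n), x(n)
  integer :: i, j

  ! Build a positive definite test matrix A = C^T C + I.
  call random_number(c)
  a = matmul(transpose(c), c)
  do i = 1, n
     a(i, i) = a(i, i) + 1d0
  end do
  call random_number(b)

  ! Factor A = R^T R with R upper triangular (no pivoting here).
  r = 0d0
  do j = 1, n
     r(j, j) = sqrt(a(j, j) - sum(r(1:j-1, j)**2))
     do i = j + 1, n
        r(j, i) = (a(j, i) - sum(r(1:j-1, j) * r(1:j-1, i))) / r(j, j)
     end do
  end do

  ! Forward substitution: solve R^T u = b.
  do i = 1, n
     u(i) = (b(i) - sum(r(1:i-1, i) * u(1:i-1))) / r(i, i)
  end do

  ! Back substitution: solve R x = u.
  do i = n, 1, -1
     x(i) = (u(i) - sum(r(i, i+1:n) * x(i+1:n))) / r(i, i)
  end do

  write (*,*) 'max |Ax - b| =', maxval(abs(matmul(a, x) - b))
end program cholesky_solve_sketch
```

With pivoting, the same two solves are applied to the permuted right-hand side $Pb$ and the result is permuted back, which is what the formulas $u = R^{-T} P b$ and $x = P^T R^{-1} u$ express.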
145. NaN(b(i,j)) == .true.
See the isNaN() function, Chapter 6.
Default: The arrays are not scanned for NaNs.

iopt(IO) = ?_options(?_lin_geig_gen_self_adj_pos, ?_dummy)
If both matrices A and B are self-adjoint and additionally B is positive-definite, then the Cholesky algorithm is used to reduce the matrix pencil to an ordinary self-adjoint eigenvalue problem.

iopt(IO) = ?_options(?_lin_geig_gen_for_lin_sol_self, ?_dummy)
iopt(IO+1) = ?_options((k=size of options for lin_sol_self), ?_dummy)
The options for lin_sol_self follow as data in iopt().

iopt(IO) = ?_options(?_lin_geig_gen_for_lin_eig_self, ?_dummy)
iopt(IO+1) = ?_options((k=size of options for lin_eig_self), ?_dummy)
The options for lin_eig_self follow as data in iopt().

iopt(IO) = ?_options(?_lin_geig_gen_for_lin_sol_lsq, ?_dummy)
iopt(IO+1) = ?_options((k=size of options for lin_sol_lsq), ?_dummy)
The options for lin_sol_lsq follow as data in iopt().

iopt(IO) = ?_options(?_lin_geig_gen_for_lin_eig_gen, ?_dummy)
iopt(IO+1) = ?_options((k=size of options for lin_eig_gen), ?_dummy)
The options for lin_eig_gen follow as data in iopt().

Description
Routine lin_geig_gen implements a standard algorithm that reduces a generalized eigenvalue or matrix pencil problem to an ordinary eigenvalue problem. An orthogonal decomposition is compute
146. PLICIT NONE
INTEGER, PARAMETER :: NPDE=2, N=51, NFRAMES=5
INTEGER I, IDO
! Define array space for the solution.
real(kind(1d0)) U(NPDE+1,N), T0, TOUT
real(kind(1d0)) :: ZERO=0D0, ONE=1D0, &
   DELTA_T=10D0, TEND=4D0
EXTERNAL IC_01, PDE_01, BC_01
! Start loop to integrate and write solution values.
IDO = 1
DO
   SELECT CASE (IDO)
! Define values that determine limits.
   CASE (1)
      T0 = ZERO
      TOUT = 1D-3
      U(NPDE+1,1) = ZERO; U(NPDE+1,N) = ONE
      OPEN(FILE='PDE_ex01.out', UNIT=7)
      WRITE(7, "(3I5, 4F10.5)") NPDE, N, NFRAMES, &
         U(NPDE+1,1), U(NPDE+1,N), T0, TEND
! Update to the next output point.
! Write solution and check for final point.
   CASE (2)
      WRITE(7,"(F10.5)") TOUT
      DO I=1,NPDE+1
         WRITE(7,"(4E15.5)") U(I,:)
      END DO
      T0 = TOUT; TOUT = TOUT*DELTA_T
      IF (T0 >= TEND) IDO = 3
      TOUT = MIN(TOUT, TEND)
! All completed. Solver is shut down.
   CASE (3)
      CLOSE(UNIT=7)
      EXIT
   END SELECT
! Forward communication is used for the problem data.
! ... (the call to PDE_1D_MG and the subroutine bodies are illegible in the source)
147. P_TYPES. This is inherited by using the module SPLINE_FITTING_INT. Examples 1-4 illustrate how this derived type is declared and assigned components.

The Derived Type Function spline_constraints
The user defines the constraints of the spline at discrete points by use of an array of derived type. Each entry of that array has components with the following definitions:
type ?_spline_constraints
   integer derivative_index
   real(kind(...)) where_applied
   character(len=...) constraint_indicator
   real(kind(...)) value_applied
end type
A generic function is packaged in the module SPLINE_FITTING_INT. Its values are arrays of derived type ?_spline_constraints, determined by the precision of the arguments.

The Evaluator Function spline_values
After computation of the B-spline coefficients, values of the spline, its derivative functions, or the square root of the variance function are evaluated with this function. Since a major use of the values is likely to be graphical display, a vector of input values yields a vector of output spline values of the same size as the input. The same quantities can be evaluated at a single independent variable value.

The Array Function spline_fitting
The coefficients of the B-spline are the output values of this generic function. The precision of the coefficients is determined through the generic interfa
148. ... Peter (1982), Essentials of Numerical Analysis, John Wiley & Sons, New York.

Hildebrand
Hildebrand, F.B. (1974), Introduction to Numerical Analysis, 2d ed., McGraw-Hill Book Co., New York.

IMSL
IMSL (1994), IMSL MATH/LIBRARY User's Manual, Version 3.0, Visual Numerics, Inc., Houston, Texas.

Koelbel et al.
Koelbel et al. (1994), The High Performance Fortran Handbook, MIT Press, Cambridge, MA.

Lawson and Hanson
Lawson, Charles L., and Hanson, R.J. (1995), Solving Least Squares Problems, Classics in Applied Mathematics 15, SIAM Publications, Philadelphia, PA.

Metcalf and Reid
Metcalf, M., and J. Reid (1990), Fortran 90 Explained, Oxford Science Publications, Oxford, United Kingdom.

Moré et al.
Moré, J.J., Garbow, B.S., and Hillstrom, K.E. (1982), Testing Unconstrained Minimization Software, ACM Trans. Math. Soft., 7, 1, pages 1-16.

NAG
NAG (1991), NAGWare: The Essential f90 Compiler, Releases 2.0a (NCSU4OIN).

Pennington, Berzins
Pennington, S.V., Berzins, M. (1994), New NAG Library Software for First-Order Partial Differential Equations, ACM Trans. Math. Soft., 20, 1, pages 63-99.

Rodrigue
Rodrigue, Garry (1982), Parallel Computation, Academic Press, New York, NY.

Snir, Otto, Huss-Lederman, Walker, and Dongarra
Snir, Marc, Steve Otto, Steven Huss-Lederman, David Walker, and Jack Dongar
149. $\ldots\, R_j(x,t,u,u_x)\bigr) - Q_j(x,t,u,u_x)$, $j = 1,\ldots,NPDE$, $x_L < x < x_R$, $t > t_0$, $m \in \{0,1,2\}$ (Equation 2)

The vector $u$ is the solution. The integer value $NPDE \ge 1$ is the number of differential equations. The functions $R_j$ and $Q_j$ can be regarded, in special cases, as flux and source terms. The functions $u$, $C_{j,k}$, $R_j$ and $Q_j$ are expected to be continuous. Allowed values $m = 0$, $m = 1$, and $m = 2$ are for problems in Cartesian, cylindrical polar, and spherical polar coordinates. In the two cases $m > 0$, the interval $[x_L, x_R]$ must not contain $x = 0$ as an interior point.

The boundary conditions have the master equation form
$\beta_j(x,t)\, R_j(x,t,u,u_x) = \gamma_j(x,t,u,u_x)$, at $x = x_L$ and $x = x_R$, $j = 1,\ldots,NPDE$ (Equation 3)

In the boundary conditions the $\beta_j$ and $\gamma_j$ are continuous functions of their arguments. In the two cases $m > 0$ and an endpoint occurs at 0, the finite value of the solution at $x = 0$ must be ensured. This requires the specification of the solution at $x = 0$, or implies that $R_j|_{x = x_L} = 0$ or $R_j|_{x = x_R} = 0$. The initial values satisfy $u(x,t_0) = u_0(x)$, $x \in [x_L, x_R]$, where $u_0$ is a piece-wise continuous vector function of $x$ with NPDE components.

The user must pose the problem so that mathematical definitions are known for the functions $C_{j,k}$, $R_j$, $Q_j$, $\beta_j$, $\gamma_j$ and $u_0$. These functions are provided to the routine PDE_1D_MG in the form of three subrouti
150. S is called to change PRINT or STOP attributes, errors at the current level are not affected. Also note the following:
Calls to E1MES at Level 1 should be surrounded by calls to E1PSH and E1POP so that the user can control printing and stopping.
If a PRINT or STOP attribute is set to NO (for example, by the user at Level 1), then it cannot be set to YES at any level greater than 1.

Q.9 Are tracebacks on for all messages?
A. Traceback is ON for all error types, but tracebacks are given only if printing occurs.

Q.10 How can I force a specific portion of my message to begin on a new line?
A. Insert the following two characters in the message. For example:
CALL E1PSH('MYSUB')
CALL E1MES(...)   ! arguments illegible in the source
CALL E1POP('MYSUB')
The resulting message might look like the following:
*** FATAL ERROR 104 from MYSUB. Line one.
*** ... (second line illegible in the source)

Q.11 Is there a way to avoid having trailing blanks removed from a string inserted into a message?
A. Yes, use a negative index. For example:
! ... (call illegible in the source)

Q.12 Why do error messages not print when the PRINT attributes are set to YES?
A. They should print when E1POP has reached Level 1, so that no more routine names remain on the stack.

Q.13 I used the error printing routine E1MES in my code. My function call to N1RTY(1) returned the correct error type, but no message printed. What
151. SELECT CASE (IDO)
! Define values that determine limits.
CASE (1)
   T0 = ZERO
   TOUT = 1D-3
   U(NPDE+1,1) = ZERO; U(NPDE+1,N) = ONE
   IOPT(1) = PDE_1D_MG_MAX_BDF_ORDER
   IOPT(2) = 5
   IOPT(3) = D_OPTIONS(PDE_1D_MG_RELATIVE_TOLERANCE, 1D-2)
   IOPT(4) = D_OPTIONS(PDE_1D_MG_ABSOLUTE_TOLERANCE, 1D-2)
   ! ... = MPI_WTIME()   (variable name illegible in the source)
! Update to the next output point.
! Write solution and check for final point.
CASE (2)
   T0 = TOUT; TOUT = TOUT*DELTA_T
   IF (T0 >= TEND) IDO = 3
   TOUT = MIN(TOUT, TEND)
! All completed. Solver is shut down.
CASE (3)
   ! ... = MPI_WTIME()   (variable name illegible in the source)
   EXIT
! Define initial data values.
CASE (5)
   U(1,:) = 1D0; U(2,:) = 0D0
! Define differential equations.
CASE (6)
   D_PDE_1D_MG_C = 0D0
   D_PDE_1D_MG_C(1,1) = 1D0; D_PDE_1D_MG_C(2,2) = 1D0
   D_PDE_1D_MG_R = P*D_PDE_1D_MG_DUDX
   D_PDE_1D_MG_R(1) = D_PDE_1D_MG_R(1)*EPS
   Z = ETA*(D_PDE_1D_MG_U(1)-D_PDE_1D_MG_U(2))/THREE
   D_PDE_1D_MG_Q(1) = EXP(Z)-EXP(-TWO*Z)
   D_PDE_1D_MG_Q(2) = -D_PDE_1D_MG_Q(1)
! Define boundary conditions.
CASE (7)
   IF (PDE_1D_MG_LEFT) THEN
      D_PDE_1D_MG_BETA(1) = 1D0; D_PDE_1D_MG_BETA(2) = 0D0
      D_PDE_1D_MG_GAMMA(1) = 0D0
      D_PDE_1D_MG_GAMMA(2) = D_PDE_1D_MG_U(2)
   ELSE
      D_PDE_1D_MG_BETA(1) = 0D0; D_PDE_1D_MG_BETA(2) = 1D0
      D_PDE_1D_MG_GAMMA(1) = D_PDE_1D_MG_U(1)-1D0
      D_PDE_1D_MG_GAMMA(2) = 0D0
   END IF
END SELECT
! Rev
152. SETUP_INT
IMPLICIT NONE
INCLUDE "mpif.h"
INTEGER, PARAMETER :: M=6, N=6, MB=2, NB=2, NIN=10
INTEGER CONTXT, DESC_A(9), NPROW, NPCOL, MYROW, &
   MYCOL, LDA, TDA   ! remaining names illegible in the source
real(kind(1d0)), allocatable :: A(:,:), d_A(:,:)
real(kind(1d0)) ERROR
TYPE(d_OPTIONS) IOPT(1)
MP_NPROCS = MP_SETUP()
CALL BLACS_PINFO(MP_RANK, MP_NPROCS)
! Make initialization for BLACS.
CALL BLACS_GET(0, 0, CONTXT)
! Approximate processor grid to be nearly square.
NPROW = sqrt(real(MP_NPROCS)); NPCOL = MP_NPROCS/NPROW
IF (NPROW*NPCOL < MP_NPROCS) THEN
   NPROW = 1; NPCOL = MP_NPROCS
END IF
CALL BLACS_GRIDINIT(CONTXT, 'Rows', NPROW, NPCOL)
! Get this processor's role in the process grid.
CALL BLACS_GRIDINFO(CONTXT, NPROW, NPCOL, MYROW, MYCOL)
BLOCK: DO
   LDA = NUMROC(M, MB, MYROW, 0, NPROW)
   TDA = NUMROC(N, NB, MYCOL, 0, NPCOL)
   ALLOCATE(d_A(LDA,TDA))
! A root process is used to create the matrix data for the test.
   IF (MP_RANK == 0) THEN
      ALLOCATE(A(M,N))
! Fill array with a pattern that is easy to recognize.
      K = 0
      DO
         K = K+1; IF (10**K > N) EXIT
      END DO
      DO J=1,N
         DO I=1,M
! The values will appear, as decimals I.J, where I is
! the row and J is the column.
            A(I,J) = REAL(I) + REAL(J)
153. ...
A = rand(A)
! Compute the singular value decomposition.
S = SVD(A, U=U, V=V)
! Check for small residuals of the expression A*V - U*S.
err = norm((A .x. V) - (U .x. diag(S)))/norm(S)
if (err <= sqrt(epsilon(one))) then
   write (*,*) 'Example 1 for LIN_SVD (operators) is correct.'
end if
end

Operator_ex22
use linear_operators
implicit none
! This is Example 2 (using operators) for LIN_SVD.
integer, parameter :: m=64, n=32, k=4
real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
real(kind(1d0)) a(m,n), s(n), u(m,m), v(n,n), &
   b(m,k), x(n,k), g(m,k), alpha(k), lamda(k), &
   delta_lamda(k), t_g(n,k), s_sq(n), phi(n,k), &
   phi_dot(n,k), move(k), err
! Generate a random matrix for both A and B.
A = rand(A); b = rand(b)
! Compute the singular value decomposition.
S = SVD(A, U=u, V=v)
! Choose alpha so that the lengths of the regularized solutions
! are 0.25 times lengths of the non-regularized solutions.
g = u .tx. b; x = v .x. diag(one/S) .x. g(1:n,:)
alpha = 0.25*sqrt(sum(x**2,DIM=1))
t_g = diag(S) .x. g(1:n,:); s_sq = s**2; lamda = zero
solve_for_lamda: do
   x = one/(spread(s_sq,DIM=2,NCOPIES=k)+ &
      spread(lamda,DIM=1,NCOPIES=n))
   phi = (t_g*x)**2; phi_dot = -2*phi*x
   delta_lamda = (sum(phi,DIM=1)-alpha**2)/sum(phi_dot,DIM=1)
! Make Newton method correction to solve the secular equations
! for lamda.
   lamda = lamda - delta_lamda
! Test for conv
154. ScaLAPACK routine. It uses the two-dimensional block-cyclic array descriptor for the matrix to extract the data from the assumed-size arrays on the processors. The blocks of data are transmitted and received, then written. The block sizes, contained in the array descriptor, determine the data set size for each blocking send and receive pair. The number of these synchronization points is proportional to $M \times N / (MB \times NB)$. A temporary local buffer is allocated for staging the matrix data. It is of size M by NB when writing by columns, or N by MB when writing by rows.

Example 1: Distributed Transpose of a Matrix, In Place
The program SCPK_EX1 illustrates an in-situ transposition of a matrix. An $m \times n$ matrix, $A$, is written to a file, by rows. The $n \times m$ matrix, $B = A^T$, overwrites storage for $A$. Two temporary files are created and deleted. The BLACS are used to define the process grid and provide further information identifying each process. This algorithm for transposing a matrix is not efficient. We use it to illustrate the read and write routines and the optional arguments for writing data by matrix rows.

program scpk_ex1
! This is Example 1 for ScaLAPACK_READ and ScaLAPACK_WRITE.
! It shows in-situ or in-place transposition of a
! block-cyclic matrix.
USE ScaLAPACK_SUPPORT
USE ERROR_OPTION_PACKET
USE MPI
155. T)
! Define processor grid to be 1 by MP_NPROCS.
NPROW = 1
CALL BLACS_GRIDINIT(CONTXT, 'N/A', NPROW, MP_NPROCS)
! Get this processor's role in the process grid.
CALL BLACS_GRIDINFO(CONTXT, NPROW, MP_NPROCS, &
   MYROW, MYCOL)
! Connect BLACS context with communicator MP_LIBRARY_WORLD.
CALL BLACS_GET(CONTXT, 10, MP_LIBRARY_WORLD)
! Setup for MPI.
MP_NPROCS = MP_SETUP()
DN = max(1, NP/MP_NPROCS)
ALLOCATE(IPART(2,MP_NPROCS))
! Spread columns evenly to the processors. Any odd
! number of columns are in the processor with highest rank.
IPART(1,:) = 1; IPART(2,:) = 0
DO L=2,MP_NPROCS
   IPART(2,L-1) = IPART(1,L-1)+DN
   IPART(1,L) = IPART(2,L-1)+1
END DO
IPART(2,MP_NPROCS) = NP
IPART(2,:) = min(NP, IPART(2,:))
! Note which processor (L-1) receives the right-hand side.
DO L=1,MP_NPROCS
   ! ... (statement illegible in the source)
END DO
K = max(0, IPART(2,MP_RANK+1)-IPART(1,MP_RANK+1)+1)
ALLOCATE(d_A(M,K), W(N), X(N), Y(N), &
   B(M), C(M), INDEX(N))
IF (MP_RANK == 0) THEN
   ALLOCATE(A(M,N))
! Define the matrix data using random values.
   A = rand(A); B = rand(B)
! Write the rows of data to an external file.
   OPEN(UNIT=NIN, FILE='Atest.dat', STATUS='UNKNOWN')
   DO I=1,M
      WRITE(NIN,*) (A(I,J),
156. TATUS='DELETE')
OPEN(UNIT=NIN, FILE='TEST.DAT', STATUS='OLD')
DO J=1,M,MB
   READ(NIN,*) ((A(I,L),I=1,N),L=J,min(M,J+MB-1))
END DO
CLOSE(NIN, STATUS='DELETE')
DO I=1,N
   DO J=1,M
! The values will appear, as decimals I.J, where I is the row
! and J is the column.
      A(I,J) = REAL(J)+REAL(I)*10d0**(-K) - A(I,J)
   END DO
END DO
ERROR = SUM(ABS(A))
END IF
! The processors in use now exit the loop.
EXIT BLOCK
END DO BLOCK
! See to any error messages.
call e1pop("Mp_setup")
! Check results on just one process.
IF (ERROR <= SQRT(EPSILON(ERROR)) .and. &
   MP_RANK == 0) THEN
   ! ... (write statement illegible in the source)
END IF
! Deallocate storage arrays and exit from BLACS.
IF (ALLOCATED(A)) DEALLOCATE(A)
IF (ALLOCATED(d_A)) DEALLOCATE(d_A)
! Exit from using this process grid.
CALL BLACS_GRIDEXIT(CONTXT)
CALL BLACS_EXIT(0)
END program

Example 2: Distributed Matrix Product with PBLAS
The program SCPK_EX2 illustrates computation of the matrix product $C_{m \times n} = A_{m \times k} B_{k \times n}$. The matrices on the right-hand side are random. Three temporary files are created and d
157. There are explicit solutions for this equation based on the Normal Curve of Probability. The normal curve, and the solution itself, can be efficiently computed with the IMSL function ANORDF, IMSL (1994, page 186). With numerical integration the equation itself or the payoff can be readily changed to include other formulas, $c(s,T)$, and corresponding boundary conditions. We use $e = 100$, $r = 0.08$, $T - t = 0.25$, $\sigma^2 = 0.04$, $s_L = 0$, and $s_R = 150$.

Rationale
This is a linear problem but with initial conditions that are discontinuous. It is necessary to use a positive time-smoothing value to prevent grid lines from crossing. We have used an absolute tolerance of $10^{-3}$. In $US, this is one-tenth of a cent.

program PDE_1D_MG_EX08
! Black-Scholes call price
USE pde_1d_mg_int
USE error_option_packet
IMPLICIT NONE
! ... (INTEGER declarations for NPDE, N, I, IDO, NFRAMES partly illegible in the source)
! Define array space for the solution.
real(kind(1d0)) U(NPDE+1,N), T0, TOUT, SIGSQ, XVAL
real(kind(1d0)) :: ZERO=0D0, HALF=5D-1, ONE=1D0, &
   DELTA_T=25D-3, TEND=25D-2, XMAX=150, SIGMA=2D-1, &
   E=100D0   ! remaining initializers illegible in the source
TYPE(D_OPTIONS) IOPT(5)
! Start loop to integrate and record solution values.
IDO = 1
DO
   SELECT CASE (IDO)
! Define values
   CASE
   ! ... (remainder illegible in the source)
158. There should be no FATAL error announcement within the prepmess_output file.

Private Message Files

Users can create a private message file within their own messages. This file would generally be used by an application that calls the IMSL Fortran 90 MP Library. Follow the steps outlined above to create a private messages.gls file. The user should then be given a copy of the prepmess executable. In the application code, call the error_post subprogram with the new_unit and new_path optional arguments. The new path should point to the directory in which the private messages.daf file resides.

rand_gen

use rand_gen_int

Generates a rank-1 array of random numbers. The output array entries are positive and less than 1 in value.

Required Argument

x (Output) Rank-1 array containing the random numbers.

Example 1: Running Mean and Variance

An array of random numbers is obtained. The sample mean and variance are computed. These values are compared with the same quantities computed using a stable method for the running means and variances, sequentially moving through the data. Details about the running mean and variance are found in Henrici (1982, pp. 21-23).

      implicit none
! This is Example 1 for RAND_GEN.
      integer i
      integer, parameter :: n=1000
      real(kind(1e0)), parameter :: one=1e0, zero=0e0
      real(kind(1e0)) x(n), mean_1(0:n), mean_2(0:n), s_1(0:n), s_2(0:n)
! Ob
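The comparison described in Example 1 can be sketched in standard Fortran without the library. This is a minimal sketch under stated assumptions: the intrinsic random_number stands in for rand_gen, and the running update is a Welford-style recurrence, which may differ in detail from the formulation in Henrici (1982). The program and variable names are illustrative only.

```fortran
program running_stats
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 1000
  integer :: i
  real(dp) :: x(n), mean, m2, delta, mean_batch, var_batch

  ! Assumption: random_number replaces the library routine rand_gen.
  call random_number(x)

  ! Batch (two-pass) sample mean and variance.
  mean_batch = sum(x) / n
  var_batch = sum((x - mean_batch)**2) / (n - 1)

  ! Running mean and sum of squared deviations, one value at a time.
  mean = 0.0_dp
  m2 = 0.0_dp
  do i = 1, n
    delta = x(i) - mean
    mean = mean + delta / i
    m2 = m2 + delta * (x(i) - mean)
  end do

  ! The running results should match the batch results to rounding error.
  if (abs(mean - mean_batch) <= sqrt(epsilon(1.0_dp)) .and. &
      abs(m2 / (n - 1) - var_batch) <= sqrt(epsilon(1.0_dp))) then
    write (*,*) 'Running mean and variance agree with the batch values.'
  end if
end program running_stats
```

The running form needs only the previous mean and accumulated deviation, so it suits data that arrive sequentially.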
159. Two nodes are required to execute use linear_operators use mpi_setup_int implicit none This is Parallel Example 4 for matrix exponential The box dimension has a single rack integer parameter n 32 k 128 nr 1 integer i real kind le0 real kind le0 real kind le0 complex kind 1 Vian Cora sy are eee parameter one le0 t_max one delta_t t_max k 1 err nr sizes nr A n n nr t k y n k nr y_prime n k nr SO 5 E mE nS On Gy ine BS s n sa iane 4 7 Ol Setup for MPI Establish a node priority order Restrict the root from significant computing Illustrates using the best performing node that SEG TOON TENE TeGXOIE Cor GE Socle rask MP_NPROCS MP_SETUP n MPI_ROOT_WORKS false Generate a random coefficient matrix A rand A Compute th igenvalu igenvector decomposition of the system coefficient matrix on an alternate node D EIG A W X Generate a random initial value for the ODE system y O rand y_0 Solve complex data system that transforms the initial values X z_0O y_0 NS OC pti WA HLS pies SS eae sees satie The grid of points where a solution is computed t i delta_t i 0 k 1 IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 209 Compute y and y at the values t 1 k With th igenvalu igenvector decomposition AX XD this is an evaluation
160. Fortran 90 Subroutines and Functions with MPI: Enhanced Subroutines and Functions for Distributed Scientific Applications. Fortran 90 MP Library User's Guide.

Printing an online file: Select Print from the File menu to print an online file. The dialog box that opens allows you to print full text, a range of pages, or a selection. Important Note: The last blank page of each chapter
161. $\Delta x_i = x_{i+1} - x_i$, $n_i = (\Delta x_i)^{-1}$,

$\mu_i = n_i - \kappa(\kappa+1)\,(n_{i+1} - 2n_i + n_{i-1})$, $0 \le i \le N$, with $n_{-1} \equiv n_0$ and $n_{N+1} \equiv n_N$.

The values $n_i$ are the so-called point concentration of the grid, and $\kappa \ge 0$ denotes a spatial smoothing parameter. Now the grid points are defined implicitly so that

$\dfrac{\mu_{i-1} + \tau\,\frac{d\mu_{i-1}}{dt}}{M_{i-1}} = \dfrac{\mu_i + \tau\,\frac{d\mu_i}{dt}}{M_i}$, $1 \le i \le N$,

where $\tau \ge 0$ is a time-smoothing parameter. Choosing $\tau$ very large results in a fixed grid. Increasing the value of $\tau$ from its default avoids the error condition where grid lines cross. The divisors are

$M_i^2 = \alpha + NPDE^{-1} \sum_{j=1}^{NPDE} \dfrac{(U_{i+1}^j - U_i^j)^2}{(\Delta x_i)^2}$.

The value $\kappa$ determines the level of clustering or spatial smoothing of the grid points. Decreasing $\kappa$ from its default decreases the amount of spatial smoothing. The parameters $M_i$ approximate arc length and help determine the shape of the grid or $x_i$ distribution. The parameter $\tau$ prevents the grid movement from adjusting immediately to new values of the $M_i$, thereby avoiding oscillations in the grid that cause large relative errors. This is important when applied to solutions with steep gradients.

The discrete form of the differential equation and the smoothing equations are combined to yield the implicit system of differential equations

$A(Y)\,\dfrac{dY}{dt} = L(Y)$

(The three-tiered equal sign, used here and below, is read "$a \equiv b$" or "a and b are exactly the same object or value.")
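As an illustration of the spatial smoothing step, the following sketch computes the point concentrations n_i = 1/(x_{i+1} - x_i) and the smoothed values mu_i = n_i - kappa*(kappa+1)*(n_{i+1} - 2*n_i + n_{i-1}) for a small non-uniform grid, using the end conventions n_{-1} = n_0 and n_{N+1} = n_N. It is not the library's implementation: the grid, the value of kappa, and all names are assumptions chosen for the example.

```fortran
program grid_concentration
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 8               ! intervals are indexed 0..n
  real(dp), parameter :: kappa = 2.0_dp     ! assumed smoothing value, illustration only
  real(dp) :: x(0:n+1), conc(-1:n+1), mu(0:n)
  integer :: i

  ! Hypothetical non-uniform grid on [0,1]; points cluster toward x = 0.
  do i = 0, n + 1
    x(i) = (real(i, dp) / real(n + 1, dp))**2
  end do

  ! Point concentrations n_i, extended by n_{-1} = n_0 and n_{N+1} = n_N.
  do i = 0, n
    conc(i) = 1.0_dp / (x(i+1) - x(i))
  end do
  conc(-1) = conc(0)
  conc(n+1) = conc(n)

  ! Spatially smoothed concentrations mu_i.
  do i = 0, n
    mu(i) = conc(i) - kappa*(kappa + 1.0_dp) &
            *(conc(i+1) - 2.0_dp*conc(i) + conc(i-1))
  end do

  write (*,'(a)') '   i         n_i        mu_i'
  do i = 0, n
    write (*,'(i4,2f12.4)') i, conc(i), mu(i)
  end do
end program grid_concentration
```

On a uniform grid the second difference vanishes, so mu_i equals n_i; the smoothing term only acts where the spacing changes from one interval to the next.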
162. X array of the same precision as the data. For rank-1 transforms the size of WORK is n + 15. To define this array for each problem, set WORK(1) = 0. Each additional rank adds the dimension of the transform plus 15. Using the optional argument WORK increases the efficiency of the transform. This function uses routines fast_dft, fast_2dft, and fast_3dft from Chapter 3.

The option and derived type names are given in the following tables:

Option Name for IFFT       Option Value
options_for_fast_dft       1

Derived Type    Name of Unallocated Array
s_options       s_ifft_options
s_options       s_ifft_options_once
d_options       d_ifft_options
d_options       d_ifft_options_once

Example

Compute the DFT of a random complex array and its inverse transform.

      x = rand(x); y = fft(x); x = ifft(y)

IFFT_BOX

The inverse Discrete Fourier Transform of several complex or real sequences.

Required Argument

The function requires one argument, x. If x is an assumed-shape complex array of rank 2, 3, or 4, the result is the complex array of the same shape and rank consisting of the inverse DFT.

Modules

Use the appropriate module, ifft_box_int or linear_operators.

Optional Variables, Reserved Names

The optional argument is WORK, a COMPLEX array of the same precision as the data. For rank-1 transforms the size of WORK is n + 15. To define this array for
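The round trip in the IFFT example (transform a random complex array, then invert it) can be illustrated with a direct O(n^2) DFT in standard Fortran. This is a sketch, not the library's fast algorithm. The scaling convention here, an unscaled forward transform and a 1/n factor on the inverse, is an assumption and may differ from the library's default.

```fortran
program dft_round_trip
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 16
  real(dp), parameter :: pi = 3.14159265358979323846_dp
  complex(dp) :: x(0:n-1), y(0:n-1), z(0:n-1)
  real(dp) :: re(0:n-1), im(0:n-1), angle
  integer :: j, k

  ! Random complex input (random_number stands in for the rand function).
  call random_number(re)
  call random_number(im)
  x = cmplx(re, im, kind=dp)

  ! Forward transform: y(k) = sum_j x(j) * exp(-2*pi*i*j*k/n).
  do k = 0, n - 1
    y(k) = (0.0_dp, 0.0_dp)
    do j = 0, n - 1
      angle = -2.0_dp*pi*real(j*k, dp)/real(n, dp)
      y(k) = y(k) + x(j)*cmplx(cos(angle), sin(angle), kind=dp)
    end do
  end do

  ! Inverse transform with the assumed 1/n scaling.
  do j = 0, n - 1
    z(j) = (0.0_dp, 0.0_dp)
    do k = 0, n - 1
      angle = 2.0_dp*pi*real(j*k, dp)/real(n, dp)
      z(j) = z(j) + y(k)*cmplx(cos(angle), sin(angle), kind=dp)
    end do
    z(j) = z(j)/real(n, dp)
  end do

  ! The inverse should reproduce the input to rounding error.
  if (maxval(abs(z - x)) <= sqrt(epsilon(1.0_dp))*maxval(abs(x))) then
    write (*,*) 'Inverse DFT recovers the original array.'
  end if
end program dft_round_trip
```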
163. _NONNEGATIVE_LSQ

Option Name                 Option Value
PNLSQ_SET_TOLERANCE         1
PNLSQ_SET_MAX_ITERATIONS    2
PNLSQ_SET_MIN_RESIDUAL      3

IOPT(IO) = ?_OPTIONS(PNLSQ_SET_TOLERANCE, TOLERANCE)
Replaces the default rank tolerance for using a column, from EPSILON(TOLERANCE) to TOLERANCE. Increasing the value of TOLERANCE will cause fewer columns to be moved from their constraints, and may cause the minimum residual RNORM to increase.

IOPT(IO) = ?_OPTIONS(PNLSQ_SET_MIN_RESIDUAL, RESID)
Replaces the default target for the minimum residual vector length from 0 to RESID. Increasing the value of RESID can result in fewer iterations and thus increased efficiency. The descent in the optimization will stop at the first point where the minimum residual RNORM is smaller than RESID. Using this option may result in the dual vector not satisfying its optimality conditions, as noted above.

IOPT(IO) = PNLSQ_SET_MAX_ITERATIONS
IOPT(IO+1) = NEW_MAX_ITERATIONS
Replaces the default maximum number of iterations from 3*N to NEW_MAX_ITERATIONS. Note that this option requires two entries in the derived type array.

Algorithm

Subroutine PARALLEL_NONNEGATIVE_LSQ solves the linear least-squares system $Ax \cong b,\ x \ge 0$, using the algorithm NNLS found in Lawson and Hanson (1995), pages 160
164. _PDE_1D_MG_R 2 D_PDE_1D_MG_DUDX 1 D_PDE_1D_MG_Q 1 ZERO D_PDE_1D_MG_Q 2 amp D_ PDE 1D MG_U 2 D_PDE_1D_MG_DUDX 1 Define boundary conditions CASE 7 D_PDE_1D_MG_BETA ZERO IF PDE_1D_MG_LEFT THEN DIFF EXP 20D0 D_PDE_1D_MG_T Blend the left boundary value down to zero D_PDE_1D_MG_GAMMA D_PDE_1D_MG_U 1 DIFF D_PDE_1D_MG_U 2 ELSE D_PDE_1D_MG_GAMMA D_PDE_1D_MG_U 1 ONE D_PDE_1D_MG_DUDX 2 END IF END SELECT Reverse communication is used for the problem data CALL PDE_1D_MG TO TOUT IDO U IOPT IOPT END DO end program 282 Chapter 8 Partial Differential Equations IMSL Fortran 90 MP Library 4 0 Example 3 Population Dynamics This example is from Pennington and Berzins 1994 The system is u u 1 t u x 0 lt x lt a xg t20 a I t Ju x t ax 0 u 0 t g folx a t st where z xy exp x ee y 1 and 42 2 2 exp a exp t 1 exp a 1 1 2a exp 2a 1 exp a exp t This is a notable problem because it involves the unknown exp x u x t 1 exp a exp t across the entire domain The software can solve the problem by introducing two dependent algebraic equations u x t dx v t 0 v gt t J xexo x u x t ae 0 This leads to the modified system u U VyU OS xa t20 1 u 0 t g ivy v 1 In the interface to the evaluation of the differential equation and boundary conditi
165. _RANK 1 JSHIFT J IPART 1 MP_RANK 1 1 X J dot_product B ASAVE JSHIFT END DO This cleans up residuals that are about rounding error unit times the size of the constraint equation and right hand side They are replaced by exact zero WHERE W ZERO X ZERO W X Each group of residuals is disjoint per processor We add all the pieces together for the total set of IRComs terscuenie ses IF MP_NPROCS gt 1 amp CALL MPI_REDUCE X W N MPI_DOUBLE_P MPI_SUM 0 MP_LIBRARY WORLD IERRO IF MP_RANK and PRINT amp call show W Residuals for the constraints ECISION amp See to any errors and shut down MPI P_NPROCS MP_SETUP Final P_RANK 0 THEN COUNT W lt ZERO and amp ZERO gt F WRITE amp Example 1 for PARALLEL _BOUNDED_LSQ is correct al Example 2 Distributed Newton Raphson Method with Step Control The program PBLSQ_EX2 illustrates the computation of the solution of a non linear system of equations We use a constrained Newton Raphson method This algorithm works with the problem chosen for illustration The step size control used here employing only simple bounds may not work on other non linear systems of equations Therefore we do not recommend the simple non linear solving technique illustrated here for an arbitrary problem The test c
166. _array must be used together.

work_array = w(:) (Output/Input)
Complex array of rank-1 used to store working variables and values between calls to fast_3dft. The value for size(w) must be at least as large as the value -ido for the value of ido < 0.

iopt = iopt(:) (Input/Output)
Derived type array with the same precision as the input array; used for passing optional data to fast_3dft. The options are as follows:

Packaged Options for fast_3dft
Option Prefix = ?    Option Name                  Option Value
c_, z_               fast_3dft_scan_for_NaN       1
c_, z_               fast_3dft_near_power_of_2    2
c_, z_               fast_3dft_scale_forward      3
c_, z_               fast_3dft_scale_inverse      4

iopt(IO) = ?_options(?_fast_3dft_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that isNaN(x(i,j,k)) == .true. See the isNaN() function, Chapter 6. Default: Does not scan for NaNs.

iopt(IO) = ?_options(?_fast_3dft_near_power_of_2, ?_dummy)
Nearest powers of 2 >= m, n, and k are returned as outputs in iopt(IO+1)%idummy, iopt(IO+2)%idummy, and iopt(IO+3)%idummy.

iopt(IO) = ?_options(?_fast_3dft_scale_forward, real_part_of_scale)
iopt(IO+1) = ?_options(?_dummy, imaginary_part_of_scale)
Complex number defined by the factor cmplx(real_part_of_scale, imaginary_part_of_scale) is multiplied by the forward transformed array. Default value is 1.

iopt(IO) = ?_options(?_fa
167. _eig_gen_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value such that isNaN(a(i,j)) == .true. See the isNaN() function, Chapter 6. Default: The array is not scanned for NaNs.

iopt(IO) = ?_options(?_lin_eig_no_balance, ?_dummy)
The input matrix is not preprocessed by searching for isolated eigenvalues followed by rescaling. See Golub and Van Loan (1989, Chapter 7) for references. With some optional uses of the routine, this option flag is required. Default: The matrix is first balanced.

iopt(IO) = ?_options(?_lin_eig_gen_set_iterations, ?_dummy)
Resets the maximum number of iterations permitted to isolate each diagonal block matrix. Default: The maximum number of iterations is 52.

iopt(IO) = ?_options(?_lin_eig_gen_in_Hess_form, ?_dummy)
The input matrix is in upper Hessenberg form. This flag is used to avoid the initial reduction phase, which may not be needed for some problem classes. Default: The matrix is first reduced to Hessenberg form.

iopt(IO) = ?_options(?_lin_eig_gen_out_Hess_form, ?_dummy)
The output matrix is transformed to upper Hessenberg form, H. If the optional argument "v=v(:,:)" is passed by the calling program unit, then the array V contains an orthogonal matrix Q such that $AQ - QH = 0$. Requires the simultaneous use of option ?_lin_eig_no_balance. Default: The matrix is reduced to diagonal form.

iopt(IO) = ?_options(?_lin_eig_gen_out_block_form, ?_dummy)
The
168. _save_factors, zero)
! Suppress error messages and stopping due to singularity
! of the matrix, which is expected.
      iopti(3) = d_options(d_lin_sol_self_no_sing_mess, zero)
      atemp = a
      do i=1, n
         a(i,i) = a(i,i) - e(k)
      end do
! Compute A-eigenvalue*I as the coefficient matrix.
      do tries=1, 2
         call lin_sol_self(a, b, x, &
              pivots=ipivots, iopt=iopti)
! When code is re-entered, the already computed factorization
! is used.
         iopti(4) = d_options(d_lin_sol_self_solve_A, zero)
! Reset right-hand side nearly in the direction of the eigenvector.
         b = x/sqrt(sum(x**2))
      end do
! Normalize the eigenvector.
      x = x/sqrt(sum(x**2))
! Check the results.
      err = dot_product(x(1:n,1), matmul(atemp(1:n,1:n), x(1:n,1))) - &
            e(k)
! If any result is not accurate, quit with no summary printing.
      if (abs(err) <= sqrt(epsilon(one))*e(1)) then
         write (*,*) 'Example 3 for LIN_SOL_SELF is correct.'
      end if
      end

Example 4: Accurate Least-squares Solution with Iterative Refinement

This example illustrates the accurate solution of the self-adjoint linear system

$\begin{bmatrix} I & A \\ A^T & 0 \end{bmatrix} \begin{bmatrix} r \\ x \end{bmatrix} = \begin{bmatrix} b \\ 0 \end{bmatrix}$

computed using iterative refinement. This solution method is appropriate for least-squares problems when an accurate solution is required. The solution and residuals are accumulated in double precision, while the decomposition is computed in single precision.
169. _sol_gen_solve_ADJ          4
s_, d_, c_, z_    lin_sol_gen_no_pivoting      5
s_, d_, c_, z_    lin_sol_gen_scan_for_NaN     6
s_, d_, c_, z_    lin_sol_gen_no_sing_mess     7
s_, d_, c_, z_    lin_sol_gen_A_is_sparse      8

iopt(IO) = ?_options(?_lin_sol_gen_set_small, Small)
Replaces a diagonal term of the matrix U if it is smaller in magnitude than the value Small, using the same sign or complex direction as the diagonal. The system is declared singular. A solution is approximated based on this replacement if no overflow results. Default: the smallest number that can be reciprocated safely.

iopt(IO) = ?_options(?_lin_sol_gen_set_save_LU, ?_dummy)
Saves the LU factorization of A. Requires the optional argument "pivots=" if the routine will be used later for solving systems with the same matrix. This is the only case where the input arrays A and b are not saved. For solving efficiency, the diagonal reciprocals of the matrix U are saved in the diagonal entries of A.

iopt(IO) = ?_options(?_lin_sol_gen_solve_A, ?_dummy)
Uses the LU factorization of A computed and saved to solve $Ax = b$.

iopt(IO) = ?_options(?_lin_sol_gen_solve_ADJ, ?_dummy)
Uses the LU factorization of A computed and saved to solve $A^T x = b$.

iopt(IO) = ?_options(?_lin_sol_gen_no_pivoting, ?_dummy)
Does no row pivoting. The array pivots(:), if present, is output as pivots(i) = i, for i = 1, ..., n.

iopt(IO) = ?_options(?_lin_sol_gen_scan_for_NaN, ?_dummy)
Examines each input array entry to find the first value suc
170. _sol_tri_ex3 lin_sol_tri_ex4 lin_svd_exl lin_svd_ex2 lin_svd_ex3 lin_svd_ex4 lin_eig_self_exl lin_eig_self_ex2 lin_eig_self_ex3 lin_eig_self_ex4 lin_eig_gen_ex1 lin_eig_gen_ex2 lin_eig_gen_ex3 lin_eig_gen_ex4 lin_geig_gen_ex1l lin_geig_gen_ex2 lin_geig_gen_ex3 lin_geig_gen_ex4 B 2 e Appendix B List of Examples Compress an image the black interior of an approximate circle using SVD Inversion of the Laplace Transform of a unit step function using SVD Solve many tridiagonal systems using cyclic reduction with random data Solve many tridiagonal systems using iterative refinement Switch solution method from Cyclic Reduction to Gaussian Elimination if required Uses random data Solve for selected eigenvectors of a tridiagonal matrix Switch solution method from Cyclic Reduction to Gaussian Elimination if required Uses random data Solve a One Dimensional diffusion PDE Uses the IMSL MATH LIBRARY DAE solver D2SPG Solves the tridiagonal corrector equations in reverse communication mode Outer loop solves a boundary value problem Compute SVD of a square matrix with random data Use SVD to solve linear least squares problem with a quadratic constraint Uses random data Use SVD to compute a GSVD of two random matrices Use SVD to solve a linear least squares problem based on ridge regression as cross validation Uses random data Compute eigenvalues of a self adjoint matri
171. a reactant in a chemical system The formula for h z is equivalent to their example IMSL Fortran 90 MP Library 4 0 Chapter 8 Partial Differential Equations 289 u u th u where h z f isa z exp 5 1 z 1 a a 1 6 20 R 5 O0 lt x lt 1 0 lt t lt 0 29 u x 0 1 u 0 x 0 u l x 1 Rationale This is a non linear problem The output shows a case where a rapidly changing front or hot spot develops after a considerable way into the integration This causes rapid change to the grid An option sets the maximum order BDF formula from its default value of 2 to the theoretical stable maximum value of 5 USE pde_ld_mg_int USE error_option_packet IMPLICIT NONE INTEGER PARAMETER NPDE 1 N 80 INTEGER I IDO NFRAMES Define array space for the solution real kind 1d0 U NPDE 1 N TO TOUT kind 1d0 ZERO 0D0 ONE 1D0 DELTA_T 1D 2 amp ND 29D 2 XMAX 1D0 A 1D0 DELTA 2D1 R 5D0 D_OPTIONS IOPT 2 Start loop to integrate and record solution values IDO 1 DO SE H ECT CASE IDO Define value t determine limits CAS Fn th OUT DELTA_T U NPDE 1 1 ZERO U NPDE 1 N XMAX OPEN FILE PDE_ex06 out UNIT 7 NFRAMES TEND DELTA_T DELTA_T WRITE 7 3I5 4D14 5 NPDE N NFRAMES amp U NPDE 1 1 U NPDE 1 N TO TEND Illustrate allowing the BDF order to increase to its maximum allowed val
172. abs(c-a))/maxval(abs(c))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 1 for FAST_2DFT is correct.'
      end if
      end

Optional Arguments

forward_in = x (Input)
Stores the input complex array of rank-2 to be transformed.

forward_out = y (Output)
Stores the output complex array of rank-2 resulting from the transform.

inverse_in = y (Input)
Stores the input complex array of rank-2 to be inverted.

inverse_out = x (Output)
Stores the output complex array of rank-2 resulting from the inverse transform.

mdata = m (Input)
Uses the sub-array in first dimension of size m for the numbers. Default value: m = size(x,1).

ndata = n (Input)
Uses the sub-array in the second dimension of size n for the numbers. Default value: n = size(x,2).

ido = ido (Input/Output)
Integer flag that directs user action. Normally, this argument is used only when the working variables required for the transform and its inverse are saved in the calling program unit. Computing the working variables and saving them in internal arrays within fast_2dft is the default. This initialization step is expensive. There is a two-step process to compute the working variables just once. Example 3 illustrates this usage. The general algorithm for this usage is to enter fast_2dft with ido = 0. A return occurs thereafter with ido < 0. The optional rank-1 complex array w(:) with size(w) >=
173. active bounds for the solution. All processors in the communicator start and exit with the same vector.

BND(1:2,1:N) (Input)
Assumed-size array containing the bounds for x. The lower bound $\alpha_j$ is in BND(1,J), and the upper bound $\beta_j$ is in BND(2,J).

X(1:N) (Output)
Assumed-size array of length N containing the solution, $\alpha \le x \le \beta$. The value SIZE(X) defines the value of N. All processors exit with the same vector.

RNORM (Output)
Scalar that contains the Euclidean or least-squares length of the residual vector, $\|Ax - b\|$. All processors exit with the same value.

W(1:N) (Output)
Assumed-size array of length N containing the dual vector, $w = A^T(b - Ax)$. At a solution exactly one of the following is true for each $j$, $1 \le j \le n$:
   $\alpha_j = x_j = \beta_j$, and $w_j$ arbitrary
   $\alpha_j = x_j$, and $w_j \le 0$
   $x_j = \beta_j$, and $w_j \ge 0$
   $\alpha_j < x_j < \beta_j$, and $w_j = 0$
All processors exit with the same vector.

INDEX(1:N) (Output)
Assumed-size array of length N containing the NSETP indices of columns in the solution interior to bounds, and the remainder that are at a constraint. All processors exit with the same array.

IPART(1:2,1:max(1,MP_NPROCS)) (Input)
Assumed-size array containing the partitioning describing the matrix A. The value MP_NPROCS is the number of processors in the communicator, except when MPI has been finalized with a call to the routine MP_SETUP('Final'). This causes MP_NPROCS to be
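The optimality conditions on the dual vector w = A^T(b - Ax) can be checked directly once a candidate solution is known. The sketch below does this for a tiny serial problem in standard Fortran; it does not call PARALLEL_BOUNDED_LSQ. The data are assumptions chosen so the answer is known in advance: with A equal to the identity, the bounded least-squares solution is b clipped to the bounds.

```fortran
program bounded_lsq_dual_check
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 2
  real(dp), parameter :: tol = 1.0e-12_dp
  real(dp) :: a(n,n), b(n), x(n), w(n), bnd(2,n)
  logical :: ok
  integer :: j

  ! Hypothetical data: identity matrix, bounds 0 <= x_j <= 1.
  a = reshape((/ 1.0_dp, 0.0_dp, 0.0_dp, 1.0_dp /), (/ n, n /))
  b = (/ 2.0_dp, -1.0_dp /)
  bnd(1,:) = 0.0_dp
  bnd(2,:) = 1.0_dp

  ! Candidate solution: b clipped to the bounds.
  x = (/ 1.0_dp, 0.0_dp /)

  ! Dual vector w = A^T (b - A x).
  w = matmul(transpose(a), b - matmul(a, x))

  ok = .true.
  do j = 1, n
    if (bnd(1,j) == bnd(2,j)) then
      cycle                               ! fixed variable: w_j is arbitrary
    else if (x(j) <= bnd(1,j) + tol) then
      ok = ok .and. (w(j) <= tol)         ! at lower bound: w_j <= 0
    else if (x(j) >= bnd(2,j) - tol) then
      ok = ok .and. (w(j) >= -tol)        ! at upper bound: w_j >= 0
    else
      ok = ok .and. (abs(w(j)) <= tol)    ! interior: w_j = 0
    end if
  end do

  if (ok) write (*,*) 'Dual vector satisfies the optimality conditions.'
end program bounded_lsq_dual_check
```

Here w = (1, -1): the first component sits at its upper bound with a non-negative dual value, and the second sits at its lower bound with a non-positive one.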
174. aise END DO END DO IF MP_NPROCS gt 1 n GS As ae aay res Gays PN Gof ne and MPI_NODE_PRIORITY 1 0 BEST 2 Only the most The rest set i ffective node does this job dl IF MP_RANK Compute the si S SVD A How many terms k count Mee ae Wee es Ie TOKE ALS NOR IF MPI_NOD MPI_NODE_PRIORITY BEST EXIT BLOCK ngular value decomposition U U V V to the nearest integer match the circle e aa e oo ESKES E eaae VEE the most efficient node send C back amp PRIORITY BEST ORS CALL MPI_S EXIT BLOCK END DO BLOCK There may be a IF MPI_NOD CALL MPI MP_LIB aie Coume write See tO amy eri END C N 2 MPI_REAL 0 MP_RANK MP_LIBRARY_WORLD IERROR matrix to receive from the best node E_PRIORITY BEST gt 0 and MP_RANK amp _RECV C N 2 MPI_REAL MPI_ANY SOURCE MPI_ANY TAG amp RARY_WORLD STATUS IERROR Genin C A 0 O Een MS AN amp Parallel Example 16 is correct mp_nprocs end or messages and exit MPI mp_setup Final Parallel Example 17 Occasionally it is necessary to print output from all nodes of a communicator This example has each non root node prepare the output it will print in a character buffer Then each node in turn the character buffer is transmitted to the root The root prints the b
175. aj Define values CASE at determine limits F 0 O U OP NE WR N DE 1 1 ZERO U NPDE 1 N XMAX FILE PDE_ex05 out UNIT 7 S NINT TEND DELTA_T DELTA_T 7 3I5 4D14 5 NPDE N NFRAMES amp D iw E H ka D D Z wa Ae VaQHDAMa2AciIaAdc ljo ang NPDE 1 1 U NPDE 1 N TO TEND IO 1 PDE_1D_MG_REV_COMM_FACTOR_SOLVE Update to the next output point Write solution and check for final point CASE 2 TO TOUT IF TO lt TEND THEN WRITE 7 F10 5 TOUT DO I 1 NPDE 1 WRITE 7 4E15 5 U I END DO TOUT MIN TOUT DELTA_T TEND IF TO TEND IDO 3 END IF All completed Solver is shut down CASE 3 CLOSE UNIT 7 EXIT Define initial data values CASE 5 U 1 ONE U 2 2D 1 WRITE 7 F10 5 TO DO I 1 NPDE 1 WRITE 7 4E15 5 U I Define differential equations CASE 6 D _1D_MG_C ZERO D D_MG_C 1 1 ONE D_PDE_1D_MG_C 2 2 ON Gl UO As iw ti 1D_MG_R D_PDE_1D_MG_DUDX D_PDE_1D MG Q 1 D_PDE_1D MG _U 1 F D_PDE_1D MG U 2 288 Chapter 8 Partial Differential Equations IMSL Fortran 90 MP Library 4 0
176. al the singular values.
      if (sum(abs(abs(D) - S)) <= &
          sqrt(epsilon(one))*S(1)) then
         write (*,*) 'Example 1 for LIN_EIG_SELF is correct.'
      end if
      end

Optional Arguments

NROWS = n (Input)
Uses array A(1:n, 1:n) for the input matrix. Default: n = size(A, 1)

v = v(:,:) (Output)
Array of the same type and kind as A(1:n, 1:n). It contains the n x n orthogonal matrix V.

iopt = iopt(:) (Input)
Derived type array with the same precision as the input matrix; used for passing optional data to the routine. The options are as follows:

Packaged Options for lin_eig_self
Option Prefix = ?    Option Name                     Option Value
s_, d_, c_, z_       lin_eig_self_set_small          1
s_, d_, c_, z_       lin_eig_self_overwrite_input    2
s_, d_, c_, z_       lin_eig_self_scan_for_NaN       3
s_, d_, c_, z_       lin_eig_self_use_QR             4
s_, d_, c_, z_       lin_eig_self_skip_Orth          5
s_, d_, c_, z_       lin_eig_self_use_Gauss_elim     6
s_, d_, c_, z_       lin_eig_self_set_perf_ratio     7

iopt(IO) = ?_options(?_lin_eig_self_set_small, Small)
If a denominator term is smaller in magnitude than the value Small, it is replaced by Small. Default: the smallest number that can be reciprocated safely.

iopt(IO) = ?_options(?_lin_eig_self_overwrite_input, ?_dummy)
Do not save the input array A(:,:).

iopt(IO) = ?_options(?_lin_eig_self_scan_for_NaN, ?_dummy)
Examines each input array entry to
177. al and Warning Error Messages

See the messages.gls file for error messages for lin_svd. These error messages are numbered 1001-1010; 1021-1030; 1041-1050; 1061-1070.

lin_eig_self

Computes the eigenvalues of a self-adjoint matrix, A. Optionally, the eigenvectors can be computed. This gives the decomposition $A = VDV^T$, where V is an n x n orthogonal matrix and D is a real diagonal matrix.

Required Arguments

A (Input/Output) Array of size n x n containing the matrix.

d (Output) Array of size n containing the eigenvalues. The values are in order of decreasing absolute value.

Example 1: Computing Eigenvalues

The eigenvalues of a self-adjoint matrix are computed. The matrix $A = C + C^T$ is used, where C is random. The magnitudes of eigenvalues of A agree with the singular values of A. Also, see operator_ex25, Chapter 6.

      use lin_eig_self_int
      use lin_sol_svd_int
      use rand_gen_int
      implicit none
! This is Example 1 for LIN_EIG_SELF.
      integer, parameter :: n=64
      real(kind(1e0)), parameter :: one=1e0
      real(kind(1e0)) :: A(n,n), b(n,0), D(n), S(n), x(n,0), y(n*n)
! Generate a random matrix and from it
! a self-adjoint matrix.
      call rand_gen(y)
      A = reshape(y,(/n,n/))
      A = A + transpose(A)
! Compute the eigenvalues of the matrix.
      call lin_eig_self(A, D)
! For comparison, compute the singular values.
      call lin_sol_svd(A, b, x, nrhs=0, s=S)
! Check the results: Magnitude of eigenvalues should equ
178. allel_bounded_LSQ to solve a non linear system of equations The example is an ACM TOMS test problem except for the larger size It is Brown s Almost Linear EUN GERONEN USE RROR_OPTION_PACKET USEUPBLSO_INT USE PETESE RUPSEN USE SHOW INT USE Numerical_Libraries ONLY NIRTY IMPLICIT NONE INTEGER PARAMETER N 200 MAXIT 5 REAL KIND 1D0 PARAMETER ZERO 0D0 ONE 1D0 amp HALF 5D 1 TWO 2D0 REAL KIND 1D0 ALLOCATABLE amp M878 7 Ber BNDL neda 2B 7 SCE WE REAL KIND 1D0 RNORM INTEGER ALLOCATABLE INDEX IPART INTEGER K L DN J JSHIFT IERROR NSETP amp INSERM Zi ele EaR PRINT false TYPE D2OPTIONS LOPT 3 l Serijo owr MEg MP_NPROCS MP_SETUP DN N max 1 max 1 MP_NPROCS 1 ALLOCATE IPART 2 max 1 MP_NPROCS Spread Jacobian matrix columns evenly to the processors IPART 1 1 1 DO L 2 MP_NPROCS IPART 2 L 1 IPART 1 1 1 DN IPART 1 L IPART 2 1L 1 1 258 e Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers IMSL Fortran 90 MP Library 4 0 his i inear PH ITER 0 I en E ND DO OCAT X N E A N K W N s BENICE LOT Y X HALF Ian CA iL ERS ait ah N M ETHOD DO EWTON_ Set bo excep BND 1
179. ample 4: Least-squares with an Equality Constraint

This example solves a least-squares system $Ax \cong b$ with the constraint that the solution values have a sum equal to the value 1. To solve this system, one heavily weighted row vector and right-hand side component is added to the system corresponding to this constraint. Note that the weight used is $\varepsilon^{-1/2}$, where $\varepsilon$ is the machine precision, but any larger value can be used. The fact that lin_sol_lsq performs row pivoting in this case is critical for obtaining an accurate solution to the constrained problem solved using weighting. See Golub and Van Loan (1989, Chapter 12) for more information about this method. Also, see operator_ex12, Chapter 6.

      use lin_sol_lsq_int
      use rand_gen_int
      implicit none
! This is Example 4 for LIN_SOL_LSQ.
      integer, parameter :: m=64, n=32
      real(kind(1e0)), parameter :: one=1.0e0
      real(kind(1e0)) :: a(m+1,n), b(m+1,1), x(n,1), y(m*n)
! Generate a random matrix.
      call rand_gen(y)
      a(1:m,1:n) = reshape(y,(/m,n/))
! Generate a random right-hand side.
      call rand_gen(b(1:m,1))
! Heavily weight desired constraint. All variables sum to one.
      a(m+1,1:n) = one/sqrt(epsilon(one))
      b(m+1,1) = one/sqrt(epsilon(one))
      call lin_sol_lsq(a, b, x)
      if (abs(sum(x) - one)/sum(abs(x)) <= &
          sqrt(epsilon(one))) then
         write (*,*) 'Example 4 for LIN_SOL_LSQ is correct.'
      end if
      end

F
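The weighting approach in this example can be written as a single augmented least-squares problem. The notation below is an assumption introduced for illustration: $e$ denotes the vector of all ones and $\omega = \varepsilon^{-1/2}$ the weight applied to the constraint row.

```latex
\min_x \left\|
  \begin{bmatrix} A \\ \omega e^T \end{bmatrix} x
  - \begin{bmatrix} b \\ \omega \end{bmatrix}
\right\|_2^2
= \min_x \left( \|Ax - b\|_2^2 + \omega^2 \,(e^T x - 1)^2 \right),
\qquad \omega = \varepsilon^{-1/2}
```

Because $\omega^2 = \varepsilon^{-1}$ is very large, any violation of $e^T x = 1$ dominates the objective, so the minimizer satisfies the constraint to roughly working precision while the remaining freedom is used to reduce $\|Ax - b\|_2$.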
180. DASPG. The default value is ATOL = 1E-2 for single precision and ATOL = 1D-4 for double precision.

IOPT(IO) = PDE_1D_MG_MAX_BDF_ORDER
IOPT(IO+1) = MAXBDF
Reset the maximum order for the BDF formulas used in DASPG. The default value is MAXBDF = 2. The new value can be any integer between 1 and 5. Some problems will benefit by making this change. We used the default value due to the fact that DASPG may cycle on its selection of order and step size with orders higher than value 2.

IOPT(IO) = PDE_1D_MG_REV_COMM_FACTOR_SOLVE
The calling program unit will solve the banded linear systems required in the stiff differential-algebraic equation integrator. Values of IDO = 8, 9 will occur only when this optional value is used.

IOPT(IO) = PDE_1D_MG_NO_NULLIFY_STACK
To maintain an efficient interface, the routine PDE_1D_MG collapses the subroutine call stack with CALL E1PSH('NULLIFY_STACK'). This implies that the overhead of maintaining the stack will be eliminated, which may be important with reverse communication. It does not eliminate error processing. However, precise information of which routines have errors will not be displayed. To see the full call chain, this option should be used. Following completion of the integration, stacking is turned back on with CALL E1POP('NULLIFY_STACK').

Remarks on the Examples

Due to its importance and the complexity of its interface, this subroutine is
181. an Loan 1989 Chapter 8 for more information Also see operator_ex28 Chapter 6 use lin_eig_self_int use rand_gen_int implicit none This is Example 4 for LIN_EIG_SELF integer i integer parameter n 64 real kind le0Q parameter one 1d0 real kind 1le0 b_sum real kind le0 dimension n n S n vb_d X ytemp n n Generate random self adjoint matrices call rand_gen ytemp A reshape ytemp n n A A transpose A call rand_gen ytemp B reshape ytemp n n B B transpose B b_sum sqrt sum abs B 2 n IMSL Fortran 90 MP Library 4 0 D n lambda n amp Chapter 2 Singular Value and Eigenvalue Decomposition 61 Add a scalar matrix so B is positive definite do i l n B i i B i i b_sum end do Get th igenvalues and eigenvectors for B call lin_eig_self B S v vb_d For full rank problems convert to an ordinary self adjoint problem All of thes xamples are full rank if S n gt epsilon one then D one sqrt S C spread D 2 n matmul transpose vb_d amp matmul A vb_d spread D 1 n Get th igenvalues and eigenvectors for C call lin_eig_self C lambda v X Compute the generalized eigenvectors X matmul vb_d spread D 2 n X Normalize the eigenvectors for the generalized problem X X spread one sqrt sum X 2 dim 2 1 n res matmul A X amp matmul B X
182. and Byron W. Howell (1992), The IMSL Error Handler for FORTRAN, Technical Report 9103, Visual Numerics, Inc., Houston, Texas.

Anderson et al.
Anderson, E. et al. (1995), LAPACK Users' Guide, SIAM Publications, Philadelphia, PA.

ANSI/IEEE
ANSI/IEEE Std 754-1985 (1985), IEEE Standard for Binary Floating-Point Arithmetic, IEEE Inc., New York.

Blackford et al.
Blackford, L. S. et al. (1997), ScaLAPACK Users' Guide, SIAM Publications, Philadelphia, PA.

Blom et al.
Blom, J. G., Zegeling, P. A. (1994), Algorithm 731: A Moving-Grid Interface for Systems of One-Dimensional Time-Dependent Partial Differential Equations, ACM Trans. Math. Soft., 20, 2, pages 194-214.

Boisvert, Howe, and Kahaner
Boisvert, R. E., S. E. Howe, D. K. Kahaner (1985), GAMS: A framework for the management of scientific software, ACM Transactions on Mathematical Software, 11, 313-355.

Brenan, Campbell, and Petzold
Brenan, K. E., S. L. Campbell, L. R. Petzold (1989), Numerical Solutions of Initial-Value Problems in Differential-Algebraic Equations, Elsevier Science Publishing Co., Inc., New York.

de Boor
de Boor, Carl (1978), A Practical Guide to Splines, Springer-Verlag, New York.

Fox, Hall, and Schryer
Fox, P. A., A. D. Hall, and N. L. Schryer (1978), Framework for a portable Fortran subroutine library: Machine-dependent constants, automatic error han
183. and V 29

Modules

Use the appropriate module, svd_int or linear_operators.

Optional Variables, Reserved Names

This function uses one of the routines lin_svd and lin_sol_svd. If a complete decomposition is required, lin_svd is used. If singular values only, or singular values and one of the right and left singular vectors are required, then lin_sol_svd is called.

The option and derived type names are given in the following tables:

Option Name for SVD        Option Value
options_for_lin_svd
options_for_lin_sol_svd
skip_error_processing

Derived Type    Name of Unallocated Array
s_options       s_svd_options
s_options       s_svd_options_once
d_options       d_svd_options
d_options       d_svd_options_once

Example

Compute the singular value decomposition of a random square matrix.

      A = rand(A); S = SVD(A, U=U, V=V)    ! A = U .x. diag(S) .xt. V

UNIT

Normalize the columns of a rank-2 or rank-3 array so each has Euclidean length of value one.

Required Arguments

The argument must be a rank-2 or rank-3 array of any intrinsic floating-point type. The output function value is an array of the same type and kind, where each column of each rank-2 principal section has Euclidean length of value one.

Modules

Use the appropriate one of the modules unit_int or linear_operators.

Optional Variables, Reserved Names

This function uses
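What UNIT computes for a rank-2 array can be illustrated in standard Fortran: divide each column by its Euclidean length. This is a sketch of the operation only, not the library function; the array sizes and names are assumptions, and zero-length columns are avoided here rather than handled.

```fortran
program unit_columns
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: m = 5, n = 3
  real(dp) :: a(m,n), u(m,n), col_norm(n)
  integer :: j

  ! Random entries shifted to [0.5, 1.5) so no column has zero length.
  call random_number(a)
  a = a + 0.5_dp

  ! Euclidean length of each column, then scale column by column.
  col_norm = sqrt(sum(a**2, dim=1))
  do j = 1, n
    u(:,j) = a(:,j) / col_norm(j)
  end do

  ! Every column of u should now have length one, to rounding error.
  if (maxval(abs(sqrt(sum(u**2, dim=1)) - 1.0_dp)) &
      <= sqrt(epsilon(1.0_dp))) then
    write (*,*) 'Every column has Euclidean length one.'
  end if
end program unit_columns
```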
184. arallel computing, deliver results to the root, and have been tested for correctness by validating small residuals or other first principles. Program names are parallel_exnn, where nn = 01, 02, ... The numerical digit part of the name matches the example number.

Parallel Examples 1-2 comments
These show the box data type used for solving several systems and then checking the results using matrix products and norms or other mathematical relationships. Note the first call to the function MP_SETUP() that initiates MPI. The call to the function MP_SETUP('Final') shuts down MPI and retrieves any error messages from the nodes. It is only here that error messages will print, in reverse node order, at the root node. Note that the results are checked for correctness at the root node. This is common to all the parallel examples.

Parallel Example 1
      use linear_operators
      use mpi_setup_int
      implicit none
! This is Parallel Example 1 for .ix., with box data types
! and functions.
      integer, parameter :: n=32, nr=4
      real(kind(1e0)) :: one=1e0
      real(kind(1e0)), dimension(n,n,nr) :: A, b, x, err(nr)
! Setup for MPI.
      MP_NPROCS = MP_SETUP()
! Generate random matrices for A and b:
      A = rand(A); b = rand(b)
! Compute the box solution matrix of Ax = b.
      x = A .ix. b
! Check the results.
      err = norm(b - (A .x. x))/(norm(A)*norm(x) + norm(b))
      if (ALL(err <= sqrt(epsilon(one))) .and. MP_RANK == 0) &
         write (*,*) 'Parallel Example 1 i
185. are available in two data types, single-precision real and double-precision real, and they are not part of ScaLAPACK. The matrices are distributed in a general block-column layout.

Contents
ScaLAPACK Supporting Modules
ScaLAPACK_READ
ScaLAPACK_WRITE
  Example 1: Distributed Transpose of a Matrix, In Place
  Example 2: Distributed Matrix Product with PBLAS
  Example 3: Distributed Linear Solver with ScaLAPACK
Parallel Constrained Least-Squares Solvers
PARALLEL_NONNEGATIVE_LSQ
  Example 1: Distributed Linear Inequality Constraint Solver
  Example 2: Distributed Non-negative Least-Squares
PARALLEL_BOUNDED_LSQ
  Example 1: Distributed Equality and Inequality Constraint Solver
  Example 2: Distributed Newton-Raphson Method with Step Control

ScaLAPACK Supporting Modules (MPI REQUIRED)
Module Name
We recommend that users needing routines from ScaLAPACK, PBLAS or BLACS, Version 1.4, use modules that describe the interface to individual codes. This practice, including use of th
186. Example 1: Electrodynamics Model
Example 2: Inviscid Flow on a Plate
Example 3: Population Dynamics
Example 4: A Model in Cylindrical Coordinates
Example 5: A Flame Propagation Model
Example 6: A Hot Spot Model
Example 7: Traveling Waves
Example 8: Black-Scholes
Example 9: Electrodynamics, Parameters Studied with MPI

Introduction
This chapter describes an algorithm and a corresponding integrator subroutine PDE_1D_MG for solving a system of partial differential equations

$$\frac{\partial u}{\partial t} \equiv u_t = f(u, x, t), \quad x_L < x < x_R, \quad t > t_0 \qquad \text{(Equation 1)}$$

This software is a one-dimensional solver. It requires initial and boundary conditions in addition to values of $u_t$. The integration method is noteworthy due to the maintenance of grid lines in the space variable, $x$. Details for choosing new grid lines are given in Blom and Zegeling (1994). The class of Equation 1 problems solved with PDE_1D_MG is expressed by equations
187. argument MP_NPROCS MP_SETUP CALL A_Name 0 Finalize MPI and print any error messages The programs STOP by default MP_NPROCS MP_SETUP Final END PROGRAM SUBROUTINE A_Name I Atlas se oyuhe akin nerates an error messag IMPLICIT NONE INTEGE I lt Il W I 0 THEN Push the name onto the stack CALL E1PSH A_Name Drop a value into the message CAMEE STE Prepare the message for printing CALL E1MES 4 1 amp The agument should be positive amp It now has value il1 Pop the name off the stack CALL E1POP A_Name Had an invalid argument so RETURN RETUR END IF END SUBROUTINI Gl Output for Example 1 Sees NIN GHRIROIRY IL eno seeing iL ieeiesi lt al eae mS COn EOM agument should be positive It now has value 0 FORWARD Calls Error Types and Codes ME Seer ue 0 0 A_Name 4 il T xxx FATAL ERROR 1 on rank 0 texas rd imsl com from A Name agument should be positive It now has value 0 FORWARD Calls Error Types and Codes MP USELUP 0 0 A_Name 4 il IMSL Fortran 90 MP Library 4 0 Chapter 9 Error Handling and Messages The Parallel Option 311 Example 2 This program is Example with a different message from each node The messages are gathered from the nodes and printed at the root program errpex2 USE MPI_SETUP_INT IMPLICIT NONE
188. ase is Brown s Almost Linear Problem Mor et al 1982 The components are given by ef 4 a n i jal ef x Siea 1 T The functions are zero at the point x 5 8 8 where 6 gt 1 is a particular root of the polynomial equation n n iS 1 0 To avoid convergence Aw T to the local minimum x 0 0 0 1 we start at the standard point IMSL Fortran 90 MP Library 4 0 Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers 257 c 1 EA os Sl Be Mi 2 and develop the Newton method using the linear terms f x y f x J x y 0 where J x is the Jacobian matrix The update is constrained so that the first n 1 components satisfy x y 21 2 or y lt x 1 2 The last component is bounded from both sides O lt x Y 1 2 or x gt Yn 2 x 1 2 These bounds avoid the local n minimum and allow us to replace the last equation by n x a 0 which is j l better scaled than the original The positive lower bound for x y is replaced by the strict bound EPSILON 1D0 the arithmetic precision which restricts the relative accuracy of x The input for routine PARALLEL_BOUNDED_LSQ expects each processor to obtain that part of J x it owns Those columns of the Jacobian matrix correspond to the partition given in the array IPART Here the columns of the matrix are evaluated in parallel on the nodes where they are required PROGRAM PBLSQ_EX2 Use Par
189. assigned 0. Normally users will give the partitioning to processor of rank = MP_RANK by setting IPART(1,MP_RANK+1) = first column index, and IPART(2,MP_RANK+1) = last column index. The number of columns per node is typically based on their relative computing power. To avoid a node with rank MP_RANK doing any work except communication, set IPART(1,MP_RANK+1) = 0 and IPART(2,MP_RANK+1) = -1. In this exceptional case there is no reference to the array A at that node.

NSETP (Output)
An INTEGER indicating the number of solution components not at constraints. The column indices are output in the array INDEX.

NSETZ (Output)
An INTEGER indicating the solution components held at fixed values. The column indices are output in the array INDEX.

Optional Argument
IOPT (Input)
Assumed-size array of derived type S_OPTIONS or D_OPTIONS. This argument is used to change internal parameters of the algorithm. Normally users will not be concerned about this argument, so they would not include it in the argument list for the routine.

Packaged Options for PARALLEL_BOUNDED_LSQ
Option Name                  Option Value
  PBLSQ_SET_TOLERANCE        1
  PBLSQ_SET_MAX_ITERATIONS   2
  PBLSQ_SET_MIN_RESIDUAL     3

IOPT(IO) = ?_OPTIONS(PBLSQ_SET_TOLERANCE
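The column partitioning described above can be sketched as a short stand-alone program. This is only an illustration under stated assumptions: it splits the columns evenly, whereas the text notes that the split is typically weighted by relative computing power, and the values of N and NPROCS, the program name, and the local variable names are hypothetical. It does not call the library or MPI.

```fortran
program ipart_sketch
  ! Stand-alone illustration only; it does not call the library or MPI.
  ! N, NPROCS and the even split are assumptions made for this sketch.
  implicit none
  integer, parameter :: n = 10        ! total number of columns (hypothetical)
  integer, parameter :: nprocs = 4    ! number of processes (hypothetical)
  integer :: ipart(2, nprocs)
  integer :: rank, ncols, first

  first = 1
  do rank = 0, nprocs - 1
     ! Every process gets n/nprocs columns; the first mod(n,nprocs) get one more.
     ncols = n / nprocs
     if (rank < mod(n, nprocs)) ncols = ncols + 1
     if (ncols > 0) then
        ipart(1, rank + 1) = first
        ipart(2, rank + 1) = first + ncols - 1
        first = first + ncols
     else
        ! Idle process: communication only, no columns assigned.
        ipart(1, rank + 1) = 0
        ipart(2, rank + 1) = -1
     end if
  end do

  do rank = 0, nprocs - 1
     print '(a,i0,a,i0,a,i0)', 'rank ', rank, ': columns ', &
          ipart(1, rank + 1), ' to ', ipart(2, rank + 1)
  end do
end program ipart_sketch
```

With the values shown, ranks 0 through 3 receive columns 1 to 3, 4 to 6, 7 to 8 and 9 to 10. If NPROCS exceeds N, the surplus ranks receive the idle markers 0 and -1.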
190. atal, and Terminal Error Messages
See the messages.gls file for error messages for lin_sol_lsq. These error messages are numbered 241-256; 261-276; 281-296; 301-316.

lin_sol_svd
Solves a rectangular least-squares system of linear equations $Ax \cong b$ using singular value decomposition, $A = USV^T$. With optional arguments, any of several related computations can be performed. These extra tasks include computing the rank of A, the orthogonal m × m and n × n matrices U and V, and the m × n diagonal matrix of singular values, S.

Required Arguments
A (Input/Output)
Array of size m × n containing the matrix.
b (Input/Output)
Array of size m × nb containing the right-hand side matrix.
x (Output)
Array of size n × nb containing the solution matrix.

Example 1: Least-squares solution of a Rectangular System
The least-squares solution of a rectangular m × n system $Ax \cong b$ is obtained. The use of lin_sol_lsq is more efficient in this case since the matrix is of full rank. This example anticipates a problem where the matrix A is poorly conditioned or not of full rank; thus, lin_sol_svd is the appropriate routine. Also, see operator_ex13, Chapter 6.

      use lin_sol_svd_int
      use rand_gen_int
      implicit none
! This is Example 1 for LIN_SOL_SVD.
      integer, parameter :: m=128, n=32
      real(kind(1d0)), parameter :: one=1d0
      real(kind(1d
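For orientation, the least-squares solution that an SVD provides can be written out explicitly. The block below is the standard textbook formula, not a statement of how lin_sol_svd is coded; the rank cutoff $r$ is an assumption of this sketch, since the tolerance the routine uses to decide rank is not given in this passage.

```latex
% Thin SVD of the m x n matrix A, with singular values s_1 >= ... >= s_n >= 0
A = U S V^{T}, \qquad U^{T}U = I, \quad V^{T}V = I .

% Minimum-length least-squares solution, keeping the r singular values
% treated as nonzero (r = rank of A; the cutoff is an assumption here)
x = \sum_{i=1}^{r} \frac{u_i^{T} b}{s_i}\, v_i ,

% where u_i and v_i are the i-th columns of U and V.
% The residual is the part of b outside the retained range:
\| A x - b \|_2^{2} = \| b \|_2^{2} - \sum_{i=1}^{r} \left( u_i^{T} b \right)^{2} .
```

When A has full rank, $r = n$ and this agrees with the solution a QR-based solver would return, which is why the text calls lin_sol_lsq the more efficient choice in that case.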
191. ate the samples by obtaining uniform samples $u$, $0 < u < 1$, and solve the equation $q(x) - u = 0$, $-\pi < x < \pi$. These are evaluated in vector form, that is all entries at one time, using Newton's method:

$$x \leftarrow x - dx, \qquad dx = (q(x) - u)/p(x)$$

An iteration counter forces the loop to terminate, but this is not often required although it is an important detail.

      use rand_gen_int
      use show_int
      use Numerical_Libraries
      IMPLICIT NONE
! This is Example 4 for RAND_GEN.
      integer i, i_map, k
      integer, parameter :: n_bins=36
      integer, parameter :: offset=18
      integer, parameter :: n_samples=10000
      integer, parameter :: n_samples_30=30
      integer, parameter :: COUNT=15
      real(kind(1e0)) probabilities(n_bins)
      real(kind(1e0)), dimension(n_bins) :: counts=0.0
      real(kind(1e0)), dimension(n_samples) :: rn, x, f, fprime, dx
      real(kind(1e0)), dimension(n_samples_30) :: rn_30, &
         x_30, f_30, fprime_30, dx_30
      real(kind(1e0)), parameter :: one=1e0, zero=0e0, half=0.5e0
      real(kind(1e0)), parameter :: tolerance=0.01
      real(kind(1e0)) two_pi, omega
! Initialize values of 'two_pi' and 'omega'.
      two_pi = 2.0*const('pi')
      omega = two_pi/n_bins
! Compute the probabilities for each bin according to
! the probability density (cos(x)+1)/(2*pi), -pi<x<pi.
      do i=1, n_bins
         probabilities(i) = (sin(omega*(i-offset)) &
            -sin(omega*(i-offset-1))+omega)/tw
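The vectorized Newton iteration described above can be tried in a stand-alone sketch that uses only standard Fortran intrinsics in place of the library's generator. Several things here are assumptions of the sketch rather than the manual's code: the starting point x = 0, the double-precision kind, the tolerance and iteration limit, and the explicit cumulative distribution q(x) = (x + sin(x) + pi)/(2 pi), which is obtained by integrating the density p(x) = (cos(x) + 1)/(2 pi) from -pi to x.

```fortran
program cosine_sampling_sketch
  ! Illustration only: samples from the density p(x) = (1 + cos(x))/(2*pi)
  ! on (-pi, pi) by solving q(x) - u = 0 with Newton's method, all entries
  ! at once. Names, tolerance and the start x = 0 are assumptions.
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n_samples = 1000, max_iter = 50
  real(dp), parameter :: tolerance = 1.0e-10_dp
  real(dp) :: pi, u(n_samples), x(n_samples), dx(n_samples)
  integer :: k

  pi = 4.0_dp*atan(1.0_dp)
  call random_number(u)        ! uniform samples, 0 <= u < 1
  x = 0.0_dp                   ! start every Newton iteration at x = 0

  do k = 1, max_iter
     ! dx = (q(x) - u)/p(x); the common factor 1/(2*pi) is cancelled.
     dx = (x + sin(x) + pi - 2.0_dp*pi*u)/(1.0_dp + cos(x))
     x = x - dx
     ! The counter bounds the loop even if the tolerance is never met.
     if (maxval(abs(dx)) <= tolerance) exit
  end do

  print *, 'iterations used:', min(k, max_iter)
  print *, 'max |q(x) - u| :', &
       maxval(abs((x + sin(x) + pi)/(2.0_dp*pi) - u))
end program cosine_sampling_sketch
```

Starting from x = 0 the iterates approach each root monotonically, because q is concave for x > 0 and convex for x < 0, so the denominator stays positive. Convergence slows for samples with u very close to 0 or 1, where p(x) is nearly zero, which is one reason the iteration counter matters.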
192. ation. Partial pivoting is numerically stable but is likely to be less efficient than cyclic reduction.

Example 2: Iterative Refinement and Use of Partial Pivoting
This program unit shows usage that typically gives acceptable accuracy for a large class of problems. Our goal is to use the efficient cyclic reduction algorithm when possible, and keep on using it unless it will fail. In exceptional cases our program switches to the LU factorization with partial pivoting. This use of both factorization and solution methods enhances reliability and maintains efficiency on the average. Also, see operator_ex18, Chapter 6.

      use lin_sol_tri_int
      use rand_gen_int
      implicit none
! This is Example 2 for LIN_SOL_TRI.
      integer i, nopt
      integer, parameter :: n=128
      real(kind(1e0)), parameter :: s_one=1e0, s_zero=0e0
      real(kind(1d0)), parameter :: d_one=1d0, d_zero=0d0
      real(kind(1e0)), dimension(2*n,n) :: d, b, c, res(n,n), &
         x, y
      real(kind(1e0)) change_new, change_old, err
      type(s_options) :: iopt(2) = s_options(0,s_zero)
      real(kind(1d0)), dimension(n,n) :: d_save, b_save, c_save, &
         x_save, y_save, x_sol
      logical solve_only
      c = s_zero; d = s_zero; b = s_zero; x = s_zero
! Generate the upper, main, and lower diagonals of the
! matrices A. A random vector x is used to construct the
! right-hand sides: y = A*x.
      do i = 1, n
         call rand_gen
         call rand_gen
         call rand_gen
         call rand_gen
193. ave covered the sphere twice by allowing 116 Chapter 4 Curve and Surface Fitting with Splines IMSL Fortran 90 MP Library 4 0 ASuUuST for latitude We solve three data fitting problems one for each coordinate function Periodic constraints on the value of the spline are used for both u and v We could reduce the computational effort by fitting a spline function in one variable for the z coordinate To illustrate the representation of more general surfaces than spheres we did not do this When the surface is evaluated we compute latitude moving from the South Pole to the North Pole H lt S2u lt cat Our surface will approximately satisfy the equality x y Z a These residuals are checked at a rectangular mesh of latitude and longitude pairs To illustrate the use of some options we have reset the three regularization parameters to the value zero the least squares system tolerance to a smaller value than the default and obtained the residuals for each parametric coordinate function at the data points USE surface_fitting_int USE rand_int USE norm_int USE Numerical_Libraries implicit none This is Example 2 for SURFACE_FITTING tensor product B splines approximation Fit x y z parametric functions for points on the surface of a sphere of radius A Random values of latitude and longitude are used to generat data The functions are evaluated at a rectangular grid in latitude and longitude and checke
194. ave data 1 4 as input iopt IO _options surface_fitting_print dummy This option prints the knots or breakpoints for x and y and the count of data points in cell processing order The default is to not print these arrays iopt IO _options surface_fitting_thinness _value This resets the square root of the regularizing parameter multiplying the squared integral of the second partial derivatives of the unknown function The argument _value is replaced by the default value The default is 3 P _value 10 x size where size L data 3 data 4 V ndata 1 Description The coefficients are obtained by solving a least squares system of linear algebraic equations subject to linear equality and inequality constraints The system is the result of the weighted data equations and regularization If there are no constraints the solution is computed using a banded least squares solver Details are found in Hanson 1995 Additional Examples Example 2 Parametric Representation of a Sphere From Struik 1961 the parametric representation of points x y z on the surface of a sphere of radius a gt 0 is expressed in terms of spherical coordinates x u v acos u cos v NMS2u lt st y u v acos u sin v t lt vs z u v a sin u The parameters are radians of latitude u and longitude v The example program fits the same ndata random pairs of latitude and longitude in each coordinate We h
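The parametric representation of the sphere and the identity used to check the fitted surface can be written out as a short worked equation. This is standard geometry rather than a quotation from the manual; the parameter ranges shown are the usual latitude and longitude ranges and are an assumption where the source text is unreadable.

```latex
% Spherical coordinates: latitude u, longitude v, radius a > 0
x(u,v) = a \cos u \, \cos v, \qquad
y(u,v) = a \cos u \, \sin v, \qquad
z(u,v) = a \sin u,
\qquad -\tfrac{\pi}{2} \le u \le \tfrac{\pi}{2}, \quad -\pi \le v \le \pi .

% Every such point lies on the sphere of radius a:
x^{2} + y^{2} + z^{2}
  = a^{2} \cos^{2} u \,(\cos^{2} v + \sin^{2} v) + a^{2} \sin^{2} u
  = a^{2} .

% A fitted surface (\hat{x}, \hat{y}, \hat{z}) can therefore be checked
% through the residual
r(u,v) = \hat{x}(u,v)^{2} + \hat{y}(u,v)^{2} + \hat{z}(u,v)^{2} - a^{2} ,
% which should be small at every grid point.
```

Because z depends on u alone, a one-variable spline would suffice for that coordinate; fitting all three coordinates as functions of both u and v keeps the approach applicable to more general surfaces.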
195. be used to obtain the upper-triangular or upper trapezoidal matrix R. If this optional argument, with keyword R=, is present, the decomposition is complete. The array output contains the matrix Q. If the first argument is rank-3, the output array and the optional argument are rank-3.

Modules
Use the appropriate one of the modules orth_int or linear_operators.

Optional Variables, Reserved Names
The option and derived type names are given in the following tables.

Option Name for ORTH         Option Value
  skip_error_processing      5

Derived Type      Name of Unallocated Array
  s_options       s_orth_options
  s_options       s_orth_options_once
  d_options       d_orth_options
  d_options       d_orth_options_once

Example
Compute the scaled sample variances, v, of an m × n linear least squares system (m > n), $Ax \cong b$:
      Q = ORTH(A, R=R); G = .i. R; x = G .x. (Q .hx. b)
      v = DIAGONALS(G .xh. G); v = v*sum((b - (A .x. x))**2)/(m-n)

RAND
Compute a scalar, rank-1, rank-2 or rank-3 array of random numbers. Each component number is positive and strictly less than one in value.

Required Arguments
The argument must be a scalar, rank-1, rank-2, or rank-3 array of any intrinsic floating-point type. The output function value matches the required argument in type, kind and rank. For complex arg
196. block times the part of the dual corresponding to that part of the partition Y ZERO DO J IPART 1 MP_RANK 1 IPART 2 MP_RANK 1 JSHIFT J IPART 1 MP_RANK 1 1 Y Y ASAVE JSHIFT X J END DO Accumulate the pieces from all the processors Put sum into B on rank 0 processor B Y IF MP_NPROCS gt 1 amp 248 Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers IMSL Fortran 90 MP Library 4 0 IMSL Fortran 90 MP Library 4 0 CALL MPI_REDUCE Y B M MPI_DOUBLE __PRECISION amp MEN SUM iFaaO P_LIBRARY_WORLD IERROR IF MP_RANK 0 THEN Compute constrained solution at the root The constraints will All of thes B M a IF MP_NPROCS gt 1 amp CALL MPI_BCAST B M MPI_DOUBLE Vi D IF have no solution if B M ON xample problems have solutions B M ONE B B B M pg inequality constraint solution to all nodes MP_LIBRARY_WORLD TERROR PRECI SION amp For large problems this printing needs to be removed IF MP_RANK and PRINT amp CALL SHOW B 1 NP amp Minimal length solution of the constraints Compute residuals of the individual constraints Iie Oily ic X ZE DO J IPART 1 MP_RANK 1 IPART 2 MP_RANK 1 This cleans up residuals t error unit RO JSHIFT J IPART 1 MP_RANK 1 1 X J dot_product B ASAVE JSHIFT END DO he solu
197. brary MPI communicator, MP_LIBRARY_WORLD. See Chapter 6, Parallelism Using MPI.

PARALLEL_NONNEGATIVE_LSQ
Solve a linear, non-negative constrained least-squares system.

Usage Notes
      CALL PARALLEL_NONNEGATIVE_LSQ &
         (A, B, X, RNORM, W, INDEX, IPART, IOPT = IOPT)

Required Arguments
A(1:M,:) (Input/Output)
Columns of the matrix with limits given by entries in the array IPART(1:2,1:max(1,MP_NPROCS)). On output A is replaced by the product QA, where Q is an orthogonal matrix. The value SIZE(A,1) defines the value of M. Each processor starts and exits with its piece of the partitioned matrix.

B(1:M) (Input/Output)
Assumed-size array of length M containing the right-hand side vector, b. On output b is replaced by the product Qb, where Q is the orthogonal matrix applied to A. All processors in the communicator start and exit with the same vector.

X(1:N) (Output)
Assumed-size array of length N containing the solution, $x \ge 0$. The value SIZE(X) defines the value of N. All processors exit with the same vector.

RNORM (Output)
Scalar that contains the Euclidean or least-squares length of the residual vector, $\|Ax - b\|$. All processors exit with the same value.

W(1:N) (Output)
Assumed-size array of length N containing the dual vector, $w = A^T(b - Ax) \le 0$. All processors exit wit
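The roles of the outputs X and W can be summarized by the standard optimality conditions for non-negative least squares. These are the textbook conditions, given here as a reading aid; the passage above states only the sign conditions on x and w, so the complementarity line is an addition of this sketch.

```latex
% Problem: minimize ||A x - b||_2 subject to x >= 0.
% Dual vector, as defined in the description of W:
w = A^{T} (b - A x) .

% A vector x solves the problem exactly when
x \ge 0, \qquad w \le 0, \qquad x_j \, w_j = 0 \quad (j = 1, \ldots, N) .

% Hence w_j = 0 for components with x_j > 0 (not at the constraint),
% and w_j <= 0 for components held at x_j = 0.
```

Checking these three conditions on the returned X and W is a cheap way to validate a computed solution independently of the solver.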
198. c m n d m 1 cov n n x n 1 amp res n 1 y m n type s_options iopti 1l s_options 0 zero Generate a random rectangular matrix and a random right hand side call rand_gen y c reshape y m n call rand_gen d l n 1 Form the normal equations for the rectangular system IMSL Fortran 90 MP Library 4 0 Chapter 1 Linear Solvers 13 a matmul transpose c c b matmul transpose c d Use packaged option to use Cholesky decomposition iopti l s_options s_lin_sol_self_Use_Cholesky zero Compute the solution of Ax b with optional inverse obtained call lin_sol_self a b x ainv cov amp iopt iopti Compute residuals x inverse b for consistency check res x matmul cov b Scale the inverse to obtain the covariance matrix cov sum d matmul c x 2 m n cov Check the results err sum abs res sum abs cov if err lt sqrt epsilon one then write Example 2 for LIN_SOL_SELF is correct end if end Example 3 Using Inverse Iteration for an Eigenvector This example illustrates the use of the optional argument iopt to reset the value of a Small diagonal term encountered during the factorization Eigenvalues of the self adjoint matrix A C C are computed using the routine 1in_eig_self An eigenvector corresponding to one of these eigenvalues A is computed using inverse iteration This solves the near s
199. ce 119 convolution with Fourier Transform 84 cross validation with weighting 54 cyclical 2D data with a linear trend 88 cyclical data with a linear trend 82 eigenvalue eigenvector expansion of a square matrix 58 evaluating the matrix exponential 6 7 Generalized Singular Value Decomposition 52 generating strategy with a histogram 130 generating with a Cosine distribution 132 internal write of an array 139 iterative refinement and use of partial pivoting 38 Laplace transform solution 31 larger data uncertainty 76 least squares with an equality constraint 25 least squares solution of a rectangular system 27 linear least squares with a quadratic constraint 50 matrix inversion and determinant 1 5 natural cubic spline interpolation to data 101 parametric representation of a sphere 116 periodic curves 108 polar decomposition of a square matrix 29 printing an array 137 reduction of an array of black and white 30 ridge regression 54 running mean and variance 126 seeding using and restoring the generator 129 selected eigenvectors of tridiagonal matrices 40 IMSL Fortran 90 MP Library self adjoint positive definite generalized eigenvalue problem 74 several 2D transforms with initialization 90 several transforms with initialization 83 shaping a curve and its derivatives 104 solution of multiple tridiagonal systems 35 solving a linear least squares system of equations 9 18 solving a li
200. ce by the precision of the arguments. The data array and the derived type, ?_spline_knots, are required arguments. The array of derived type ?_spline_constraints is an optional argument.

Two-Dimensional Smoothing Check List
For two-dimensional smoothing, users should follow the check list below:

1. Choose the degree of the piece-wise polynomials (tensor-product spline function) and their knots in both independent variables. The degree of the spline must be the same in both dimensions. Use the IMSL DNFL derived type s_spline_knots or d_spline_knots to define this data for use as an argument to the fitting routine. Note that this derived type is also used for the one-dimensional problem, but for two-dimensional problems separate arguments are needed in each dimension.

2. Choose the regularization parameters and constraints that the tensor-product spline function must satisfy. Values of the regularization parameters are passed to the fitting routine using the derived type s_options or d_options. Of particular importance for obtaining pleasing results is the need to vary the parameters thinness and, occasionally, flatness or smallness, appearing in the least-squares model.

3. Use the generic derived type function surface_constraints for specifying optional constraint information for the fitting routine. This derived type is discussed below.

4. Define the data values to be fit. These are quadruples consisting of pairs of independent and single depend
201. cients GSM silk W Generate an equally spaced grid on the interval delta_x 2 real m 1 kind one do nrack 1 nr 25 i auae one i delta_x i 0 m 1 enddo Evaluate residuals using backward recurrence formulas u zero v zero do nrack 1 nr clo tS Of i Vere r hrack 27212 rec Aude By nra eki G wile perpara ar Cal arac vlere p reek Anew Soiree LCS Brak H wl fyirack end do enddo Compute residuals at the grid Vi OXON COS N PEOVER UVa Check that n l sign changes in the residual curve occur See x one x sign x y ie Cora es email As YS se Aeins lp Sa ioe Ee if mp_rank 0 amp write Parallel Example 10 is correct end if to any error messages and exit MPI MP_NPROCS MP_SETUP Final end Parallel Example 11 In this example a single problem is elevated by using the box data type with one rack The function call MP_SETUP M may take longer to compute than the computation of the generalized inverse which follows Other methods for determining the node priority order perhaps based on specific knowledge of the network environment may be better suited for this application This example requires two nodes to execute use linear_operators use mpi_setup_int use Numerical_Libraries only DCONST implicit none This is Parallel Example 11 using a priority order with only the fastest alternate node working
202. cision copies of the matrix and right-hand side.
      d = a
      c = b
! Start solution at zero.
      y = d_zero
      change_old = huge(one)
! Use packaged option to save the factorization.
      iopti(1) = s_options(s_lin_sol_gen_save_LU,zero)
      iterative_refinement: do
         b = c - matmul(d,y)
         call lin_sol_gen(a, b, x, &
            pivots=ipivots, iopt=iopti)
         y = x + y
         change_new = sum(abs(x))
! Exit when changes are no longer decreasing.
         if (change_new >= change_old) &
            exit iterative_refinement
         change_old = change_new
! Use option to re-enter code with factorization saved; solve only.
         iopti(2) = s_options(s_lin_sol_gen_solve_A,zero)
      end do iterative_refinement
      write (*,*) 'Example 3 for LIN_SOL_GEN is correct.'
      end

Example 4: Evaluating the Matrix Exponential
This example computes the solution of the ordinary differential equation problem

$$\frac{dy}{dt} = Ay$$

with initial values $y(0) = y_0$. For this example, the matrix A is real and constant with respect to $t$. The unique solution is given by the matrix exponential:

$$y(t) = e^{At} y_0$$

This method of solution uses an eigenvalue-eigenvector decomposition of the matrix, $A = XDX^{-1}$, to evaluate the solution with the equivalent formula

$$y(t) = X e^{Dt} z_0$$

where $z_0 = X^{-1} y_0$ is computed using the complex arithmetic version of lin_sol_gen. The results for $y(t)$ are real quantities, but the evaluation uses intermed
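The step from the matrix exponential to the eigenvalue form is short enough to write out. The derivation below is standard and assumes that A is diagonalizable, which the eigenvalue-eigenvector decomposition in the text presupposes; the symbols $d_i$ for the diagonal entries of D are introduced here for illustration.

```latex
% Powers of A follow from the decomposition A = X D X^{-1}:
A^{k} = X D^{k} X^{-1}, \qquad k = 0, 1, 2, \ldots

% Substituting into the power series of the exponential:
e^{At} = \sum_{k=0}^{\infty} \frac{t^{k}}{k!} A^{k}
       = X \left( \sum_{k=0}^{\infty} \frac{t^{k}}{k!} D^{k} \right) X^{-1}
       = X e^{Dt} X^{-1},
\qquad e^{Dt} = \operatorname{diag}\left( e^{d_1 t}, \ldots, e^{d_n t} \right) .

% Applying this to the initial values:
y(t) = e^{At} y_0 = X e^{Dt} \left( X^{-1} y_0 \right) = X e^{Dt} z_0,
\qquad X z_0 = y_0 .
```

Only one linear solve, $X z_0 = y_0$, is needed regardless of how many values of $t$ are evaluated, since $e^{Dt}$ is a diagonal scaling.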
203. coordinate functions are discontinuous The value of this representation is that for each the splines representing x t y t are points on the perimeter of the box This eases the complexity of evaluating the edge of the box This example illustrates a method for representing the edge of a domain in two dimensions bounded by a periodic curve use spline_fitting_int use norm_int 108 Chapter 4 Curve and Surface Fitting with Splines IMSL Fortran 90 MP Library 4 0 implicit none This is Example 4 for SPLINE_FITTING Use piecewise linear splines to represent the perimeter of a rectangular box integer i j integer parameter nbkpt 9 nord 2 ndegree nord 1l amp ncoeff nbkpt nord ndata 7 ngrid 100 amp nvalues ndata 1 ngrid real kind le0 parameter zero 0e0 one 1le0 real kind le0O parameter delta_t one delta_b one delta_v 0 01 real kind le0 delta_x delta_y real kind le0 dimension ndata sddata one amp These are redundant coordinates on the edge of the box xdata 0 0 1 0 2 0 2 0 1 0 0 0 0 0 amp ydata 0 0 0 0 0 0 1 0 1 0 1 0 0 0 real kind le0 tdata ndata xspline_data 3 ndata amp yspline_data 3 ndata tvalues nvalues amp xvalues nvalues yvalues nvalues xcoeff ncoeff amp ycoeff ncoeff xcheck nvalues ycheck nvalues diff real kind le0 target bkpt nbkpt real kind le0 pointer pointer_bkp
204. ction is likely to be required whenever there is the possibility that a subroutine blocked the output with NaNs in the presence of an error condition.

Required Arguments
The argument can be a scalar or array of rank-1, rank-2 or rank-3. The output value tests .true. only if there is at least one NaN in the scalar or array. The values can be any of the four intrinsic floating-point types.

Modules
Use one of the modules isNaN_int or linear_operators.

Optional Variables, Reserved Names
This function has neither packaged optional variables nor reserved names.

Example
If there is not a NaN in an array A it is used to solve a linear system:
      if (.not. isNaN(A)) x = A .ix. b

NaN
Returns, as a scalar function, a value corresponding to the IEEE 754 Standard format of floating point (ANSI/IEEE 1985) for NaN. For other floating point formats a special pattern is returned that tests .true. using the function isNaN.

Required Arguments
x (Input)
Scalar value of the same type and precision as the desired result, NaN. This input value is used only to match the type of output.

Example 1: Blocking Output
Arrays are assigned all NaN values, using single and double-precision formats. These are tested using the logical function routine, isNaN.

      use isnan_int
      implicit none
! This is Example 1 for NaN.
      integer, parameter :: n=3
205. ctor.
      p = matmul(u_d,transpose(v_d))
! Compute the right self-adjoint factor.
      q = matmul(v_d*spread(s_d,1,n),transpose(v_d))
      ident = zero
      do i=1, n
         ident(i,i) = one
      end do
! Check the results.
      if (sum(abs(matmul(p,transpose(p)) - ident))/sum(abs(p)) &
         <= sqrt(epsilon(one))) then
         if (sum(abs(a - matmul(p,q)))/sum(abs(a)) &
            <= sqrt(epsilon(one))) then
            write (*,*) 'Example 2 for LIN_SOL_SVD is correct.'
         end if
      end if
      end

Example 3: Reduction of an Array of Black and White
An n × n array A contains entries that are either 0 or 1. The entry is chosen so that as a two-dimensional object with origin at the point (1, 1), the array appears as a black circle of radius n/4 centered at the point (n/2, n/2). A singular value decomposition $A = USV^T$ is computed, where S is of low rank. Approximations using fewer of these nonzero singular values and vectors suffice to reconstruct A. Also, see operator_ex15, Chapter 6.

      use lin_sol_svd_int
      use rand_gen_int
      use error_option_packet
      implicit none
! This is Example 3 for LIN_SOL_SVD.
      integer i, j, k
      integer, parameter :: n=32
      real(kind(1e0)), parameter :: half=0.5e0, one=1e0, zero=0e0
      real(kind(1e0)) a(n,n), b(n,0), x(n,0), s(n), u(n,n), &
         v(n,n), c(n,n)
! Fill in value one for points inside the circle.
      a = zero
      do i=1, n
         do j=1, n
            if ((i-n/2)**2 + (j-n/2)**2 <= (n/4
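The reconstruction from fewer singular values that Example 3 describes has a compact standard form. The block below states it as a sketch; the truncation index $k$ and the notation $A_k$ are introduced here for illustration and do not appear in the example.

```latex
% Singular value decomposition with s_1 >= s_2 >= ... >= s_n >= 0,
% u_i and v_i the i-th columns of U and V:
A = U S V^{T} = \sum_{i=1}^{n} s_i \, u_i v_i^{T} .

% Approximation that keeps only the k largest singular values:
A_k = \sum_{i=1}^{k} s_i \, u_i v_i^{T} .

% The error in the Frobenius norm is given by the discarded values:
\| A - A_k \|_F^{2} = \sum_{i=k+1}^{n} s_i^{2} .
```

If only $r$ singular values are nonzero, then $A_r = A$ exactly, so the 0/1 array is recovered from $r$ vector pairs instead of $n$. For a 0/1 image, rounding the entries of $A_k$ to the nearer of 0 and 1 may recover A with even fewer terms.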
206. d $BP^T = HR$. The orthogonal matrix H is the product of n − 1 row permutations, each followed by a Householder transformation. Column permutations, P, are chosen at each step to maximize the Euclidean length of the pivot column. The matrix R is upper triangular. Using the default tolerance $\tau = \epsilon\|B\|$, where $\epsilon$ is machine relative precision, each diagonal entry of R exceeds $\tau$ in value. Otherwise, R is singular. In that case A and B are interchanged and the orthogonal decomposition is computed one more time. If both matrices are singular the problem is declared singular and is not solved. The interchange of A and B is accounted for in the output diagonal matrices $\alpha$ and $\beta$. The ordinary eigenvalue problem is $Cx = \lambda x$, where $C = H^T A P^T R^{-1}$ and $RPv = x$.

If the matrices A and B are self-adjoint and if, in addition, B is positive-definite, then a more efficient reduction than the default algorithm can be optionally used to solve the problem: A Cholesky decomposition is obtained, $R^T R = PBP^T$. The matrix R is upper triangular and P is a permutation matrix. This is equivalent to the ordinary self-adjoint eigenvalue problem $Cx = \lambda x$, where $RPv = x$ and $C = R^{-T} P A P^T R^{-1}$. The self-adjoint eigenvalue problem is then solved.

Additional Examples
Example 2: Self-Adjoint, Positive-Definite Generalized Eigenvalue Problem
This example illustrates the use of optional flags for the special case where A and B are complex self-adjoint matrices, and B is positive-defini
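The reduction of the generalized problem to an ordinary self-adjoint one can be verified in a few lines. The derivation below is a sketch that uses the symbols of the passage and writes $R^{T}$ for the transpose; for complex self-adjoint matrices the conjugate transpose takes its place, which is an assumption about notation rather than something stated in the text.

```latex
% Generalized problem and pivoted Cholesky factorization of B:
A v = \lambda B v, \qquad R^{T} R = P B P^{T}
\;\Longrightarrow\; B = P^{T} R^{T} R \, P .

% Change of variables x = R P v, so that v = P^{T} R^{-1} x:
A P^{T} R^{-1} x = \lambda \, P^{T} R^{T} R \, P \, P^{T} R^{-1} x
                 = \lambda \, P^{T} R^{T} x .

% Multiply on the left by R^{-T} P (using P P^{T} = I):
\underbrace{R^{-T} P A P^{T} R^{-1}}_{C} \, x = \lambda x .

% C is self-adjoint whenever A is, so its eigenvalues are real, and the
% eigenvectors of the original problem are recovered from v = P^{T} R^{-1} x.
```

The same change of variables with $BP^{T} = HR$ in place of the Cholesky factor gives the general reduction $C = H^{T} A P^{T} R^{-1}$, but that C is not self-adjoint in general.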
207. d 1l n Save double precision copies of the diagonals and the right hand side c_save c l n l n d_save d 1 n 1 n b_save b 1 n 1 n x_save x 1 n li in y_save l in 1l in d l n l in x_save amp c 1l n 1 n EOSHIFT x_save SHIFT 1 DIM 1 amp b 1l n 1 n EOSHIFT x_save SHIFT 1 DIM 1 Iterative refinement loop factorization_choice do nopt 0 1 Set the logical to flag the first time through solve_only fals x_sol d_zero change_old huge s_one iterative_refinement do This flag causes a copy of data to be moved to work arrays and a factorization and solve step to be performed if not solve_only then e l in 1 n c_save d l n 1 n d_save b 1l n 1 n b_save end if Compute current residuals y A x using current x y l in lin y_save amp d_save x_sol amp c_save EOSHIFT x_sol SHIFT 1 DIM 1 amp b_save EOSHIFT x_sol SHIFT 1 DIM 1 call lin_sol_tri c d b y iopt iopt x_sol x_sol y l1 in 1 n change_new sum abs y 1 n 1 n If size of change is not decreasing stop the iteration if change_new gt change_old exit iterative_refinement change_old change_new IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 187 iopt nopt 1 s_lin_sol_tri_solve_only solve_only tru end do iterative_refinement Use Gaussian Elimination if
208. d e amp inverse_out f Check the Convolution Theorem inverse transform a transform b convolution a b err maxval abs c f maxval abs c if err lt sqrt epsilon one then write Example 4 for FAST_DFT is correct end if end IMSL Fortran 90 MP Library 4 0 Chapter 3 Fourier Transforms 85 Fatal and Terminal Messages See the messages gls file for error messages for fast_dft These error mes sages are numbered 651 661 701 711 fast_2dft Computes the Discrete Fourier Transform 2DFT of a rank 2 complex array x Required Arguments No required arguments pairs of optional arguments are required These pairs are forward_in and forward_out or inverse _inand inverse_out Example 1 Transforming an Array of Random Complex Numbers An array of random complex numbers is obtained The transform of the numbers is inverted and the final results are compared with the input array use fast_2dft_int use rand_int implicit none This is Example 1 for FAST_2DFT integer parameter n 24 integer parameter m 40 real kind le0Q err one le0 complex kind le0 dimension n m a b c Generate a random complex sequence a rand a c a Transform and then invert the transform call c_fast_2dft forward_in a amp forward_out b call c_fast_2dft inverse_in b amp inverse_out a Check that inverse transform sequence sequenc err maxval
209. d in the call to the routine. The argument is an array of derived type s_error or d_error (see Chapter 5). The reasons for this design are described more fully in Hanson (1992). Primarily, using a separate array for each parallel call to a routine allows the user to summarize errors with the routine error_post in a non-parallel part of an application. Any number of parallel calls can then be made without danger of jumbling or mixing error messages.

Most users call IMSL Fortran 90 MP Library routines, but not in parallel. If they do not include the epack argument, error messages are printed within the routines. This is the same principle as for the Numerical Libraries.

When an error occurs with the argument epack used, but the array is too small to hold the information describing the error, the output is flooded (or blocked) with NaN (Not a Number; ANSI/IEEE 1985). Further computational use of the output may result in an unhandled exception from the processor. To test for NaN output, the calling program unit can check whether isNaN(floating_point_output) is .TRUE. (see the isNaN function, Chapter 6). The symbol floating_point_output stands for any scalar or array output of the routine.

For complete information on errors, include the argument epack in your program. This argument is used to pass message numbers, error severity level, and associated data to the error post p
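The NaN test described above can be illustrated without the library. The sketch below is an assumption-laden stand-in: it uses the standard intrinsic module ieee_arithmetic (Fortran 2003) rather than the library's own isNaN function, and the array named output is a hypothetical routine result into which a NaN is planted by hand.

```fortran
program nan_check_sketch
  use, intrinsic :: ieee_arithmetic, only: ieee_is_nan, ieee_value, ieee_quiet_nan
  implicit none
  real(kind(1d0)) :: output(3)

  output = (/ 1.0d0, 2.0d0, 3.0d0 /)
  ! Simulate a routine that flooded part of its output with NaN.
  output(2) = ieee_value(output(2), ieee_quiet_nan)

  ! ieee_is_nan is elemental, so ANY reduces the test to one logical value.
  if (any(ieee_is_nan(output))) then
     print *, 'NaN detected: do not use this output further'
  else
     print *, 'output is free of NaN'
  end if
end program nan_check_sketch
```

Testing the output before any further arithmetic is the point of the design: a NaN that is caught here never reaches an operation that could raise an unhandled exception.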
210. d to lie on the surface of the sphere integer i j integer parameter ngrid 6 nord 6 ndegree nord l amp nbkpt ngrid 2 ndegree ndata 1000 nvalues 50 NOPT 5 real kind 1d0 parameter zero 0d0 one 1d0 two 2d0 real kind 1d0 parameter TOLERANCE 1d 2 real kind 1d0 target spline_data 4 ndata 3 bkpt nbkpt amp coeff ngridtndegree 1 ngridtndegree 1 3 delta sizev amp pi A x nvalues y nvalues values nvalues nvalues amp data 4 ndata real kind 1d0 pointer pointer_bkpt type d_spline_knots knotsx knotsy type d_options OPTIONS NOPT Get the constant pi and a random radius gt 1 pi DCONST pi A one rand A Generate random latitude longitude pairs and evaluate the surface parameters at these points spline_data 1 2 1 pi two rand spline_data 1 2 1 one spline_data 1 2 2 spline_data 1 2 1 spline_data 1 2 3 spline_data 1 2 1 IMSL Fortran 90 MP Library 4 0 Chapter 4 Curve and Surface Fitting with Splines 117 Evaluate x y z parametric points spline_data 3 1 A cos spline_data 1 1 spline_data 3 2 A cos spline_data 1 2 spline_data 3 3 A sin spline_data 1 3 The values ar qually uncertain spline_data 4 one cos spline_data 2 1 sin spline_data 2 2 Define the knots for the tensor product data fitting problem delta two pi ngrid 1 bkpt
211. de computing the LU factorization of A using partial pivoting representing the determinant of A computing the inverse matrix A and solving A borAx b given the LU factorization of A lin_sol_lsq Solves a rectangular system of linear See Chapter 1 D9al equations D9c Ax b ina least squares sense Using optional arguments any of several related computations can be performed These extra tasks include computing and saving the factorization of A using column and row pivoting representing the determinant of A computing the generalized inverse matrix A or computing the least squares solution of Ax borA Ty d given the factorization of A An optional argument is provided for computing the following unscaled covariance matrix C A Tay A 2 e Appendix A List of Subprograms and GAMS Classification IMSL Fortran 90 MP Library 4 0 lin_sol_self lin_sol_svd lin_sol_tri lin_svd NaN parallel _ amp nonnegative_lsq parallel _ amp bounded_lsq IMSL Fortran 90 MP Library 4 0 Solves a system of linear equations Ax See Chapter 1 D2bla b where A is a self adjoint matrix Using D2b1b optional arguments any of several D2d1la related computations can be performed D2d1b These extra tasks include computing and saving the factorization of A using symmetric pivoting representing the determinant of A computing the inverse matrix A or computing the solution of Ax b given the factorization of A An op
212. delta,i=1,nvalues)/); y=x
      values=exp(-spread(x**2,1,nvalues)-spread(y**2,2,nvalues))
      values=surface_values((/0,0/), x, y, knotsx, knotsy, coeff)- &
             values
      f_00 = surface_values((/0,0/), zero, zero, knotsx, knotsy, coeff)
      f_x00= surface_values((/1,0/), zero, zero, knotsx, knotsy, coeff)
      f_y00= surface_values((/0,1/), zero, zero, knotsx, knotsy, coeff)
! Compute the R.M.S. error:
      sizev = norm(pack(values, (values == values)))/nvalues
      PASS = sizev <= TOLERANCE
      PASS = abs(f_00 - one) <= sqrt(epsilon(one)) .and. PASS
      PASS = f_x00 <= sqrt(epsilon(one)) .and. PASS
      PASS = f_y00 <= sqrt(epsilon(one)) .and. PASS
      if (PASS) then
         write(*,*) 'Example 3 for SURFACE_FITTING is correct.'
      end if
      end

Example 4: Constraining a Spline Surface to be non-Negative
The review of interpolating methods by Franke (1982) uses a test data set originally due to James Ferguson. We use this data set of 25 points, with unit uncertainty for each dependent variable. Our algorithm does not interpolate the data values but approximately fits them in the least-squares sense. We reset the regularization parameter values of flatness and thinness (Hanson, 1995). Then the surface is fit to the data and evaluated at a grid of points. Although the surface appears smooth and fits the data, the values are negative near one corner. Our scenario for the application assumes that the surface be non-negative at all points of the r
213. dgspg djspg goes It only goes in one place here iwk wk but can vary where divid iopt 1 call ium d differences ar in 27 math used for partial derivatives ag ichap iget 1 iopt ival Direct user select case 1 This should respons case ido 14 nok eecur r write 2p 4 stop Unexpected return with ido case 3 Reset options to defaults required for this problem ido This is good housekeeping but not in in call iumag math ichap iput 50 in ival inr inr call sumag math ichap iput 20 inr sval exit Integration_Loop case 5 Evaluate partials of g y y t_y y t_ypr ypr t_g r_diag t_y r_off EOSHIFT t_y SHIFT 1 amp EOSHIFT r_off t_y SHIFT 1 amp a_diag t_ypr a_off EOSHIFT t_ypr SHIFT 1 amp EOSHIFT a_off t_ypr SHIFT 1 Move data from assumed size to assumed shape arrays do i l n wk ival 1 i 1 end do cycle Integration_Loop t_g i case 6 Evaluate partials of g y y Get value of c_j for partials iopt 1 inr 9 call sumag math ichap iget 1 iopt Subtract c_j from diagonals to compute The linear system is tridiagonal t_diag l n 1 r_diag sval 1 a_diag IMSL Fortran 90 MP Library 4 0 sval partials for y c_ j Chapter 6 Operators and Generic Functions The Parallel Option 191 t_upper l1
214. distinguish more than one error condition of the same type. Note that the error code is printed with the message for all types.

Q-3: What are global errors?
A: Error types 6 and 7 are global in the sense that E1POP never decides to stop based on their occurrence. The function N1RGB(1) returns a 1 if processing should stop due to a global error. Also, N1RGB clears the global error indicators.

Q-4: Does E1MES actually print the message, or just store it?
A: All error messages are stored and printed (if the user desires) when the subprogram call stack returns to Level 1.

Q-5: To store an integer and a real number for use in a message, must unique positional index numbers be used?
A: No; for example:
122 456 0 Dy VSA e mora wenan SA A 3

Q-6: How do I disable an error state?
AEONGy UMTS O O j
Note that any of the settings can be changed. In the following example, the error type is reset to 5 and the other settings are left unchanged:
1 r Li x

Q-7: How long can the message be?
A: An expanded message will be truncated after 1,024 characters. For this reason, long variable-length items included by A1, L1, etc., should be placed at the end of the message.

Q-8: Why is it that when I call E1POS to turn off printing, and then call E1MES, it prints anyway?
A: When 1P0
215. ditional options, efficiency, and robustness.

Example 2: Complex Polynomial Equation Roots
The roots of a complex polynomial equation,
$f(z) \equiv \sum_{k=1}^{n} b_k z^{n-k} + z^n = 0,$
are required. This algebraic equation is formulated as a matrix eigenvalue problem. The equivalent matrix eigenvalue problem is solved using the upper Hessenberg matrix which has the value zero except in row number 1 and along the first subdiagonal. The entries in the first row are given by $a_{1,j} = -b_j$, $j = 1, \ldots, n$, while those on the first subdiagonal have the value one. This is a companion matrix for the polynomial. The results are checked by testing for small values of $|f(e_i)|$, $i = 1, \ldots, n$, at the eigenvalues of the matrix, which are the roots of f(z). Also, see operator_ex30, Chapter 6.
      use lin_eig_gen_int
      use rand_gen_int
      implicit none
! This is Example 2 for LIN_EIG_GEN.
      integer i
      integer, parameter :: n=12
      real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
      real(kind(1d0)) err, t(2*n)
      type(d_options) :: iopti(1)=d_options(0,zero)
      complex(kind(1d0)) a(n,n), b(n), e(n), f(n), fg(n)
      call rand_gen(t)
      b = cmplx(t(1:n),t(n+1:),kind(one))
! Define the companion matrix with polynomial coefficients
! in the first row.
      a = zero
      do i=2, n
         a(i,i-1) = one
      end do
      a(1,1:n) = - b
! Note that the input companion matrix is upper Hessenberg.
      iopti(1) = d_
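The companion-matrix formulation above can be checked by hand on a small case. The sketch below assumes the monic form f(z) = z**n + b_1 z**(n-1) + ... + b_n and uses a hypothetical cubic with known real roots 1, 2, 3. It calls no eigenvalue routine; it only verifies that a known root r and the vector (r**(n-1), ..., r, 1) satisfy A v = r v.

```fortran
program companion_sketch
  implicit none
  integer, parameter :: n = 3
  integer, parameter :: dp = kind(1d0)
  real(dp) :: a(n,n), b(n), v(n), r
  integer :: i

  ! f(z) = z**3 - 6 z**2 + 11 z - 6 = (z-1)(z-2)(z-3), so b = (-6, 11, -6).
  b = (/ -6.0_dp, 11.0_dp, -6.0_dp /)

  ! Companion matrix: ones on the first subdiagonal, -b in the first row.
  a = 0.0_dp
  do i = 2, n
     a(i,i-1) = 1.0_dp
  end do
  a(1,1:n) = -b

  ! For a root r, v = (r**(n-1), ..., r, 1) is an eigenvector with eigenvalue r.
  r = 2.0_dp
  v = (/ (r**(n-i), i = 1, n) /)

  print '(a,es10.2)', 'max |A v - r v| =', maxval(abs(matmul(a, v) - r*v))
end program companion_sketch
```

Rows 2 through n of A v = r v hold for any r by construction; only the first row encodes f(r) = 0, which is why the eigenvalues of the matrix are exactly the roots of the polynomial.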
216. dling and dynamic storage allocation using a stack, ACM Transactions on Mathematical Software, 4, 176-188.

Franke
Franke, Richard (1982), Scattered Data Interpolation: Tests of Some Methods, Mathematics of Computation, 37, 157, pages 181-200.

Fushimi
Fushimi, Masanori (1990), Random number generation with the recursion $X_t = X_{t-3p} \oplus X_{t-3q}$, Journal of Computational and Applied Mathematics, 31, 105-118.

Golub and Van Loan
Golub, Gene H., and Charles Van Loan (1989), Matrix Computations, 2d ed., Johns Hopkins University Press, Baltimore, Md.

Gropp, Lusk, and Skjellum
Gropp, William, Ewing "Rusty" Lusk, and Anthony "Tony" Skjellum (1994), Using MPI, MIT Press, Cambridge, MA.

Hanson
Hanson, R.J. (1992), A Design of High-Performance Fortran 90 Libraries, Technical Report 9201, Visual Numerics, Inc., Houston, Texas.
Hanson, R.J. (1995), Constrained B-Spline Surface Fitting to Discrete Data, Technical Report 9503, Visual Numerics, Inc., Houston, Texas.

Hanson and Krogh
Hanson, R.J., and F.T. Krogh (1981), Flexibility in mathematical software development using option arrays, ACM SIGNUM Newsletter, Special Issue, ACM.

Hanson et al.
Hanson, R.J., Art Belmonte, Richard Lehoucq, and Jackie Stolle (1991), Improved Performance of Certain Matrix Eigenvalue Computations for the IMSL MATH/LIBRARY, Technical Report 9007, Visual Numerics, Inc., Houston, Texas.

Henrici
Henrici
217. dual and the number of iterations are new features Example 1 Distributed Equality and Inequality Constraint Solver The program PBLSQ_EX1 illustrates the computation of the minimum Euclidean length solution of an m x n system of linear inequality constraints Gy 2h Additionally the first f gt 0 of the constraints are equalities The solution algorithm is based on Algorithm LDP page 165 166 loc cit By allowing the dual variables to be free the constraints become equalities The rows of E G h are partitioned and assigned random values When the minimum Euclidean length solution to the inequalities has been calculated the residuals r Gy h 2 0 are computed with the dual variables to the BVLS problem indicating the entries of r that are exactly zero PROGRAM PBLSQ_EX1 Use Parallel_bounded_LSQ to solve an inequality constraint problem Gy gt h Force F of the constraints to be equalities This algorithm uses LDP of Solving Least Squares Problems page 165 Forcing equality constraints by freeing the dual is new here The constraints are allocated to the PHGOCS SIS C1251 Olver OWS WA Columnas Or whs array Asp 3 ee USE PBLSO_INT WS Hae MPs ise S ERORAS USE RAND_INT USE SHOW_INT IMPLICIT NONE INCLUDE mpif h INTEGER PARAMETER MP 500 NP 400 M NP 1 amp N MP F NP 10 REAL
218. dure to inverse iteration. Larger values of perf_ratio are less likely to cause these exceptions. Default: perf_ratio = 4

Description
Routine lin_eig_self is an implementation of the QR algorithm for self-adjoint matrices. An orthogonal similarity reduction of the input matrix to self-adjoint tridiagonal form is performed. Then, the eigenvalue-eigenvector decomposition of a real tridiagonal matrix is calculated. The expansion of the matrix as AV = VD results from a product of these matrix factors. See Golub and Van Loan (1989, Chapter 8) for details.

Additional Examples
Example 2: Eigenvalue-Eigenvector Expansion of a Square Matrix
A self-adjoint matrix is generated and the eigenvalues and eigenvectors are computed. Thus, $A = VDV^T$, where V is orthogonal and D is a real diagonal matrix. The matrix V is obtained using an optional argument. Also, see operator_ex26, Chapter 6.
      use lin_eig_self_int
      use rand_gen_int
      implicit none
! This is Example 2 for LIN_EIG_SELF.
      integer, parameter :: n=8
      real(kind(1e0)), parameter :: one=1e0
      real(kind(1e0)) :: a(n,n), d(n), v_s(n,n), y(n*n)
! Generate a random self-adjoint matrix.
      call rand_gen(y)
      a = reshape(y,(/n,n/))
      a = a + transpose(a)
! Compute the eigenvalues and eigenvectors.
      call lin_eig_self(a, d, v=v_s)
! Check the results for small r
219. e Parallel Option IMSL Fortran 90 MP Library 4 0 Use packaged option to reset the value of a small diagonal iopti l d_options d_lin_sol_self_set_small amp epsilon one abs E 1 nrack Use packaged option to save the factorization iopti 2 d_lin_sol_self_save_factors Suppress error messages and stopping due to singularity of the matrix which is expected iopti 3 d_lin_sol_self_no_sing_mess iopti 4 0 Pits gt rece Alt t nreck Ce nreack BYE a do tries 1 2 Galil ra SOL Selle AES a NEEN amp bor ee Peeks Snr ere S pivots ipivots iopt iopti When code is r ntered the already computed factorization is used iopti 4 d_lin_sol_self_solve_A Reset right hand side in the direction of the eigenvector Bic nrc UNI Ce Se ir ae end do end do Normalize the eigenvector IF MP_RANK 0 x UNIT x Check the results b ATEMP x x Glo inueerel lt il piawe err nrack amp ClOie_jorerexclbierc ox AL sin dl piaiereyelc Jo Al gigve il iaseeyelic IHN ie inieerelic results_are_true nrack amp abs err nrack lt sqrt epsilon one E 1 nrack enddo Check the results if ALL results_are_true and MP_RANK 0 amp write Parallel Example 7 is correct See to any error messages and quit MPI mp_nprocs mp_setup Final end IMSL Fortran 90 MP Library 4
220. e declaration directive IMPLICIT NONE is a reliable way of writing ScaLAPACK application code, since the routines may have lengthy lists of arguments. Using the modules helps to avoid mistakes such as missing arguments or mismatches involving Type, Kind or Rank (TKR). The modules are part of the Fortran 90 MP Library product. There is a comprehensive module, ScaLAPACK_Support, that includes use of all the modules in the table below. This module decreases the number of lines of code for checking the interface, but at the cost of increasing source compilation time compared with using individual modules.

Module Name          Contents of the Module
ScaLAPACK_Support    All of the following modules
ScaLAPACK_Int        All interfaces to ScaLAPACK routines
PBLAS_Int            All interfaces to parallel BLAS, or PBLAS
BLACS_Int            All interfaces to basic linear algebra communication routines, or BLACS
TOOLS_Int            Interfaces to ancillary routines used by ScaLAPACK, but not in other packages
LAPACK_Int           All interfaces to LAPACK routines required by ScaLAPACK
ScaLAPACK_IO_Int     All interfaces to ScaLAPACK_Read, ScaLAPACK_Write utility routines. See this Chapter.
MPI_Node_Int         The module holding data describing the MPI communicator, MP_LIBRARY_WORLD. See Chapter 6.

ScaLAPACK_READ
This routine reads matrix data from a file and transmits it into the two-dimensional block-cyclic form required by ScaLAPACK ro
221. e error processor are found in Chapter 9. Messages are printed by nodes from largest rank to smallest, which is the root node. Use of the routine MPI_Finalize() is made within MP_SETUP('Final'), which shuts down MPI. After MPI_Finalize() is called, the value of MP_NPROCS = 0. This flags that MPI has been initialized and terminated. It cannot be initialized again in the same program unit execution. No MPI routine is defined when MP_NPROCS has this value.

Using Processors
There are certain pitfalls to avoid when using Fortran 90 MP Library and box data types as implemented with MPI. A fundamental requirement is to allow all processors to participate in parts of the program where their presence is needed for correctness. It is incorrect to have a program unit that restricts nodes from executing a block of code required when computing with the box data type. On the other hand, it is appropriate to restrict computations with rank-2 arrays to the root node. This is not required, but the results for the alternate nodes are normally discarded. This will avoid gratuitous error messages that may appear at alternate nodes.

Observe that only the root has a correct result for a box data type function. Alternate nodes have the constant value one as the result. The reason for this is that during the computation of the functions, sub-probl
222. e ii infinite eigenvalues 71 initialization several 2D transforms 90 initialization several transforms 83 interface block ti internal write 123 139 inverse 2 iteration computing eigenvectors 14 40 59 matrix vi 3 9 11 13 generalized 18 20 Index iii transform 80 87 92 inverse matrix 2 isNaN 165 ISO ii iterative refinement vi 1 38 IVPAG routine 43 K Kershaw 37 L Laplace transform solution 31 larger data uncertainty example 47 76 least squares 9 18 24 25 26 31 32 82 89 library subprograms ii linear equations 9 linear least squares with non negativity constraints 246 247 248 254 256 linear solutions packaged options 4 linear trend cyclical 2D data 88 linear trend cyclical data 82 LU factorization of A 2 3 4 149 matrices adjoint iii covariance 13 18 21 inverse vi 2 3 9 11 13 generalized 18 20 22 inversion and determinant 1 5 orthogonal iii poorly conditioned 27 unitary iii upper Hessenberg 67 matrix pencil 47 71 75 means 126 message file building new direct access message file 126 changing messages 125 management 124 private message files 126 Metcalf ii method of lines 43 mistake iv e Index missing argument 233 Type Kind or Rank TKR 233 Modified Gram Schmidt algorithm 167 Moore Penrose 151 152 MPI 146 147 parallelism 146 N NaN Not a Number 165 quiet 164 signaling 164 Newton s method 32 50 norm 166 normalize 171
223. e matrix. Also, see operator_ex01, Chapter 6, for this example using the operator notation.
      use lin_sol_gen_int
      use rand_gen_int
      use error_option_packet
      implicit none
! This is Example 1 for LIN_SOL_GEN.
      integer, parameter :: n=32
      real(kind(1e0)), parameter :: one=1e0
      real(kind(1e0)) err
      real(kind(1e0)) A(n,n), b(n,n), x(n,n), res(n,n), y(n**2)
! Generate a random matrix.
      call rand_gen(y)
      A = reshape(y,(/n,n/))
! Generate random right-hand sides.
      call rand_gen(y)
      b = reshape(y,(/n,n/))
! Compute the solution matrix of Ax=b.
      call lin_sol_gen(A, b, x)
! Check the results for small residuals.
      res = b - matmul(A,x)
      err = maxval(abs(res))/sum(abs(A)+abs(b))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 1 for LIN_SOL_GEN is correct.'
      end if
      end

Optional Arguments
NROWS = n (Input)
Uses array A(1:n,1:n) for the input matrix. Default: n = size(A,1)

NRHS = nb (Input)
Uses array b(1:n,1:nb) for the input right-hand side matrix. Default: nb = size(b,2). Note that b must be a rank-2 array.

pivots = pivots (Output [Input])
Integer array of size n that contains the individual row interchanges. To construct the permuted order so that no pivoting is required, define an integer array ip(n). Initialize ip(i) = i, i = 1,n and then execute the loop, after calling lin_sol_gen, k = pivots(i)
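The description of the pivots argument breaks off partway through the loop. A minimal sketch of turning a record of row interchanges into a permutation vector follows. It assumes that the truncated loop swaps ip(i) and ip(k) for i = 1, ..., n, and the contents of pivots here are hypothetical values chosen for illustration, not output from lin_sol_gen.

```fortran
program pivots_to_permutation
  implicit none
  integer, parameter :: n = 4
  ! Hypothetical interchange record: at step i, row i was swapped with row pivots(i).
  integer :: pivots(n) = (/ 3, 2, 4, 4 /)
  integer :: ip(n), i, k, tmp

  ! Start from the identity ordering.
  ip = (/ (i, i = 1, n) /)

  ! Replay the interchanges in order (assumed form of the truncated loop).
  do i = 1, n
     k = pivots(i)
     tmp = ip(i)
     ip(i) = ip(k)
     ip(k) = tmp
  end do

  print '(a,4i3)', 'ip =', ip   ! prints ip =  3  2  4  1
end program pivots_to_permutation
```

Under that assumption, the vector subscript ip(1:n) applied to the rows of a matrix performs all of the recorded interchanges in one array assignment.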
224. e problem The continuous blending function MENR 20r is arbitrary and artfully chosen This is a mathematical change to 280 Chapter 8 Partial Differential Equations IMSL Fortran 90 MP Library 4 0 the problem required because of the stated discontinuity at t 0 Reverse communication is used for the problem data No additional user written subroutines are required when using reverse communication We also have chosen 10 of the initial grid points to be concentrated near x O anticipating rapid change in the solution near that point Optional changes are made to use a pure absolute error tolerance and non zero time smoothing program PDE_1D_MG_EX02 Inviscid Flow Over a Plate USE PDE_1ld_mg_int USE ERROR_OPTION_PACKET IMPLICIT NONE INTEGER PARAMETER NPDE 2 N1 10 N2 51 N N1 N2 INTEGER I IDO NFRAMES Define array space for the solution real kind 1d0 U NPDE 1 N TO TOUT DX1 DX2 DIFF real kind 1d0 ZERO 0D0 ONE 1D0 DELTA_T 1D 1 amp END 5D0 XMAX 25D0 real kind 1d0 U0 1D0 U1 0D0 TDELTA 1D 1 TOL 1D 2 TYPE D_OPTIONS IOPT 3 Start loop to integrate and record solution values IDO 1 DO SELECT CASE IDO Define values that determine limits and options CASE 1 TO ZERO TOUT DELTA
225. e similar matrix, A. The contents of v are updated by the transformations used in the algorithm. Requires the simultaneous use of option ?_lin_eig_no_balance. Default: The array v is initialized to the identity matrix.

iopt(IO) = ?_options(?_lin_eig_gen_no_sorting, ?_dummy)
Does not sort the eigenvalues as they are isolated by solving the 2 × 2 or unit sized blocks. This will have the effect of guaranteeing that complex conjugate pairs of eigenvalues are adjacent in the array E. Default: The entries of E are sorted so they are non-increasing in absolute value.

Description
The input matrix A is first balanced. The resulting similar matrix is transformed to upper Hessenberg form using orthogonal transformations. The double-shifted QR algorithm transforms the Hessenberg matrix so that 2 × 2 or unit sized blocks remain along the main diagonal. Any off-diagonal that is classified as "small" in order to achieve this block form is set to the value zero. Next, the block upper triangular matrix is transformed to upper triangular form with unitary rotations. The eigenvectors of the upper triangular matrix are computed using back substitution. Care is taken to avoid overflows during this process. At the end, eigenvectors are normalized to have Euclidean length one, with the largest component real and positive. This algorithm follows that given in Golub and Van Loan (1989, Chapter 7), with some novel organizational details for ad
226. e_applied (Input)
The point in the data interval where a constraint is to be applied.

type = constraint_indicator (Input)
The indicator for the type of constraint the spline function or its derivatives is to satisfy at the point where_applied. The choices are the character strings '==', '<=', '>=', '.=.', and '.=-'. They respectively indicate that the spline value or its derivatives will be equal to, not greater than, not less than, equal to the value of the spline at another point, or equal to the negative of the spline value at another point. These last two constraints are called periodic and negative-periodic, respectively. The alternate independent variable point is value_applied for either periodic constraint. There is a use of periodic constraints in Example 4.

Optional Arguments
derivative = derivative_index (Input)
This is the number of the derivative for the spline to apply the constraint. The value 0 corresponds to the function, the value 1 to the first derivative, etc. If this argument is not present in the list, the value 0 is substituted automatically. Thus a constraint without the derivative listed applies to the spline function.

periodic_point = value_applied
This optional argument improves readability by automatically identifying the second independent variable value for periodic constraints.

spline_
227. each new value of h Note that even if the data A h and b are real subexpressions for the solution may involve complex intermediate values with x h finally a real quantity Also see operator_ex31 Chapter 6 use lin_eig_gen_int use lin_sol_gen_int use rand_gen_int implicit none This is Example 3 for LIN_EIG_GEN integer i integer parameter n 32 k 2 real kind le0 parameter one 1 0e0 zero 0 0e0 real kind le0 a n n b n k x n k temp n max n k h err type s_options iopti 2 complex kind 1e0 w n n t n n e n z n k call rand_gen temp a reshape temp n n call rand_gen temp b reshape temp n k iopti 1 s_options s_lin_eig_gen_out_tri_form zero iopti 2 s_options s_lin_eig_gen_no_balance zero Compute the Schur decomposition of the matrix call lin_eig_gen a vEw tri t amp Lopt iopti Choose a value so that Ath I is non singular h one 68 Chapter 2 Singular Value and Eigenvalue Decomposition IMSL Fortran 90 MP Library 4 0 Solve for Ath I x b using the Schur decomposition z matmul conjg transpose w b Solve intermediate upper triangular system with implicit additive diagonal h I This is the only dependence on h in the solution process do i n 1 1 z i l k z i 1 k t i i h Z Llsi l ltk 2 liea 1 rk amp spread t 1 i 1 i dim 2 ncopies k amp spread z i 1 k di
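The back-substitution step in the listing above treats the shift h as an implicit addition to the diagonal of the triangular factor, so the factor is never modified. A small sketch of that idea in standard Fortran follows. It uses a real 3 x 3 upper-triangular matrix and a single right-hand side, both hypothetical; the listing itself works with the complex factors returned by lin_eig_gen.

```fortran
program shifted_back_substitution
  implicit none
  integer, parameter :: n = 3
  integer, parameter :: dp = kind(1d0)
  real(dp) :: t(n,n), c(n), z(n), h
  integer :: i

  ! Upper-triangular matrix, filled column by column, and a right-hand side.
  t = reshape((/ 2.0_dp, 0.0_dp, 0.0_dp, &
                 1.0_dp, 3.0_dp, 0.0_dp, &
                 4.0_dp, 5.0_dp, 6.0_dp /), (/ n, n /))
  c = (/ 1.0_dp, 2.0_dp, 3.0_dp /)
  h = 1.0_dp

  ! Column-oriented back substitution for (T + h*I) z = c.
  ! The shift enters only through the divisor t(i,i) + h.
  z = c
  do i = n, 1, -1
     z(i) = z(i)/(t(i,i) + h)
     z(1:i-1) = z(1:i-1) - t(1:i-1,i)*z(i)
  end do

  ! Residual of the shifted system, formed without building T + h*I.
  print '(a,es10.2)', 'max residual =', maxval(abs(matmul(t, z) + h*z - c))
end program shifted_back_substitution
```

Because h appears only in the divisor, a new value of h costs one triangular solve and no new factorization.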
228. each problem, set WORK(1) = 0. Each additional rank adds the dimension of the transform plus 15. Using the optional argument WORK increases the efficiency of the transform. This function uses routines fast_dft, fast_2dft, and fast_3dft from Chapter 3.

The option and derived type names are given in the following tables:

Option Name for IFFT     Option Value
options_for_fast_dft     1

Derived Type     Name of Unallocated Array
s_options        s_ifft_box_options
d_options        d_ifft_box_options
s_options        s_ifft_box_options_once
d_options        d_ifft_box_options_once

Example
Compute the inverse DFT of a random complex array.
      x=rand(x)
      x=ifft_box(y)

isNaN
This is a generic logical function used to test scalars or arrays for occurrence of an IEEE 754 Standard format of floating point (ANSI/IEEE 1985) NaN, or not-a-number. Either quiet or signaling NaNs are detected without an exception occurring in the test itself. The individual array entries are each examined, with bit manipulation, until the first NaN is located. For non-IEEE formats, the bit pattern tested for single precision is transfer(not(0),1). For double precision numbers x, the bit pattern tested is equivalent to assigning the integer array i(1:2) = not(0), then testing this array with the bit pattern of the integer array transfer(x,i). This fun
229. eal kind le0 err A n n real kind le0 t k y n k y_prime n k complex kind le0 x n n z_O n y_O n d n Generate a random coefficient matrix A rand A Compute th igenvalu igenvector decomposition of the system coefficient matrix D EIG A W X Generate a random initial value for the ODE system y_O rand y_0 Solve complex data system that transforms the initial values X z_0 y_0 z_0 X ix y0 The grid of points where a solution is computed t i delta_t i 0 k 1 Compute y and y at the values t 1 k With th igenvalu igenvector decomposition AX XD this is an evaluation of EXP A t y_0 y t y X x exp spread d 2 k spread t 1 n spread z_0 2 k This is y derived by differentiating y t IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 175 y_prime X x spread d 2 k exp spread d 2 k spread t 1 n amp spread z_0 2 k Check results Is y Ay 0 err norm y_prime A x y norm y_prime tnorm A norm y if err lt sqrt epsilon one then write Example 4 for LIN_SOL_GEN operators is correct end if end Operator_ex05 use linear_operators implicit none This is Example 1 for LIN_SOL_SELF using operators and functions integer parameter m 64 n 32 real kind le0 one 1 0e0 err real kind le0 A n n b n n C m n d m n x n n G
230. ear problem in cylindrical coordinates Our example illustrates assigning m 1 in Equation 2 We provide an optional argument that resets this value from its default m 0 Reverse communication is used to interface with the problem data program PDE_1D_MG_EX04 Reactor Diffusion problem in cylindrical coordinates USE pde_ld_mg_int USE error_option_packet IMPLICIT NONE INTEGER PARAMETER NPDE 1 N 41 INTEGER IDO I NFRAMES Define array space for the solution real kind 1d0 T NPDE 1 N Z0 ZOUT real kind 1d0 ZERO 0D0 ONE 1D0 DELTA_Z 1D 1 amp ZEND 1D0 ZMAX 1D0 BETA 1D 4 GAMMA 1D0 EPS 1D 1 By ry TYPI D_OPTIONS IOPT 1 Start loop to integrate and record solution values IDO 1 DO SELECT CASE IDO Define values that determine limits CASE 1 Z0 ZERO ZOUT DELTA_Z T NPDE 1 1 ZERO T NPDE 1 N ZMAX OPEN FILE PDE_ex04 out UNIT 7 NFRAMES NINT ZEND DELTA_Z DELTA_Z WRITE 7 3I5 4D14 5 NPDE N NFRAMES amp T NPDE 1 1 T NPDE 1 N Z0 ZEND IOPT 1 PDE_1D_MG_CYL_COORDINATES Update to the next output point Write solution and check for final point CASE 2 IF ZO lt ZEND THEN WRITE 7 F10 5 ZOUT CI DO I 1 NPDE 1 WRITE 7 4E15
231. ecovery is possible and desirable, then corrective action can be taken. Otherwise, the calling routine may pass the error state up one more level. The severity of these conditions varies from note to fatal. For each condition, there is a possibility that corrective action can be taken by the calling routine and that the recovery option is desirable. Only one such informational error state can be handled in this manner. Situations involving multiple errors require alternative mechanisms, such as extra arguments, and that is not implemented.

• Terminal Class: Usage errors, such as incorrect or inconsistent argument values, are in this class. In most cases, these errors result from blunders in developing software. In normal processing, a message is issued and execution is terminated by the calling routine detecting the error. Serious error conditions are classified as terminal if, in the opinion of the routine designer, there is no reasonable chance or need for automatic recovery by the calling routine. The calling routine or program needs revision and recompilation in order to correct the error.

• Global Class: These error conditions are handled in a global manner. A message is issued by the routine detecting the error, but processing continues. The decision on whether or not to terminate execution is made later by an upper level routine, usually the main program, at the end of a processing step.

The error handling routines and procedures discu
232. ectangle containing the independent variable data pairs Our algorithm for constraining the surface is simple but effective in this case The data fitting is repeated one more time but with positive constraints at the grid of points where it was previously negative US US US surface_fitting_int rand_int norm_int Ae el implicit none This is Example 4 for SURFACE_FITTING tensor product B splines approximation f x y Use the data set from 120 Chapter 4 Curve and Surface Fitting with Splines IMSL Fortran 90 MP Library 4 0 Franke due to Ferguson Without constraints the function becomes negative in a corner Constrain the surface at a grid of values so it is non negative integer i j q integer parameter ngrid 9 nord 4 ndegree nord l amp nbkpt ngrid 2 ndegree ndata 25 nvalues 50 real kind 1d0 parameter zero 0d0 one 1d0 real kind 1d0 parameter TOLERANCE 1d 3 real kind 1d0 target spline_data 4 ndata bkptx nbkpt amp bkpty nbkpt coeff ngrid tndegree 1 ngrid ndegree 1 amp x nvalues y nvalues values nvalues nvalues amp delta real kind 1d0 pointer pointer_bkpt type d_spline_knots knotsx knotsy type d_surface_constraints allocatable C real kind le0 data 3 ndata amp This is Ferguson s data 2 0 ge SO 5 245 5 2 49 7 647 3 2 amp 2 981 0 291 Zola 3 471 7 062 3 54 amp 3 961
233. ector decomposition of an ordinary or generalized eigenvalue problem. For the ordinary eigenvalue problem, Ax = ex, the optional input "B=" is not used. With the generalized problem, Ax = eBx, the matrix B is passed as the array in the right side of "B=". The optional output "D=" is an array required only for the generalized problem and then only when the matrix B is singular. The array of real eigenvectors is an optional output for both the ordinary and the generalized problem. It is used as "v=", where the right-side array will contain the eigenvectors. If any eigenvectors are complex, the optional output "w=" must be present. In that case "v=" should not be used.

Required Argument
This function requires one argument, and the argument must be a square rank-2 array or a rank-3 array with square first rank-2 sections. The output is a rank-1 or rank-2 complex array of eigenvalues.

Modules
Use the appropriate module: eig_int or linear_operators

Optional Variables, Reserved Names
This function uses lin_eig_self, lin_eig_gen, and lin_geig_gen to compute the decompositions. See Chapter 2, "Singular Value and Eigenvalue Decomposition," lin_eig_self, lin_eig_gen, and lin_geig_gen.

The option and derived type names are given in the following tables:

Option Name for EIG     Option Value
options_f
234. ed

Modules
Use the appropriate one of the modules det_int or linear_operators.

Optional Variables, Reserved Names
This function uses lin_sol_lsq (see Chapter 1, "Linear Solvers", lin_sol_lsq) to compute the QR decomposition of A and the logarithmic value of det(A), which is exponentiated for the result. The option and derived type names are given in the following tables:

Option Name for DET            Option Value
s_det_for_lin_sol_lsq          1
d_det_for_lin_sol_lsq          1
c_det_for_lin_sol_lsq          1
z_det_for_lin_sol_lsq          1

Derived Type                   Name of Unallocated Array
s_options                      s_det_options
s_options                      s_det_options_once
d_options                      d_det_options
d_options                      d_det_options_once

Example
Compute the determinant of a matrix and its inverse.

      b = DET(A); c = DET(.i. A); b = 1./c

DIAG
Construct a square diagonal matrix from a rank-1 array, or several diagonal matrices from a rank-2 array. The dimension of the matrix is the value of the size of the rank-1 array.

Required Argument
This function requires one argument, and the argument must be a rank-1 or rank-2 array. The output is a rank-2 or rank-3 array, respectively. The use of DIAG may be obviated by observing that the defined operations C = diag(x) .x. A or D = B .x. diag(x) are, respectively, the array operations C
235. eigenvector matrix V, and verifying that the residuals R = AV - VE are small. Also, see operator_ex29, Chapter 6.

      use lin_eig_gen_int
      use rand_gen_int
      implicit none
! This is Example 1 for LIN_EIG_GEN.
      integer, parameter :: n=32
      real(kind(1d0)), parameter :: one=1d0
      real(kind(1d0)) A(n,n), y(n*n), err
      complex(kind(1d0)) E(n), V(n,n), E_T(n)
      type(d_error) :: d_epack(16) = d_error(0,0d0)
! Generate a random matrix.
      call rand_gen(y)
      A = reshape(y,(/n,n/))
! Compute only the eigenvalues.
      call lin_eig_gen(A, E)
! Compute the decomposition, A*V = V*values,
! obtaining eigenvectors.
      call lin_eig_gen(A, E_T, v=V)
! Use values from the first decomposition, vectors from the
! second decomposition, and check for small residuals.
      err = sum(abs(matmul(A,V) - V*spread(E,DIM=1,NCOPIES=n))) &
            / sum(abs(E))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 1 for LIN_EIG_GEN is correct.'
      end if
      end

Optional Arguments
NROWS = n (Input)
   Uses array A(1:n, 1:n) for the input matrix. Default: n = size(A, 1)
v = V(:,:) (Output)
   Returns the complex array of eigenvectors for the matrix A.
v_adj = U(:,:) (Output)
   Returns the complex array of eigenvectors for the matrix A^T. Thus the residuals S = A^T U - UE are small.
tri = T(:,:) (Output)
   Returns the complex upper triangula
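The residual test used in the example above can be tried without the library. The following is a minimal sketch, not the library's method: it assumes a 2 x 2 symmetric matrix whose eigenpairs are entered by hand in place of the call to lin_eig_gen, and the program name eig_residual_sketch is hypothetical. It uses real arrays for simplicity, whereas the example above declares E and V as complex.

```fortran
      program eig_residual_sketch
      implicit none
! Hypothetical stand-alone sketch: hand-computed eigenpairs of a
! 2 x 2 symmetric matrix replace the call to lin_eig_gen.
      integer, parameter :: n=2
      real(kind(1d0)), parameter :: one=1d0
      real(kind(1d0)) A(n,n), V(n,n), E(n), err
! A = (2 1; 1 2) has eigenvalues 1 and 3.
      A = reshape((/2d0, 1d0, 1d0, 2d0/), (/n,n/))
      E = (/1d0, 3d0/)
! Column j of V is the unit eigenvector belonging to E(j).
      V(:,1) = (/one, -one/)/sqrt(2d0)
      V(:,2) = (/one, one/)/sqrt(2d0)
! spread(E,DIM=1,NCOPIES=n) repeats E down the rows, so the
! elementwise product scales column j of V by E(j).
      err = sum(abs(matmul(A,V) - V*spread(E,DIM=1,NCOPIES=n))) &
            / sum(abs(E))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Residual check for the sketch is correct.'
      end if
      end program eig_residual_sketch
```

The elementwise product with spread avoids forming a diagonal matrix and a second matmul; this is the same device the example uses for V*values.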
236. deleted. There is usage of the BLACS and PBLAS. The problem size is such that the results are checked on one process.

      program scpk_ex2
! This is Example 2 for ScaLAPACK_READ and ScaLAPACK_WRITE.
! The product of two matrices is computed with PBLAS
! and checked for correctness.
      cked for c PAC US JMO NE EC AEE INCLUD aa INTEGE R Si P_INT ON WOTE o ah h PARAMETER K_SUPPORT amp r M 33 N 34 MB 16 NB 16 NIN 10 52 INTEGER C ROPT DA_B ONTXT A JA TDA_B DB T F NPROW NPCOL MYROW MYCOL LDA_C amp WEA p TIDY I IDA I IDRC JUHRIROIR Ih lp lap ts JB OURPHHNH ESC_A D g 10 ESGlB 9 DESCO
      real(kind(1d0)) :: ALPHA, BETA, ERROR=1d0, SIZE_C
      realilkind Gel Os mmalelkoc aite aloe ae catmemssko messi a A B AEA olay Clie Clue
      MP_NPROCS = MP_SETUP()
! Routines with the "BLACS_" prefix are from the BLACS library.
! This is an adjunct library to the ScaLAPACK library.
      CALL BLACS_PINFO(MP_RANK, MP_NPROCS)
! Make initialization for BLACS.
      CALL BLACS_GET(0,0, CONTXT)
! Approximate processor grid to be nearly square.
      NPROW = sqrt(real(MP_NPROCS)); NPCOL = MP_NPROCS/NPROW
      IF (NPROW*NPCOL < MP_NPROCS) THEN
         NPROW = 1; NPCOL = MP_NPROCS
      END IF
      CALL
237. elta bkpty nbkpt bkpty 1 nvalues 1 kptx nord nbkpt ndegree bkptx 1 i delta i 0 ngrid 1 kpty nord nbkpt ndegree bkpty 1 i delta i 0 ngrid 1 y okpty 1 i delta i 1 nvalues
! Evaluate the function at a rectangular grid.
! Use non-positive values to a constraint.
      values = surface_values((/0,0/), x, y, knotsx, knotsy, coeff)
! Count the number of values <= zero. Then constrain the spline
! so that it is >= TOLERANCE at those points where it was <= zero.
      q = count(values <= zero)
      allocate (C(q))
      DO I=1, nvalues
         DO J=1, nvalues
            IF (values(I,J) <= zero) THEN
               C(q) = surface_constraints(point=(/x(i),y(j)/), type='>=', &
                      value=TOLERANCE)
               q = q-1
            END IF
         END DO
      END DO
! Fit the data with constraints and obtain the coefficients.
      coeff = surface_fitting(spline_data, knotsx, knotsy, &
              CONSTRAINTS=C)
      deallocate(C)
! Evaluate the surface at a grid and check, once again, for
! non-positive values. All values should now be positive.
      values = surface_values((/0,0/), x, y, knotsx, knotsy, coeff)
      if (count(values <= zero) == 0) then
         write (*,*) 'Example 4 for SURFACE_FITTING is correct.'
      end if
      end

Fatal and Terminal Error Messages
See the messages.gls file for error messages for surface_fitting. These error messages a
238. ems are allocated to the alternate nodes by the root, but for only the root to utilize the result. If a user needs a value at the other nodes, then the root must send it to the nodes. This principle is illustrated in Parallel Example 3: convergence information is computed at the root node and broadcast to the others. Without this step some nodes would not terminate the loop even when corrections at the root become small. This would cause the program to be incorrect.

Optional Data Changes
To reset tolerances for determining singularity and to allow for other data changes, non-allocated "hidden" variables are defined within the modules. These variables can be allocated first, then assigned values which result in the use of different tolerances or greater efficiency in the executable program. The non-allocated variables, whose scope is limited to the module, are hidden from the casual user. Default values or rules are applied if these arrays are not allocated. In more detail, the inverse matrix operator ".i." applied to a square matrix first uses the LU factorization code lin_sol_gen and row pivoting. The default value for a small diagonal term is defined to be:

      sqrt(epsilon(A))*sum(abs(A))/(n*n+1)

If the system is singular, a generalized matrix inverse is computed with the QR factorization code lin_sol_lsq using this same tolerance. Both row and column pivoting are used. If the system is singular, an error message will be prin
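To make the default rule for a small diagonal term concrete, the sketch below evaluates it with Fortran intrinsics only. It assumes the expression reads sqrt(epsilon(A))*sum(abs(A))/(n*n+1), which is a reconstruction of the flattened text; the test matrix and the program name small_term_sketch are hypothetical, and no library routine is called.

```fortran
      program small_term_sketch
      implicit none
! Hypothetical sketch: evaluates the assumed default tolerance
! sqrt(epsilon(A))*sum(abs(A))/(n*n+1) for a small test matrix.
      integer, parameter :: n=3
      integer i, j
      real(kind(1d0)) A(n,n), small
! Fill A with the simple pattern A(i,j) = i + j.
      do j=1, n
         do i=1, n
            A(i,j) = real(i+j, kind(1d0))
         end do
      end do
! The tolerance scales with the average magnitude of the entries.
      small = sqrt(epsilon(A))*sum(abs(A))/(n*n+1)
      write (*,*) 'Default small-term tolerance: ', small
      end program small_term_sketch
```

Because the rule is proportional to sum(abs(A)), multiplying A by a constant multiplies the tolerance by the same constant, so the singularity test does not depend on the units of the data.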
239.       end do

! Save double precision copies of the diagonals and the
! right-hand side.
      c_save = c(1:n,1:n); d_save = d(1:n,1:n)
      b_save = b(1:n,1:n); x_save = x(1:n,1:n)
      y_save(1:n,1:n) = d(1:n,1:n)*x_save + &
         c(1:n,1:n)*EOSHIFT(x_save,SHIFT=+1,DIM=1) + &
         b(1:n,1:n)*EOSHIFT(x_save,SHIFT=-1,DIM=1)
! Iterative refinement loop.
      factorization_choice: do nopt=0, 1
! Set the logical to flag the first time through.
         solve_only = .false.
         x_sol = d_zero
         change_old = huge(s_one)
         iterative_refinement: do
! This flag causes a copy of data to be moved to work arrays
! and a factorization and solve step to be performed.
            if (.not. solve_only) then
               c(1:n,1:n) = c_save; d(1:n,1:n) = d_save
               b(1:n,1:n) = b_save
            end if
! Compute current residuals, y - A*x, using current x.
            y(1:n,1:n) = y_save - &
               d_save*x_sol - &
               c_save*EOSHIFT(x_sol,SHIFT=+1,DIM=1) - &
               b_save*EOSHIFT(x_sol,SHIFT=-1,DIM=1)
            call lin_sol_tri(c, d, b, y, iopt=iopt)
            x_sol = x_sol + y(1:n,1:n)
            change_new = sum(abs(y(1:n,1:n)))
! If size of change is not decreasing, stop the iteration.
            if (change_new >= change_old) exit iterative_refinement
            change_old = change_new
            iopt(nopt+1) = s_options(s_lin_sol_tri_solve_only,s_zero)
            solve_only = .true.
         end do iterative_refinement
! Use Ga
240. Generate two rectangular random matrices.
      C = rand(C); d = rand(d)
! Form the normal equations for the rectangular system.
      A = C .tx. C; b = C .tx. d
! Compute the solution for Ax = b, A is symmetric.
      x = A .ix. b
! Check the results.
      err = norm(b - (A .x. x))/(norm(A)+norm(b))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 1 for LIN_SOL_SELF (operators) is correct.'
      end if
      end

Operator_ex06

      use linear_operators
      implicit none
! This is Example 2 for LIN_SOL_SELF using operators and functions.
      integer, parameter :: m=64, n=32
      real(kind(1e0)) :: one=1e0, zero=0e0, err
      real(kind(1e0)) A(n,n), b(n), C(m,n), d(m), cov(n,n), x(n)
! Generate a random rectangular matrix and right-hand side.
      C = rand(C); d = rand(d)
! Form the normal equations for the rectangular system.
      A = C .tx. C; b = C .tx. d
      COV = .i. CHOL(A); COV = COV .xt. COV
! Compute the least-squares solution.
      x = C .ix. d
! Compare with solution obtained using the inverse matrix.
      err = norm(x - (COV .x. b))/norm(cov)
! Scale the inverse to obtain the sample covariance matrix.
      COV = (sum((d - (C .x. x))**2)/(m-n)) * COV
! Check the results.
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 2 for LIN_SOL_SELF (operators) is correct.'
      end if
      end

Operator_ex07

      use linear_operators
      implicit none
! T
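The algebra behind the two operator examples above can be written out as a short derivation. This is a sketch under the assumption that CHOL(A) returns an upper triangular factor R with A = R^T R, so that ".i. CHOL(A)" followed by ".xt." forms the inverse of A; the symbol R is introduced here for illustration and does not appear in the examples.

```latex
\begin{aligned}
\min_x \|Cx - d\|_2
  \;&\Longrightarrow\; C^T C\,x = C^T d,
  \qquad A = C^T C,\quad b = C^T d,\\[2pt]
A = R^T R
  \;&\Longrightarrow\; A^{-1} = R^{-1} R^{-T},
  \qquad x = A^{-1} b,\\[2pt]
\widehat{\mathrm{COV}}
  \;&=\; \frac{\|d - Cx\|_2^2}{m-n}\; A^{-1}.
\end{aligned}
```

The last line corresponds to the scaling step in the second example, where the sum of squared residuals is divided by m - n before multiplying the inverse matrix.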
241. ent variable values (x_i, y_i, z_i), i = 1, ..., ndata, and uncertainty. Each dependent variable value requires an estimate of its uncertainty.

   5. Use the array function surface_fitting to compute the coefficients of the tensor product B-spline.
   6. With the coefficients obtained in the previous step, the array function surface_values evaluates the spline, its derivatives, or the square root of its variance.

The Derived Type Function surface_constraints
The user defines the constraints of the tensor product spline at discrete points by use of an array of derived type. Each entry of that array has components with the following definitions:

      type ?_surface_constraints
         integer derivative_index(2)
         real(kind(...)) where_applied(2)
         CHARACTER (LEN=...) constraint_indicator
         real(kind(...)) value_applied
         real(kind(...)) periodic_point(2)
      end type

A generic function is packaged in the module SURFACE_FITTING_INT. Its values are arrays of derived type ?_surface_constraints, depending on the precision of the arguments.

The Evaluator Function surface_values
After computation of the tensor product B-spline coefficients, values of the spline surface, its various derivative functions, or the square root of the variance of the curve are computed or evaluated with this function. Since a major use of the values is likely to be for graphical display, a
242. eigenvalues of the matrix pencil. This random matrix pencil example has all finite eigenvalues. Also, see operator_ex33, Chapter 6.

      use lin_geig_gen_int
      use rand_gen_int
      implicit none
! This is Example 1 for LIN_GEIG_GEN.
      integer, parameter :: n=32
      real(kind(1d0)), parameter :: one=1d0
      real(kind(1d0)) A(n,n), B(n,n), beta(n), beta_t(n), err, y(n*n)
      complex(kind(1d0)) alpha(n), alpha_t(n), V(n,n)
! Generate random matrices for both A and B.
      call rand_gen(y)
      A = reshape(y,(/n,n/))
      call rand_gen(y)
      B = reshape(y,(/n,n/))
! Compute the generalized eigenvalues.
      call lin_geig_gen(A, B, alpha, beta)
! Compute the full decomposition once again, A*V = B*V*values.
      call lin_geig_gen(A, B, alpha_t, beta_t, &
           v=V)
! Use values from the first decomposition, vectors from the
! second decomposition, and check for small residuals.
      err = sum(abs(matmul(A,V) - &
            matmul(B,V)*spread(alpha/beta,DIM=1,NCOPIES=n))) / &
            sum(abs(a)+abs(b))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 1 for LIN_GEIG_GEN is correct.'
      end if
      end

Optional Arguments
NROWS = n (Input)
   Uses arrays A(1:n, 1:n) and B(1:n, 1:n) for the input matrix pencil. Default: n = size(A, 1)
v = V(:,:) (Output)
   Returns the complex array of generalized eigenvectors for the mat
243.       end if
      end if
      end

Fatal, Terminal, and Warning Error Messages
See the messages.gls file for error messages for lin_geig_gen. These error messages are numbered 921-936; 941-956; 961-976; 981-996.

Chapter 3: Fourier Transforms

Introduction
Following are routines for computing Fourier Transforms of rank-1, rank-2, and rank-3 complex arrays.

Contents
fast_dft
   Example 1: Transforming an Array of Random Complex Numbers
   Example 2: Cyclical Data with a Linear Trend
   Example 3: Several Transforms with Initialization
   Example 4: Convolutions using Fourier Transforms
fast_2dft
   Example 1: Transforming an Array of Random Complex Numbers
   Example 2: Cyclical 2D Data with a Linear Trend
   Example 3: Several 2D Transforms with Initialization
fast_3dft
   Example 1: Transforming an Array of Random Complex Numbers

fast_dft
Computes the D
244. convergence, and quit when it happens.
      if (norm(delta_lamda) <= &
          sqrt(epsilon(one))*norm(lamda)) EXIT solve_for_lamda
! Correct any bad moves to a positive restart.
      move = rand(move)
      where (lamda < 0) lamda = s(1) * move
      end do solve_for_lamda
! Compute solutions and check lengths.
      x = v .x. (t_g/(spread(s_sq,DIM=2,NCOPIES=k) + &
          spread(lamda,DIM=1,NCOPIES=n)))
      err = norm(sum(x**2,DIM=1) - alpha**2)/norm(alpha**2)
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 2 for LIN_SVD (operators) is correct.'
      end if
      end

Operator_ex23

      use linear_operators
      implicit none
! This is Example 3 (using operators) for LIN_SVD.
      integer, parameter :: n=32
      integer i
      real(kind(1d0)), parameter :: one=1d0
      real(kind(1d0)), dimension(n,n) :: A, B, d(2*n,n), x, u_d(2*n,2*n), &
         v_d, v_c, u_c, v_s, u_s, s_d(n), c(n), s(n), sc_c(n), sc_s(n)
      real(kind(1d0)) err1, err2
! Generate random square matrices for both A and B.
! Construct D; A is on the top; B is on the bottom.
      D rand D D 1 n A D nt l B
! Compute the singular value decompositions used for the GSVD.
      S_D = SVD(D, U=u_d, V=v_d)
      C = SVD(u_d(1:n,1:n), u=u_c, v=v_c)
      S = SVD(u_d(n+1:,1:n), u=u_s, v=v_s)
! Rearrange c(:) so it is non-increasing. Move singular
! vectors accordingly. (The use of temporary objects sc_c and
! x is required.)
245. error states the errors based on the STOP attribute for

   - If the user attributes have been selected, decides to stop or continue for error states of type 1 to 4 based on the STOP attribute for the current error type.
   - If in Library mode, and if popping to user code, a stop or continue decision is made based on a reference to N1RGB.
   - If an IMSL routine references user-written code, the error handler uses the PRINT and STOP attributes set by the user. This is accomplished by calling the routine E1USR. A typical set of statements follows:

      CALL E1USR('ON')
      ! Reference to user-written code
      CALL E1USR('OFF')

The user's code is referenced between calls to E1USR. If the user's code calls other IMSL routines, and if those routines encounter error conditions, then they will be handled properly. If the user does not handle an error, a type 4 error for example, then the message will be printed and execution stopped when the CALL E1POP is executed by an IMSL routine and reaches Level 1. If the user has changed the attributes for type 4 errors, the user is responsible for handling the recovery from such errors. A stop or continue decision can be made for type 6 and type 7 errors by using the function N1RGB as follows:

      IF (N1RGB(0) .NE. 0) STOP
      CALL ELP

The function N1RGB r
246. erse communication is used for the problem data.
      CANIGu JED ND IME ALO AKO WE ILIDYO 1U
      END DO
      TIME L TIMEE TIMES DATA EPS P ETA U 2 N TIMEL
      IF (MP_RANK > 0) THEN
! Send parameters and time to the root.
         CALL MPI_SEND(DATA, 5, MPI_DOUBLE_PRECISION, 0, MP_RANK, &
              MP_LIBRARY_WORLD, IERROR)
! Receive back a "go/stop" flag.
         CALL MPI_RECV(CONTINUE, 1, MPI_INTEGER, 0, MPI_ANY_TAG, &
              MP_LIBRARY_WORLD, STATUS, IERROR)
! If root notes that time is up, it sends node a quit flag.
         IF (CONTINUE == 0) EXIT SIMULATE
      ELSE
! If root is working, record its result and then stand ready
! for other nodes to send.
         IF (MPI_ROOT_WORKS) WRITE(7,*) MP_RANK, DATA
! If all nodes have reported, then quit.
         IF (COUNT(MPI_NODE_PRIORITY >= 0) == 0) EXIT SIMULATE
! See if time is up. Some nodes still must report.
         IF (MPI_WTIME()-TIME >= SIM_TIME) THEN
            CONTINUE = 0
         ELSE
            CONTINUE = 1
         END IF
! Root receives simulation data and finds which node sent it.
         IF (MP_NPROCS > 1) THEN
            CALL MPI_RECV(DATA, 5, MPI_
247. erse transformed array. Default value is 1.

Description
The fast_dft routine is a Fortran 90 version of the FFT suite of IMSL (1994, pp. 772-776). The maximum computing efficiency occurs when the size of the array can be factored in the form n = 2^i1 * 3^i2 * 4^i3 * 5^i4 using non-negative integer values {i1, i2, i3, i4}. There is no further restriction on n >= 1.

Additional Examples

Example 2: Cyclical Data with a Linear Trend
This set of data is sampled from a function x(t) = at + b + y(t), where y(t) is a harmonic series. The independent variable is normalized as -1 <= t <= 1. Thus, the data is said to have cyclical components plus a linear trend. As a first step, the linear terms are effectively removed from the data using the least-squares system solver lin_sol_lsq, Chapter 1. Then the residuals are transformed and the resulting frequencies are analyzed.

      use fast_dft_int
      use lin_sol_lsq_int
      use rand_gen_int
      use sort_real_int
      implicit none
! This is Example 2 for FAST_DFT.
      integer i
      integer, parameter :: n=64, k=4
      integer ip(n)
      real(kind(1e0)), parameter :: one=1e0, two=2e0, zero=0e0
      real(kind(1e0)) delta_t, pi
      real(kind(1e0)) y(k), z(2), indx(k), t(n), temp(n)
      complex(kind(1e0)) a_trend(n,2), a, b_trend(n,1), b, c(k), f(n), &
         r(n), x(n), x_trend(2,1)
! Generate random data for linear trend and harmonic series.
      call rand_gen(z)
      a = z(1); b = z(2)
      call rand_gen(y)
! This emphasizes harmonics 2 through k+1.
      c = y + o
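The efficiency remark in the description can be turned into a quick size check. The sketch below assumes the efficient sizes are those of the form 2^i1 * 3^i2 * 4^i3 * 5^i4, which is equivalent to having no prime factor other than 2, 3, and 5; that factor form is a reconstruction of the garbled expression, and the function is_efficient_size is hypothetical, not part of the library.

```fortran
      program dft_size_sketch
      implicit none
! Hypothetical sketch: report which array sizes near n = 64
! contain only the prime factors 2, 3, and 5.
      integer n
      do n=60, 66
         write (*,*) n, is_efficient_size(n)
      end do
      contains
      logical function is_efficient_size(n)
      integer, intent(in) :: n
      integer, parameter :: primes(3) = (/2, 3, 5/)
      integer m, k
      m = n
      is_efficient_size = .false.
      if (m < 1) return
! Divide out every factor of 2, 3, and 5; what remains must be 1.
      do k=1, 3
         do while (mod(m, primes(k)) == 0)
            m = m/primes(k)
         end do
      end do
      is_efficient_size = (m == 1)
      end function is_efficient_size
      end program dft_size_sketch
```

Under this assumption the sizes 60 and 64 qualify, while 61, 62, 63, 65, and 66 do not, since each contains a prime factor larger than 5.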
248. es s_sq 1 lt sqrt epsilon one then
         write (*,*) 'Example 4 for LIN_SVD (operators) is correct.'
      end if
      end

Operator_ex25

      use linear_operators
      implicit none
! This is Example 1 (using operators) for LIN_EIG_SELF.
      integer, parameter :: n=64
      real(kind(1e0)), parameter :: one=1e0
      real(kind(1e0)) :: A(n,n), D(n), S(n)
! Generate a random matrix and from it a self-adjoint matrix.
      A = rand(A); A = A + .t.A
! Compute the eigenvalues of the matrix.
      D = EIG(A)
! For comparison, compute the singular values and check for
! any error messages for either decomposition.
      S = SVD(A)
! Check the results: Magnitude of eigenvalues should equal
! the singular values.
      if (norm(abs(D) - S) <= sqrt(epsilon(one))*S(1)) then
         write (*,*) 'Example 1 for LIN_EIG_SELF (operators) is correct.'
      end if
      end

Operator_ex26

      use linear_operators
      implicit none
! This is Example 2 (using operators) for LIN_EIG_SELF.
      integer, parameter :: n=8
      real(kind(1e0)), parameter :: one=1e0
      real(kind(1e0)), dimension(n,n) :: A, d(n), v_s
! Generate a random self-adjoint matrix.
      A = rand(A); A = A + .t.A
! Compute the eigenvalues and eigenvectors.
      D = EIG(A, V=v_s)
! Check the results for small residuals.
      if (norm((A .x. v_s) - (v_s .x. diag(D)))/abs(d(1)) <= &
          sqrt(eps
249. es or the IMSL routine attributes. The error control stack is pushed by referencing the subroutine in the following call:

      CALL E1PSH('name')

This reference performs the following tasks:
   - Increments the stack pointer by 1.
   - Places name on the stack.
   - Sets error type and error code to 0 for the current level.
   - Sets the attribute flag so that the PRINT and STOP attributes for IMSL routines are used for error types 1 to 4. The user-level attributes are used for types 5 to 7.

In addition to the error control stack, there is an error message with maximum length 1,024. The most recently issued message is retained in the message structure until it is either printed or deleted. The error control stack is popped by referencing the subroutine E1POP as follows:

      CALL E1POP('name')

This reference performs the following tasks:
   - Compares name with the name for the current level.
   - Moves the error type and error code values to the previous level.
   - Decreases the stack pointer by 1.

Printing of error messages is triggered by the stack pointer reaching a return to user code, called Level 1.
   - If the user attributes have been selected, decides whether or not the message should be printed for error states of type 1 to 4, based on the PRINT attribute for the current error type.
   - Decides to stop or continue for
250. es system will handle rank-deficient problems. A set of references is available in Hanson (1995) and Lawson and Hanson (1995). The CONFT/DCONFT routine uses QPROG (loc. cit., p. 959), which requires that the least-squares equations be of full rank.

Additional Examples

Example 2: Shaping a Curve and its Derivatives
The function g(x) = exp(-x^2/2)(1 + noise) is fit by cubic splines on the grid of equally spaced points x_i = (i - 1)*delta_x, i = 1, ..., ndata. The term noise is uniform random numbers from the normalized interval [-tau, tau], where tau = 0.01. The spline curve is constrained to be convex down for 0 <= x <= 1, convex upward for 1 < x <= 4, and to have the second derivative exactly equal to the value zero at x = 1. The first derivative is constrained with the value zero at x = 0 and is non-negative at the right end of the interval, x = 4. A sample table of independent variables, second derivatives, and square root of variance function values is printed.

      use spline_fitting_int
      use show_int
      use rand_int
      use norm_int
      implicit none
! This is Example 2 for SPLINE_FITTING. Use 1st and 2nd derivative
! constraints to shape the splines.
      integer :: i, icurv
      integer, parameter :: nbkptin=13, nord=4, ndegree=nord-1, &
         nbkpt=nbkptin+2*ndegree, ndata=21, ncoeff=nbkpt-nord
      real(kind(1e0)), parameter :: zero=0e0, one=1e0, half=5e-1
      real(kind(1e0)), parameter :: range=4.0, ratio=0.02, tol=ratio*half
      real(kind(1e0)), para
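The shape constraints in this example follow from the derivatives of the noise-free function. The short derivation below is a sketch that assumes the fitted function is g(x) = exp(-x^2/2) with the noise term ignored; that form is inferred from the constraints stated in the text.

```latex
\begin{aligned}
g(x)   &= e^{-x^2/2},\\
g'(x)  &= -x\,e^{-x^2/2}, & g'(0) &= 0,\\
g''(x) &= (x^2 - 1)\,e^{-x^2/2}, & g''(1) &= 0,\\
g''(x) &< 0 \ \text{for } 0 \le x < 1, & g''(x) &> 0 \ \text{for } 1 < x \le 4.
\end{aligned}
```

The sign change of the second derivative at x = 1 is why the spline is constrained to be convex down on one side of that point and convex upward on the other.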
251. residuals.
      if (sum(abs(matmul(a,v_s) - v_s*spread(d,1,n)))/d(1) <= &
          sqrt(epsilon(one))) then
         write (*,*) 'Example 2 for LIN_EIG_SELF is correct.'
      end if
      end

Example 3: Computing a few Eigenvectors with Inverse Iteration
A self-adjoint n x n matrix is generated and the eigenvalues, {d_i}, are computed. The eigenvectors associated with the first k of these are computed using the self-adjoint solver, lin_sol_self, and inverse iteration. With random right-hand sides, these systems are as follows:

      (A - d_i I) v_i = b_i

The solutions are then orthogonalized as in Hanson et al. (1991) to comprise a partial decomposition AV = VD, where V is an n x k matrix resulting from the orthogonalized {v_i} and D is the k x k diagonal matrix of the distinguished eigenvalues. It is necessary to suppress the error message when the matrix is singular. Since these singularities are desirable, it is appropriate to ignore the exceptions and not print the message text. Also, see operator_ex27, Chapter 6.

      use lin_eig_self_int
      use lin_sol_self_int
      use rand_gen_int
      use error_option_packet
      implicit none
! This is Example 3 for LIN_EIG_SELF.
      integer i, j
      integer, parameter :: n=64, k=8
      real(kind(1d0)), parameter :: one=1d0, zero=0d0
      real(kind(1d0)) big, err
      real(kind(1d0)) :: a(n,n), b(n,1), d(n), res(n,k), temp(n,n), &
         v(n,k), y(n*n)
      type (d_options) :: iopti(2) = d_options(0,zero)
! Ge
252. set to the smaller of NEW_SCEEN_SIZE and 72. All error message output is written to the unit number given by the INTEGER variable ERROR_UNIT. This value is obtained in the package by CALL UMACH(3, ERROR_UNIT). If the value of ERROR_UNIT is non-positive, nothing is printed. This test is made only on the root node. The user, or the defaults provided by the operating system, must open the external file corresponding to ERROR_UNIT.

We now give examples that show how to use the error processor in applications. Small program units are listed, followed by the output. Each example is executed in an MPI application with two nodes. When using more than two nodes, messages may appear from each node. If that node has no messages, nothing is printed.

Example 1
This program calls a subprogram that makes an error. The error occurs after a call to MP_SETUP(). Messages and traceback information are gathered from the nodes and printed at the root. Note that the names for the nodes are dependent on the local operating environment and hence will vary.

      program errpex1
      USE TME ESE LUPSTNa
      IMPLICIT NONE
! Make calls to the VNI error processor while using MPI.
! The error type shown is type 4 or FATAL.
! An example is a call to a routine that expects a positive
! value for the INTEGER
253. et xdata 1 2 ydata 1 constraints 2 spline_constraints amp derivative 2 point bkpt last type amp value onetxdata ndata 2 ydata ndata
! Obtain the spline coefficients.
      coeff = spline_fitting(data=spline_data, knots=break_points, &
              constraints=constraints)
! Compute the evaluation points 'qx(*)' and weights 'qw(*)' for
! the Gauss-Legendre quadrature. This will give a precise
! quadrature for polynomials of degree <= nquad*2.
      call gqrul(nquad, iweight, gqalpha, qbeta, nfix, qxfix, qx, qw)
! Compute pieces of the accumulated distribution function.
      quad(0) = zero
      do i=1, ndata-1
         alpha_ = (bkpt(nord+i) - bkpt(ndegree+i))*half
         beta_ = (bkpt(nord+i) + bkpt(ndegree+i))*half
! Normalized abscissas are stretched to each spline interval.
! Each polynomial piece is integrated and accumulated.
         qxi = alpha_*qx + beta_
         quad(i) = sum(qw*spline_values(0, qxi, break_points, coeff))*alpha_ &
                   + quad(i-1)
      end do
! Normalize the coefficients and partial integrals so that the
! total integral has the value one.
      coeff = coeff/quad(ndata-1); quad = quad/quad(ndata-1)
      rn = rand(rn)
      x = zero; niterat = 0
      solve_equation: do
! Find the intervals where the x values are located.
         LEFT_OF = NDEGREE; I = NDEGREE
         do
            I = I+1; if (I >= LAST) EXIT
            WHERE (x g
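The step that stretches normalized abscissas onto each interval can be checked in isolation. The sketch below is a stand-alone illustration, not the example's code: it assumes a two-point Gauss-Legendre rule entered by hand in place of the call to gqrul, and it integrates x**2 over the hypothetical interval [1, 3] instead of a spline piece. The names alpha_, beta_, qx, qw, and qxi mirror those in the example.

```fortran
      program gauss_stretch_sketch
      implicit none
! Hypothetical sketch: map a two-point Gauss-Legendre rule from
! [-1,1] to [a,b] and integrate x**2, which the rule does exactly.
      real(kind(1d0)), parameter :: half=0.5d0
      real(kind(1d0)) qx(2), qw(2), qxi(2)
      real(kind(1d0)) a, b, alpha_, beta_, quad
! Nodes and weights of the two-point rule on [-1,1].
      qx = (/-1d0, 1d0/)/sqrt(3d0)
      qw = (/1d0, 1d0/)
      a = 1d0; b = 3d0
! alpha_ is the half-length of the interval, beta_ its midpoint.
      alpha_ = (b - a)*half
      beta_ = (b + a)*half
! Stretch the normalized abscissas to [a,b]; the factor alpha_
! is the Jacobian of the change of variable.
      qxi = alpha_*qx + beta_
      quad = sum(qw*qxi**2)*alpha_
! The exact integral of x**2 over [1,3] is 26/3.
      if (abs(quad - 26d0/3d0) <= sqrt(epsilon(half))) then
         write (*,*) 'Stretched quadrature sketch is correct.'
      end if
      end program gauss_stretch_sketch
```

Accumulating quad(i) = (piece integral) + quad(i-1), as the example does, then gives the running integral at each breakpoint.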
254. returns 1 if any type 6 or type 7 error states have been set since the previous N1RGB reference, or since the beginning of execution, and if the STOP attribute for that error type is set to YES.

Calls to routines E1PSH and E1POP are expensive, since they require allocation of linked derived data types internal to the package. We have provided a special name that ignores all stack manipulation until this name is popped from the stack. Hence calls to the functions N1RTY, N1RCD, and IERCD return the maximum error type or corresponding code, regardless of the argument. The case of the letters in the name is ignored. Thus a typical set of statements is:

      CALL E1PSH('NULLIFY_STACK')
      ! Reference to code that contains no call stack
      ! information but has error processing
      CALL E1POP('NULLIFY_STACK')

Error States
   - The subroutine reference CALL E1MES(errtype, errcode, message) is used to set an error state for the current level in the stack. At least one routine name must be on the stack for this subprogram call to be defined. The message is printed when control returns to Level 1, if the print attribute for that type is YES. The printed message width can be shortened by subroutine E1HDR. The name associated with the current stack level is combined with the message when it is printed. Once an error state has been set, any one of the settings (error type, error code,
255. f v_c and v_s have the same span. They are equivalent by taking the signs of the largest magnitude values positive.
      do i=1, n
         sc_c(i) = sign(one, v_c(sum(maxloc(abs(v_c(1:n,i)))),i))
         sc_s(i) = sign(one, v_s(sum(maxloc(abs(v_s(1:n,i)))),i))
      end do
      v_c = v_c*spread(sc_c,dim=1,ncopies=n)
      u_c = u_c*spread(sc_c,dim=1,ncopies=n)
      v_s = v_s*spread(sc_s,dim=1,ncopies=n)
      u_s = u_s*spread(sc_s,dim=1,ncopies=n)
! In this form of the GSVD, the matrix X can be unstable if D
! is ill-conditioned.
      x = matmul(v_d*spread(one/s_d,dim=1,ncopies=n),v_c)
! Check residuals for GSVD, A*X = u_c*diag(c_1, ..., c_n), and
! B*X = u_s*diag(s_1, ..., s_n).
      err1 = sum(abs(matmul(a,x) - u_c*spread(c,dim=1,ncopies=n))) &
             / sum(s_d)
      err2 = sum(abs(matmul(b,x) - u_s*spread(s,dim=1,ncopies=n))) &
             / sum(s_d)
      if (err1 <= sqrt(epsilon(one)) .and. &
          err2 <= sqrt(epsilon(one))) then
         write (*,*) 'Example 3 for LIN_SVD is correct.'
      end if
      end

Example 4: Ridge Regression as Cross-Validation with Weighting
This example illustrates a particular choice for the ridge regression problem: the least-squares problem Ax = b is modified by the addition of a regularizing term to become

      min over x of ( ||Ax - b||_2^2 + lambda^2 ||x||_2^2 )

The solution to this problem, with row k deleted, is denoted by x_k(lambda). Using nonnegative weights w
256. default from the l2 to the l1 or l-infinity norms.

Modules
Use the appropriate modules, norm_int or linear_operators.

Optional Variables, Reserved Names
If the l2 norm is required, this function uses lin_sol_svd (see Chapter 1, "Linear Solvers", lin_sol_svd) to compute the largest singular value of A. For the other norms, Fortran 90 intrinsics are used. The option and derived type names are given in the following tables:

Option Name for NORM           Option Value
s_norm_for_lin_sol_svd         1
s_reset_default_norm           2
d_norm_for_lin_sol_svd         1
d_reset_default_norm           2
c_norm_for_lin_sol_svd         1
c_reset_default_norm           2
z_norm_for_lin_sol_svd         1
z_reset_default_norm           2

Derived Type                   Name of Unallocated Array
s_options                      s_norm_options
s_options                      s_norm_options_once
d_options                      d_norm_options
d_options                      d_norm_options_once

Example
Compute three norms of an array. (Both assignments of n_2 yield the same value.)

      n_1 = norm(A,1); n_2 = norm(A,type=2)
      n_2 = norm(A); n_inf = norm(A,huge(1))

ORTH
Orthogonalize the columns of a rank-2 or rank-3 array. The decomposition A = QR is computed using a forward and backward sweep of the Modified Gram-Schmidt algorithm.

Required Arguments
The first argument must be an array of rank-2 or rank-3. An optional argument can
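For a rank-1 array, the three norms named in the example reduce to one-line expressions in Fortran 90 intrinsics. The sketch below illustrates only that vector case, with a hypothetical two-element array; it does not call the library's norm function and does not cover the rank-2 case, where the l2 norm requires a singular value.

```fortran
      program vector_norm_sketch
      implicit none
! Hypothetical sketch: the l1, l2, and l-infinity norms of a
! rank-1 array written with Fortran 90 intrinsics only.
      real(kind(1d0)) x(2), n_1, n_2, n_inf
      x = (/3d0, -4d0/)
! l1 norm: sum of magnitudes.
      n_1 = sum(abs(x))
! l2 norm: square root of the sum of squares.
      n_2 = sqrt(sum(x**2))
! l-infinity norm: largest magnitude.
      n_inf = maxval(abs(x))
! For x = (3, -4) the three values are 7, 5, and 4.
      write (*,*) n_1, n_2, n_inf
      end program vector_norm_sketch
```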
257. find the first value such that isNaN(a(i,j)) .or. isNaN(b(i,j)) == .true. See the isNaN() function, Chapter 6.
   Default: Does not scan for NaNs.
iopt(IO) = ?_options(?_lin_sol_lsq_no_sing_mess, ?_dummy)
   Do not print an error message when A is singular or k < min(m, n).

Description
The routine lin_sol_lsq solves a rectangular system of linear algebraic equations in a least-squares sense. It computes the decomposition of A using an orthogonal factorization. This decomposition has the form

      QAP = [ R(k x k)  0 ]
            [    0      0 ]

where the matrices Q and P are products of elementary orthogonal and permutation matrices. The matrix R is k x k, where k is the approximate rank of A. This value is determined by the value of the parameter Small. See Golub and Van Loan (1989, Chapter 5.4) for further details. Note that the use of both row and column pivoting is nonstandard, but the routine defaults to this choice for enhanced reliability.

Example 2: System Solving with the Generalized Inverse
This example solves the same form of the system as Example 1. In this case, the grid of evaluation points is equally spaced. The coefficients are computed using the "smoothing formulas" by rows of the generalized inverse matrix, A+, computed using the optional argument "ainv=". Thus, the coefficients are given by the matrix-vector product c = (A+) y, where y is the vector of values of the func
258. for LIN_SOL_SVD (operators) is correct.'
      end if
      end

Operator_ex17

      use linear_operators
      use lin_sol_tri_int
      implicit none
! This is Example 1 (using operators) for LIN_SOL_TRI.
      integer, parameter :: n=128
      real(kind(1d0)), parameter :: one=1d0, zero=0d0
      real(kind(1d0)) err
      real(kind(1d0)), dimension(2*n,n) :: d, b, c, x, y, t(n)
      type(d_error) :: d_lin_sol_tri_epack(08) = d_error(0,zero)
! Generate the upper, main, and lower diagonals of the
! n matrices A_i. For each system a random vector x is used
! to construct the right-hand side: Ax = y. The lower part
! of each array remains zero as a result.
      c = zero; d = zero; b = zero; x = zero
      c(1:n,:) = rand(c(1:n,:)); d(1:n,:) = rand(d(1:n,:))
      b(1:n,:) = rand(b(1:n,:)); x(1:n,:) = rand(x(1:n,:))
! Add scalars to the main diagonal of each system so that
! all systems are positive definite.
      t = sum(c+d+b,DIM=1)
      d(1:n,1:n) = d(1:n,1:n) + spread(t,DIM=1,NCOPIES=n)
! Set Ax = y. The vector x generates y. Note the use
! of EOSHIFT and array operations to compute the matrix
! product, n distinct copies, as one array operation.
      y(1:n,1:n) = d(1:n,1:n)*x(1:n,1:n) + &
         c(1:n,1:n)*EOSHIFT(x(1:n,1:n),SHIFT=+1,DIM=1) + &
         b(1:n,1:n)*EOSHIFT(x(1:n,1:n),SHIFT=-1,DIM=1)
! Compute the solution returned in y. (The input values of c,
! d, b, and y are ove
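The EOSHIFT device noted in the comments above can be verified on a single small system. The sketch below assumes that c holds the upper diagonal, d the main diagonal, and b the lower diagonal, with shifts of +1 and -1 respectively (the signs are a reconstruction, since they are lost in the flattened listing). It compares the array expression with a dense matrix-vector product; the program name and the test values are hypothetical.

```fortran
      program tri_matvec_sketch
      implicit none
! Hypothetical sketch: tridiagonal matrix-vector product written
! with EOSHIFT, checked against a dense matmul.
      integer, parameter :: n=4
      integer i
      real(kind(1d0)) c(n), d(n), b(n), x(n), y(n), full(n,n)
      d = 4d0
! c(i) multiplies x(i+1) (upper diagonal); c(n) is not used.
      c = 1d0
! b(i) multiplies x(i-1) (lower diagonal); b(1) is not used.
      b = 2d0
      x = (/1d0, 2d0, 3d0, 4d0/)
! EOSHIFT(x,SHIFT=+1) places x(i+1) in position i and pads the
! end with zero; SHIFT=-1 places x(i-1) in position i.
      y = d*x + c*EOSHIFT(x,SHIFT=+1) + b*EOSHIFT(x,SHIFT=-1)
! Build the equivalent dense matrix for comparison.
      full = 0d0
      do i=1, n
         full(i,i) = d(i)
         if (i < n) full(i,i+1) = c(i)
         if (i > 1) full(i,i-1) = b(i)
      end do
      if (maxval(abs(y - matmul(full,x))) <= epsilon(1d0)) then
         write (*,*) 'EOSHIFT product matches the dense product.'
      end if
      end program tri_matvec_sketch
```

With rank-2 arrays and DIM=1, the same expression applies the product to every column at once, which is how the example handles n systems in one array operation.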
259. forth in subparagraph (c)(1)(ii) of the Rights in Technical Data and Computer Software clause at DFAR 252.227-7013, and in subparagraphs (a) through (d) of the Commercial Computer Software - Restricted Rights clause at FAR 52.227-19, and in similar clauses in the NASA FAR Supplement, when applicable. Contractor/Manufacturer is Visual Numerics, Inc., 9990 Richmond Avenue, Suite 400, Houston, Texas 77042.

Fortran and C IMSL Application Development Tools
Fortran 90 Subroutines and Functions
Fortran 90 MP Library User's Guide, with MPI-Enhanced Subroutines and Functions for Distributed Scientific Applications

Version   Revision History                                          Year   Part #
2.0       Original Issue                                            1994   5351
3.0       Fixed bugs, added significant changes to functionality    1996   3743
4.0       Added two new chapters, each adding major functionality   1998   7959
          - ScaLAPACK Utilities and Large-Scale Solvers, plus seven examples
          - Partial Differential Equations, plus nine examples

Bug Fixes and Improvements
   - Repairs were made in the parallel error processing suite, described in Chapter 9 of the library.
   - Significant performance improvements were made in the real arithmetic versions of the linear algebra codes lin_sol_gen, lin_sol_self, lin_eig_gen, lin_sol_svd, and lin_svd.

Contents
Introduction
Chapter 1: Linear Solvers
Chapter 2: Singular Value and Eigenvalue Decomposition
260. g the value MPI_NODE_PRIORITY(I) < 0. This means that node MPI_NODE_PRIORITY(I) will be sent the task schedule but will not perform any significant work as part of box data type function evaluations.

• The LOGICAL flag MPI_ROOT_WORKS designates whether or not the root node participates in the major computation of the tasks. The root node communicates with the other nodes to complete the tasks but can be designated to do no other work. Since there may be only one processor, this flag has the default value .TRUE., assuring that one node exists to do work. When more than one processor is available, users can consider assigning MPI_ROOT_WORKS = .FALSE. This is desirable when the alternate nodes have equal or greater computational resources compared with the root node. Example 4 illustrates this usage. A single problem is given a box data type, with one rack. The computing is done at the node, other than the root, with highest priority. This example requires more than one processor since the root does not work.

When the generic function MP_SETUP(N) is called, where N is a positive integer, a call to MP_SETUP() is first made, using no argument. Use just one of these calls to MP_SETUP(). This initializes the MPI system and the other parameters described above. The array MPI_NODE_PRIORITY(:) is allocated with size MP_NPROCS
261. g variables do not have to be precomputed following each entry to routine fast_dft.

      use fast_dft_int
      use rand_gen_int
      implicit none
! This is Example 3 for FAST_DFT.
! The value of the array size for work is computed in the
! routine fast_dft as a first step.
      integer, parameter :: n=64
      integer ido_value
      real(kind(1e0)) :: one=1e0
      real(kind(1e0)) err, y(2*n)
      complex(kind(1e0)), dimension(n) :: a, b, save_a
      complex(kind(1e0)), allocatable :: work(:)
! Generate a random complex array.
      call rand_gen(y)
      a = cmplx(y(1:n),y(n+1:2*n),kind(one)); save_a = a
! Transform and then invert the sequence using the pre-computed
! working values.
      ido_value = 0
      do
         if (allocated(work)) deallocate(work)
! Allocate the space required for work.
         if (ido_value <= 0) allocate(work(-ido_value))
         call c_fast_dft(forward_in=a, forward_out=b, &
                 ido=ido_value, work_array=work)
         if (ido_value == 1) exit
      end do
! Re-enter routine with working values available in work.
      call c_fast_dft(inverse_in=b, inverse_out=a, &
              ido=ido_value, work_array=work)
! Deallocate the space used for work.
      if (allocated(work)) deallocate(work)
! Check the results.
      err = maxval(abs(save_a-a))/maxval(abs(save_a))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 3 for FAST_DFT is correct.'
      end if
      end

Example
262. genvectors do i l 2 y_t lin 1 k y_t l n 1 k epsilon s_one call linvsol tri cct dite bat yoty amp Lopt iopt iopt nopt 1 s_options s_lin_sol_tri_solve_only s_zero end do Orthogonalize the eigenvectors This is the most intensive part of the computing do j 1 k 1 Forward sweep of HMGS orthogonalization temp s_one sqrt sum y_t 1 n 4 2 y_t lin j3 y_t 1 n j temp y_t lin j 1 k lin j 1 k amp spread matmul 1 y_t y_t 1lin j y_t lin jt 1 k amp 40 Chapter 1 Linear Solvers IMSL Fortran 90 MP Library 4 0 spread y_t lin j DIM 2 NCOPI end do DIM 1 NCOPIE ES k 4 temp s_one sqrt sum y_t 1 n k 2 y_t 1l n k y_t 1 n k temp do j k 1 1 1 y_t lin j 1 k y_t lin j 1 spread matmul y_t 1 n j y spread y_t 1 n j DIM 2 NCOF end do t Plt Backward sweep of HMGS k amp lin jt 1l k amp DIM 1 NCOPIE ES k See if the performance ratio is smaller than the value one If it is not the code will re solve the systems using Gaussian Elimination This is an exceptional event It is a necessary complication for achieving reliable results res l n 1 k spread d DIM 2 NCOPIES k y_t l in 1 k amp spread b DIM 2 NCOPIES k amp EOSHIFT y_t 1 n 1 k SHIFT 1 DIM 1 amp EOSHIFT spread b DIM 2 NCOPIES k y_t 1 n 1 k SHIFT 1 amp y_t 1 n 1 k spread EVAL_T 1 k
264. gr 1 delta sizev amp x nvalues y nvalues values nvalues nvalues amp f_00 f_x00 f_y00 real kind 1d0 pointer pointer_bkpt type d_spline_knots knotsx knotsy type d_surface_constraints C NC LOGICAL PASS Generate random x y pairs and evaluate the xampl xponential function at these values spline_data 1 2 two rand spline_data 1 2 spline_data 3 exp sum spline_data 1 2 2 dim 1 spline_data 4 one Define the knots for the tensor product data fitting problem delta two ngrid 1 bkpt l ndegree zero bkpt nbkpt ndegreet l nbkpt two bkpt nord nbkpt ndegree i delta i 0 ngrid 1 Assign the degr of the polynomial and the knots pointer_bkpt gt bkpt knotsx d_spline_knots ndegree pointer_bkpt knotsy knotsx Define the constraints for the fitted surface C 1 surface_constraints point zero zero type value one C 2 surface_constraints derivative 1 0 amp IMSL Fortran 90 MP Library 4 0 Chapter 4 Curve and Surface Fitting with Splines 119 point zero zero type value zero C 3 surface_constraints derivative 0 1 amp point zero zero type value zero Fit the data and obtain the coefficients coeff surface_fitting spline_data knotsx knotsy amp CONSTRAINTS C Evaluate the residual spline function at a grid of points inside the square delta two nvalues 1 x i
265. gramne pie ewe a rando cechus le pi DCONST pi A onet rand A Generate random latitude longitude pairs and evaluate the surface parameters at these points spline_data 1 2 1 pi two rand spline_data 1 2 1 one S pilbbnemclaitsay e229 SO lenea e lam ale SpileinemedatsarGles2y us i SD l hmesdat lieu els Evaluate x y zZ parametric points spline_data 3 1 A cos spline_data 1 1 cos spline_data 2 1 spline_data 3 2 A cos spline_data 1 2 sin spline_data 2 2 spline_data 3 3 A sin spline_data 1 3 The values ar qually uncertain spline_data 4 one IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 227 Define the knots for the tensor product data fitting problem delta two pi ngrid 1 bkpt l ndegree pi bkpt nbkpt ndegreetl nbkpt pi bkpt nord nbkpt ndegree piti delta i 0 ngrid 1 Assign the degr of the polynomial and the knots pointer_bkpt gt bkpt knotsx d_spline_knots ndegree pointer_bkpt knotsy knotsx Fit a data surface for each coordinate Set default regularization parameters to zero and compute residuals of the individual points These are returned in DATA 4 allocate C 2 ngrid Sew the ends of the parametric surfaces together do i 0 ngrid 1l C it1l surface_constraints point pi piti delta amp type periodic pi piti delta end do
266. grid.
      y(1:m,1) = exp(x) + cos(pi_over_2*x)
! Fill in the least-squares matrix for the Chebyshev polynomials.
      A(:,0) = one; A(:,1) = x
      do i=2, n
         A(:,i) = 2*x*A(:,i-1) - A(:,i-2)
      end do
! Solve for the series coefficients.
      call lin_sol_lsq(A, y, c)
! Generate an equally spaced grid on the interval.
      delta_x = 2/real(m-1,kind(one))
      do i=1, m
         x(i) = -one + (i-1)*delta_x
      end do
! Evaluate residuals using backward recurrence formulas.
      u = zero
      v = zero
      do i=n, 0, -1
         w = 2*x*u - v + c(i,1)
         v = u
         u = w
      end do
      y(1:m,1) = exp(x) + cos(pi_over_2*x) - (u-x*v)
! Check that n+1 sign changes in the residual curve occur.
      x = one
      x = sign(x,y(1:m,1))
      if (count(x(1:m-1) /= x(2:m)) >= n+1) then
         write (*,*) 'Example 1 for LIN_SOL_LSQ is correct.'
      end if
      end

Optional Arguments

MROWS = m (Input)
   Uses array A(1:m, 1:n) for the input matrix.
   Default: m = size(A, 1)
NCOLS = n (Input)
   Uses array A(1:m, 1:n) for the input matrix.
   Default: n = size(A, 2)
NRHS = nb (Input)
   Uses the array b(1:, 1:nb) for the input right-hand side matrix.
   Default: nb = size(b, 2)
   Note that b must be a rank-2 array.
pivots = pivots(:) (Output [Input])
   Integer array of size 2 * min(m, n) + 1 that contains the individual row followed by the column interchanges. The last array entry contains the approximate rank of A.
trans = trans(:) (Output [Input])
   Array of size
267. h A and B 202 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 C rand C D rand D A cC h C B D hx Dy B B h B 2 ALPHA EIG A B B W V Check that residuals are small Use a real array for alpha since th igenvalues are known to be real err norm A x V B x V x diag alpha 1 amp norm A 1 norm B 1 norm alpha 1 if err lt sqrt epsilon one then write Example 2 for LIN_GEIG_GEN operators is correct end if end Operator_ex35 use rand_int use eig_int use isnan_int use norm_int use lin_sol_lsq_int implicit none This is Example 3 using operators for LIN_GEIG_GEN integer parameter n 6 real kind 1d0 parameter one 1d0 zero 0d0 real kind 1d0 dimension n n A B d_beta n complex kind 1d0 alpha n Generate random matrices for both A and B A rand A B rand B Make columns of A and B zero so both are singular A l n n 0 B l n n 0 Set the option a larger tolerance than default for lin_sol_lisq Skip showing any error messages allocate d_eig_options 6 d_eig_options 1l skip_error_processing d_eig_options 2 options_for_lin_geig_gen d_eig_options 3 3 d_eig_options 4 d_lin_geig_gen_for_lin_sol_lsq d_eig_options 5 1 d_eig_options 6 d_options d_lin_sol_lsq_set_small amp sqrt epsilon one norm B
268. h that isNaN(a(i,j)) or isNaN(b(i,j)) == .true. See the isNaN() function, Chapter 6.
   Default: Does not scan for NaNs.

iopt(IO) = ?_options(?_lin_sol_gen_no_sing_mess, ?_dummy)
   Do not print an error message when the matrix A is singular.

iopt(IO) = ?_options(?_lin_sol_gen_A_is_sparse, ?_dummy)
   Uses an indirect updating loop for the LU factorization that is efficient for sparse matrices where all matrix entries are stored.

Description

The lin_sol_gen routine solves a system of linear algebraic equations with a nonsingular coefficient matrix A. It first computes the LU factorization of A with partial pivoting such that LU = A. The matrix U is upper triangular, while the following is true:

\[ L^{-1}A \equiv L_n P_n L_{n-1} P_{n-1} \cdots L_1 P_1 A \equiv U \]

The factors P_i and L_i are defined by the partial pivoting. Each P_i is an interchange of row i with row j >= i. Thus, P_i is defined by that value of j. Every

\[ L_i = I + m_i e_i^{T} \]

is an elementary elimination matrix. The vector m_i is zero in entries 1, ..., i. This vector is stored as column i in the strictly lower-triangular part of the working array containing the decomposition information. The reciprocals of the diagonals of the matrix U are saved in the diagonal of the working array. The solution of the linear system Ax = b is found by solving two simpler systems,

\[ y = L^{-1}b \quad\text{and}\quad x = U^{-1}y . \]

More mathematical details are found in Golub and Van Loan (1989, Chapter 3).
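The factor-then-solve sequence described for lin_sol_gen can be sketched in plain Fortran 90. This is only a minimal illustration under stated assumptions, not the library's implementation: the program name lu_sketch, the 3 x 3 test system, and the convention of interchanging full rows (so that P A = L U with a single permutation) are choices made for this example. Unlike the working array described in the text, the sketch keeps the diagonal of U itself and not its reciprocals, and it does not test for a zero pivot.

```fortran
program lu_sketch
  ! Illustrative sketch only: LU factorization with partial pivoting,
  ! followed by forward and back substitution.
  ! This is NOT the lin_sol_gen implementation.
  implicit none
  integer, parameter :: dp = kind(1d0), n = 3
  real(dp) :: a(n,n), lu(n,n), b(n), x(n), y(n), row(n), tmp
  integer :: piv(n-1), jloc(1), i, j, k

  ! Hypothetical test system, filled column by column.
  ! Its exact solution is x = (1, 1, 2).
  a = reshape((/ 2d0, 4d0, -2d0,  1d0, -6d0, 7d0,  1d0, 0d0, 2d0 /), &
              (/ n, n /))
  b = (/ 5d0, -2d0, 9d0 /)

  lu = a
  do i = 1, n-1
     ! P_i: interchange row i with the row j >= i holding the
     ! entry of largest magnitude in column i.
     jloc = maxloc(abs(lu(i:n,i)))
     j = i - 1 + jloc(1)
     piv(i) = j
     if (j /= i) then
        row = lu(i,:); lu(i,:) = lu(j,:); lu(j,:) = row
     end if
     ! The multipliers are stored as column i of the strictly
     ! lower-triangular part of the working array.
     lu(i+1:n,i) = lu(i+1:n,i) / lu(i,i)
     do k = i+1, n
        lu(i+1:n,k) = lu(i+1:n,k) - lu(i+1:n,i)*lu(i,k)
     end do
  end do

  ! Apply the recorded interchanges to the right-hand side.
  y = b
  do i = 1, n-1
     tmp = y(i); y(i) = y(piv(i)); y(piv(i)) = tmp
  end do

  ! Forward substitution with the unit lower-triangular factor.
  do i = 2, n
     y(i) = y(i) - dot_product(lu(i,1:i-1), y(1:i-1))
  end do

  ! Back substitution with the upper-triangular factor U.
  do i = n, 1, -1
     x(i) = (y(i) - dot_product(lu(i,i+1:n), x(i+1:n))) / lu(i,i)
  end do

  write (*,*) 'x =', x
  write (*,*) 'max residual =', maxval(abs(matmul(a, x) - b))
end program lu_sketch
```

Storing the multipliers in the strictly lower triangle and U in the upper triangle mirrors the single working array described in the text, which is why the factorization can be saved and reused for further right-hand sides.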
269. h the same vector.

INDEX(1:N) (Output)
   Assumed-size array of length N containing the NSETP indices of columns in the positive solution, and the remainder that are at their constraint. The number of positive components in the solution X is given by the Fortran intrinsic function value, NSETP = COUNT(X > 0). All processors exit with the same array.

IPART(1:2,1:max(1,MP_NPROCS)) (Input)
   Assumed-size array containing the partitioning describing the matrix A. The value MP_NPROCS is the number of processors in the communicator, except when MPI has been finalized with a call to the routine MP_SETUP('Final'). This causes MP_NPROCS to be assigned 0. Normally users will give the partitioning to processor of rank = MP_RANK by setting IPART(1,MP_RANK+1) = first column index, and IPART(2,MP_RANK+1) = last column index. The number of columns per node is typically based on their relative computing power. To avoid a node with rank MP_RANK doing any work except communication, set IPART(1,MP_RANK+1) = 0 and IPART(2,MP_RANK+1) = -1. In this exceptional case there is no reference to the array A(:,:) at that node.

Optional Argument

IOPT(:) (Input)
   Assumed-size array of derived type S_OPTIONS or D_OPTIONS. This argument is used to change internal parameters of the algorithm. Normally users will not be concerned about this argument, so they would not include it in the argument list for the routine.

Packaged Options for PARALLEL
270. he Parallel Option 211 See to any eror messages and quit MPI mp_nprocs mp_setup Final end Parallel Example 7 In this example alternate nodes are used for computing with the EIG function Inverse iteration is used to obtain eigenvectors for the second most dominant eigenvalue for each rack of the box The factorization and solving steps for the eigenvectors are executed only at the root node use linear_operators use mpi_setup_int implicit none This is Parallel Example 7 for box data types operators l amne se byAVCIE LONS o integer tries nrack integer parameter integer ipivots n 1 m 8 n 4 k 2 nr 4 real kind 1d0 one 1D0 err nr E n nr real kind 1d0 dimension m n nr S real kind 1d0 dimension n n nr A ATEMP real kind 1d0 dimension n 1 nr DPR type d_options iopti 4 logical dimension nr results_are_true Serto rowr WIP IES mp_nprocs mp_setup Generate a random rectangular matrix if mp_rank 0 C rand C Generate a random right hand side for use in the inverse iteration if mp_rank 0 b rand b Compute a positive definite matrix AIS CITE E ASNA che eatery An 2 Obtain just the eigenvalues E EIG A ATEMP A Compute A eigenvalue I as the coefficient matrix Us number k igenvalu do nrack 1 nr IF MP_RANK gt 0 EXIT 212 Chapter 6 Operators and Generic Functions Th
271. he appropriate module, fft_box_int or linear_operators.

Optional Variables, Reserved Names

The optional argument is WORK, a COMPLEX array of the same precision as the data. For rank-1 transforms the size of WORK is n + 15. To define this array for each problem, set WORK(1) = 0. Each additional rank adds the dimension of the transform plus 15. Using the optional argument WORK increases the efficiency of the transform. This function uses routines fast_dft, fast_2dft, and fast_3dft from Chapter 3.

The option and derived type names are given in the following tables:

Option Name for FFT       Option Value
options_for_fast_dft      1

Derived Type    Name of Unallocated Array
s_options       s_fft_box_options(:)
s_options       s_fft_box_options_once(:)
d_options       d_fft_box_options(:)
d_options       d_fft_box_options_once(:)

Example

Compute the DFT of a random complex array.

      x = rand(x); y = fft_box(x)

IFFT

The inverse of the Discrete Fourier Transform of a complex sequence.

Required Argument

The function requires one argument, x. If x is an assumed-shape complex array of rank 1, 2, or 3, the result is the complex array of the same shape and rank consisting of the inverse DFT.

Modules

Use the appropriate module, ifft_int or linear_operators.

Optional Variables, Reserved Names

The optional argument is WORK, a COMPLE
272. he probability density, &
 pdf(x) = (cos(x)+1)/(2*pi), -pi < x < pi
      call show(x_30)
      end

Fatal and Terminal Error Messages

See the messages.gls file for error messages for rand_gen. These error messages are numbered 521-528; 541-548.

sort_real

Sorts a rank-1 array of real numbers x so the y results are algebraically nondecreasing, y_1 <= y_2 <= ... <= y_n.

Required Arguments

x (Input)
   Rank-1 array containing the numbers to be sorted.
y (Output)
   Rank-1 array containing the sorted numbers.

Example 1: Sorting an Array

An array of random numbers is obtained. The values are sorted so they are nondecreasing.

      use sort_real_int
      use rand_gen_int
      implicit none
! This is Example 1 for SORT_REAL.
      integer, parameter :: n=100
      real(kind(1e0)), dimension(n) :: x, y
! Generate random data to sort.
      call rand_gen(x)
! Sort the data so it is non-decreasing.
      call sort_real(x, y)
! Check that the sorted array is not decreasing.
      if (count(y(1:n-1) > y(2:n)) == 0) then
         write (*,*) 'Example 1 for SORT_REAL is correct.'
      end if
      end

Optional Arguments

nsize = n (Input)
   Uses the sub-array of size n for the numbers.
   Default value: n = size(x)
iperm = iperm (Input/Output)
   Applies interchanges of elements that occur to the entries of iperm(:). If the values iperm(i) = i, i = 1, n are assigned prior to call, then the output array is moved to i
273. he regularizing parameter multiplying the squared integral of the unknown function The argument _value is replaced by the default value The default is _value 0 iopt IO _options amp surface_fitting_flatness _value This resets the square root of the regularizing parameter multiplying the squared integral of the partial derivatives of the unknown function The argument _value is replaced by the default value The default is _value sqrt epsilon _value size where size YIdata 3 data 4 V ndata 1 iopt IO _options surface_fitting_tol_equal _value This resets the value for determining that equality constraint equations are rank deficient The default is _value 1074 iopt IO _options IMSL Fortran 90 MP Library 4 0 Chapter 4 Curve and Surface Fitting with Splines 115 surface_fitting_tol_least _value This resets the value for determining that least squares equations are rank deficient The default is 2_value 107 iopt IO _options surface_fitting_residuals dummy This option returns the residuals surface data in data 4 That row of the array is overwritten by the residuals The data is returned in the order of cell processing order or left to right in x and then increasing in y The allocation of a temporary for data 1 4 is avoided which may be desirable for problems with large amounts of data The default is to not evaluate the residuals and to le
274. he right-hand side of the equal sign contains the inverse matrix. See Example 2 in Chapter 1, "Linear Solvers," of lin_sol_gen for an example of computing the inverse matrix.

Each of the primary routines has arguments epack= and iopt=. As noted, the epack= argument is of derived type s_error or d_error. The prefix s_ or d_ is chosen depending on the precision of the data type for that routine. The optional argument iopt= is part of the interface to each routine, and its use is to modify internal algorithm choices or other parameters.

Optional Data

This additional optional argument is further distinguished: it is a derived type array that contains a number of parameters to modify the internal algorithm of a routine. This derived type has the name ?_options, where "?_" is either "s_" or "d_". The choice depends on the precision of the data type. The declaration of this derived type is packaged within the modules for each generic suite of codes. The definition of the derived types is:

      type ?_options
         integer idummy; real(kind(?)) rdummy
      end type

where the "?_" is either "s_" or "d_", and the kind value matches the desired data type indicated by the choice of "s" or "d".

Example 3 in Chapter 1, "Linear Solvers," of lin_sol_gen illustrates the use of iterative refinement to compute a double-precision solution based on a single precis
275. he size of the change to each e_i due to changing the matrix data. The reciprocal |u_i^* v_i|^{-1} is defined as the condition number of e_i. Also, see operator_ex32, Chapter 6.

      use lin_eig_gen_int
      use rand_gen_int
      implicit none
! This is Example 4 for LIN_EIG_GEN.
      integer i
      integer, parameter :: n=17
      real(kind(1d0)), parameter :: one=1d0
      real(kind(1d0)) a(n,n), c(n,n), variation(n), y(n*n), temp(n), &
              norm_of_a, eta
      complex(kind(1d0)), dimension(n,n) :: e(n), d(n), u, v
! Generate a random matrix.
      call rand_gen(y)
      a = reshape(y,(/n,n/))
! Compute the eigenvalues, left- and right-eigenvectors.
      call lin_eig_gen(a, e, v=v, v_adj=u)
! Compute condition numbers and variations of eigenvalues.
      norm_of_a = sqrt(sum(a**2)/n)
      do i=1, n
         variation(i) = norm_of_a/abs(dot_product(u(1:n,i), &
                                v(1:n,i)))
      end do
! Now perturb the data in the matrix by the relative factors
! eta=sqrt(epsilon) and solve for values again. Check the
! differences compared to the estimates. They should not exceed
! the bounds.
      eta = sqrt(epsilon(one))
      do i=1, n
         call rand_gen(temp)
         c(1:n,i) = a(1:n,i) + (2*temp - 1)*eta*a(1:n,i)
      end do
      call lin_eig_gen(c, d)
! Looking at the differences of absolute values accounts for
! switching signs on the imaginary parts.
      if (count(abs(d)-abs(e) > eta*variation) == 0) then
         write (*,*) 'Example 4 for LIN_EIG_GEN is correct.'
      end if
      end
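The estimate that this example tests can be written out explicitly. The following is the standard first-order perturbation result for a simple eigenvalue, offered as a reading of the code and not as the library's own derivation. It assumes each eigenvalue e_i is simple and that the eigenvector columns u_i and v_i are scaled to unit 2-norm, which the example text does not state.

```latex
% A v_i = e_i v_i  (right eigenvector),   u_i^* A = e_i u_i^*  (left eigenvector)
\[
  e_i(A + \delta A) - e_i
    = \frac{u_i^{*}\,\delta A\,v_i}{u_i^{*} v_i} + O\!\left(\|\delta A\|^{2}\right),
  \qquad\text{so}\qquad
  |\delta e_i| \;\lesssim\; \frac{\|\delta A\|_2}{|u_i^{*} v_i|}
  \quad\text{when } \|u_i\|_2 = \|v_i\|_2 = 1 .
\]
% Perturbation used in the example: |\delta a_{jk}| \le \eta\,|a_{jk}|, \ \eta = \sqrt{\epsilon}
\[
  \|\delta A\|_2 \;\le\; \|\delta A\|_F \;\le\; \eta\,\|A\|_F .
\]
```

The example compares the observed changes against eta*variation(i), where variation(i) divides a norm of A by |u_i^* v_i|. As printed, norm_of_a appears to be the Frobenius norm scaled by 1/sqrt(n), which would make the test an estimate of typical size, not a strict upper bound.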
276. he starting independent variable value TO This routine can also provide a non uniform grid at the initial value SUBROUTINE initial_conditions NPDE N U Integer NPDE N REAL kind T0 U END SUBROUTINE Optional Update the grid of values in array locations U NPDE 1 j j 2 N 1 This grid is input equally spaced but can be updated as desired provided the values are increasing Required Provide initial values UC j j be for all components of the system at the grid of values U NPDE 1 j 7 1 N Tf the optional step of updating the initial grid is performed then the initial values are evaluated at the updated grid pde_system_definition Input The name of an external subroutine written by the user when using forward communication It gives the differential equation as expressed in Equation 2 272 Chapter 8 Partial Differential Equations IMSL Fortran 90 MP Library 4 0 SUBROUTINE pde_system_definition amp t x NPDE u dudx c q r IRES Integer NPDE IRES REAL kind T0 t x u dudx REAL kind T0 c Q r END SUBROUTINE Evaluate the terms of the system of Equation 2 A default value of m 0 is assumed but this can be changed to one of the other choices m 1 or 2 Use the optional argument IOPT for that purpose Put the values in the arrays as indicated c j k C x t u uy r j r x t u uy qi q x t u
277. hen
         write (*,*) 'Example 2 for LIN_SOL_LSQ is correct.'
      end if
      end

Example 3: Two-Dimensional Data Fitting

This example illustrates the use of radial-basis functions to least-squares fit arbitrarily spaced data points. Let m data values {y_i} be given at points in the unit square, {p_i}. Each p_i is a pair of real values. Then, n points {q_j} are chosen on the unit square. A series of radial-basis functions is used to represent the data,

\[ f(p) = \sum_{j=1}^{n} c_j \left( \|p - q_j\|^2 + \delta^2 \right)^{1/2} \]

where \delta^2 is a parameter. This example uses \delta^2 = 1, but either larger or smaller values can give a better approximation for user problems. The coefficients {c_j} are obtained by solving the following m x n linear least-squares problem:

\[ f(p_i) = y_i, \quad i = 1, \ldots, m \]

This example illustrates an effective use of Fortran 90 array operations to eliminate many details required to build the matrix and right-hand side for the {c_j}. For this example, the two sets of points {p_i} and {q_j} are chosen randomly. The values {y_i} are computed from the following formula:

\[ y_i = e^{-\|p_i\|^2} \]

The residual function

\[ r(p) = e^{-\|p\|^2} - f(p) \]

is computed at an N x N square grid of equally spaced points on the unit square. The magnitude of r(p) may be larger at certain points on this grid than the residuals at the given points, {p_i}. Also, see operator_ex11, Chapter 6.

      use lin_sol_lsq_int
      use rand_gen_int
      implicit none
! This is Example 3 for
278. his is Example 3 using operators for LIN_SOL_SELF integer tries integer parameter m 8 n 4 k 2 integer ipivots n 1 real kind 1d0 one 1 0d0 err real kind 1d0 a n n b n 1 c m n x n 1 amp e n ATEMP n n type d_options iopti 4 Generate a random rectangular matrix C rand C Generate a random right hand side for use in the invers iteration b rand b Compute the positive definite matrix A C tx C A At t A 2 Obtain just the eigenvalues E EIG A Use packaged option to reset the value of a small diagonal iopti 4 0 iopti 1l d_options d_lin_sol_self_set_small amp epsilon one abs E 1 Use packaged option to save the factorization iopti 2 d_lin_sol_self_save_factors Suppress error messages and stopping due to singularity of the matrix which is expected iopti 3 d_lin_sol_self_no_sing_mess IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 177 ATEMP A Compute A eigenvalue I as the coefficient matrix Us igenvalue number k A A e k EYE n do tries 1 2 call lin_sol_self A b x amp pivots ipivots iopt iopti When code is r ntered the already computed factorization is used iopti 4 d_lin_sol_self_solve_A Reset right hand side in the direction of the eigenvector B UNIT x end do Normalize the eigenvector x
279. his product. This file is read by error_post using the contents of the derived-type argument epack, containing the message number, error severity level, and associated data. The message is converted into character strings accepted by the error processor and then printed. The number of pending messages that print depends on the settings of the parameters PRINT and STOP (IMSL MATH/LIBRARY User's Manual, IMSL 1994, pp. 1194-1195). These values are initialized to defaults such that any Level 5 or Level 4 message causes a STOP within the error processor after a print of the text. To change these defaults so that more than one error message prints, use the routine ERSET, documented and illustrated with examples in IMSL MATH/LIBRARY User's Manual (IMSL 1994, pp. 1196-1198). The method of using a message file to store the messages is required to support shared-memory parallelism.

Managing the Message File

For most applications of this product, there will be no need to manage this file. However, there are a few situations which may require changing or adding messages:

• New system-wide messages have been developed for applications using the IMSL Fortran 90 MP Library.
• All or some of the existing messages need to be translated to another language.
• A subset of users need to add a specific message file for their applications using the IMSL Fortran 90 MP Library.

Following is
280. hod correction to solve the secular equations for lamda.
         lamda = lamda + delta_lamda
         if (sum(abs(delta_lamda)) <= &
             sqrt(epsilon(one))*sum(lamda)) &
             exit solve_for_lamda
! This is intended to fix up negative solution approximations.
         call rand_gen(rand)
         where (lamda < 0) lamda = s(1)*rand
      end do solve_for_lamda
! Compute solutions and check lengths.
      x = matmul(v,t_g/(spread(s_sq,dim=2,ncopies=k) + &
                        spread(lamda,dim=1,ncopies=n)))
      err = sum(abs(sum(x**2,dim=1) - alpha**2))/sum(abs(alpha**2))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 2 for LIN_SVD is correct.'
      end if
      end

Example 3: Generalized Singular Value Decomposition

The n x n matrices A and B are expanded in a Generalized Singular Value Decomposition (GSVD). Two n x n orthogonal matrices, U and V, and a nonsingular matrix X are computed such that

\[ AX = U\,\mathrm{diag}(c_1, \ldots, c_n) \quad\text{and}\quad BX = V\,\mathrm{diag}(s_1, \ldots, s_n) . \]

The values s_i and c_i are normalized so that

\[ s_i^2 + c_i^2 = 1 . \]

The c_i are nonincreasing and the s_i are nondecreasing. See Golub and Van Loan (1989, Chapter 8) for more details. Our method is based on computing three SVDs as opposed to the QR decomposition and two SVDs outlined in Golub and Van Loan. As a bonus, an SVD of the matrix X is obtained, and you can use this information to answer further questions about its conditi
281. i 1 A i 2 end do Compute the generalized inverse of the least squares matrix Compute the series coefficients using the generalized invers as smoothing formulas inv i A c inv x y Evaluate residuals using backward recurrence formulas u zero v zero do i n 0 1 w 2 x u v c i v au u w end do Compute residuals at the grid y exp x cos pi_over_2 x u x v Check that n 2 sign changes in the residual curve occur This test will fail when n is larger x one x sign x y if count x l m 1 x 2 m n 2 then write Example 2 for LIN_SOL_LSQ operators is correct end if end Operator_ex11 use operation_ix use operation_tx use operation_x use rand_int use norm_int implicit none This is Example 3 for LIN_SOL_LSQ using operators and functions integer i j integer parameter m 128 n 32 k 2 n_eval 16 real kind 1d0 parameter one 1d0 delta_sqr 1d0 real kind 1d0 A m n b m c n p k m q k n amp res n_eval n_eval w n_eval delta Generate a random set of data and center points in k 2 space p rand p q rand q IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 181 Compute the coefficient matrix for the least squares system A sqrt sum spread p 3 n spread q 2 m 2 dim 1 delta_sqr Compute the right hand side of functio
282. i(4) = d_options(z_lin_sol_self_no_pivoting,zero)
      call lin_geig_gen(a, b, alpha, beta, v=v, &
              iopt=iopti)
! Check that residuals are small. Use the real part of alpha
! since the values are known to be real.
      err = sum(abs(matmul(a,v) - matmul(b,v)* &
            spread(real(alpha,kind(one))/beta,dim=1,ncopies=n)))/ &
            sum(abs(a)+abs(b))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 2 for LIN_GEIG_GEN is correct.'
      end if
      end

Example 3: A Test for a Regular Matrix Pencil

In the classification of Differential Algebraic Equations (DAE), a system with linear constant coefficients is given by

\[ A\dot{x} + Bx = f . \]

Here A and B are n x n matrices, and f is an n-vector that is not part of this example. The DAE system is defined as solvable if and only if the quantity \det(\mu A + B) does not vanish identically as a function of the dummy parameter \mu. A sufficient condition for solvability is that the generalized eigenvalue problem Av = \lambda Bv is nonsingular. By constructing A and B so that both are singular, the routine flags nonsolvability in the DAE by returning NaN for the generalized eigenvalues. Also, see operator_ex35, Chapter 6.

      use lin_geig_gen_int
      use rand_gen_int
      use error_option_packet
      use isnan_int
      implicit none
! This is Example 3 for LIN_GEIG_GEN.
      integer, parameter :: n=6
      real(ki
283. i colurcion a zer y d_zero change_old huge one Use packaged option to save the factorization iopti 1l s_lin_sol_self_save_factors 214 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 TOP CS2 h zero ITERATIVE_REFINEMENT DO Geis sp 8 Cr bsi 58 Aleit 85 3 E 0 s7Garril simria 2 9 Sidwell ginny 8 43 ID sites sy lsu 858 if mp_rank 0 then do nrack 1 nr carii dim Sol Seu QE p Spica 5 amp Glipa pica Ins sy aake Divo CS noi VOCS LOCE rO enddo Sf ik oP SY endif change_new norm h All processo ca p EN ERROR Moaste winen GA NAE eh Use option t rs share the root s test for convergence 11 mpi_bcast change new nr MET REAL O7 MP_LIBRARY_WORLD anges are no longer decreasing ALL change_new gt change_old amp xit iterative_refinement ange_old change_new o re enter code with factorization saved solve only iopti 2 s_lin_sol_self_solve_A end do iterative_refinement if mp_rank write See to any e mp_nproc end 0 amp sits iweneenlikell japxeimjolle 8 ais COrracr rror message and quit MPI s mp_setup Final Parallel Example 9 This is a variation of Parallel Example 8 A single problem is converted to a box data type with one rack The use of the function call MP_SETUP M N a
284. i=i_left, i_right
            working(i) = i_bin
         end do
      end do
! Map the random numbers into the 'distribution' array.
! This is made approximately proportional to the histogram.
      do i=1, n_samples
         i_map = nint(rn(i)*(n_work-1)+1)
         distribution(working(i_map)) = &
             distribution(working(i_map)) + 1
      end do
! Check the agreement between the distribution of the
! generated random numbers and the original histogram.
      write (*, '(A)', advance='no') 'Original:  '
      write (*, '(10I6)') histogram*scale
      write (*, '(A)', advance='no') 'Generated: '
      write (*, '(10I6)') distribution
      if (maxval(abs(histogram(1:)*scale - distribution(1:))) &
             <= tolerance*n_samples) then
         write (*, '(A)') 'Example 3 for RAND_GEN is correct.'
      end if
! Generate 20 integers in 1, 10 according to the distribution
! induced by the histogram.
      call rand_gen(rn_20)
! Map from the uniform distribution to the induced distribution.
      do i=1, n_samples_20
         i_map = nint(rn_20(i)*(n_work-1)+1)
         rand_num_20(i) = working(i_map)
      end do
      call show(rand_num_20, &
        'Twenty integers generated according to the histogram:')
      end

Example 4: Generating with a Cosine Distribution

We generate random numbers based on the continuous distribution function

\[ p(x) = \frac{1 + \cos(x)}{2\pi}, \quad -\pi \le x < \pi . \]

Using the cumulative

\[ q(x) = \int_{-\pi}^{x} p(t)\,dt = \frac{1}{2} + \frac{x + \sin(x)}{2\pi} \]

we gener
285. iate complex-valued calculations. Note that the computation of the complex matrix X and the diagonal matrix D is performed using the IMSL MATH/LIBRARY FORTRAN 77 routine EVCRG. This is an illustration of combining parts of FORTRAN 77 and Fortran 90 code. The information is made available to the Fortran 90 compiler by using the FORTRAN 77 interface for EVCRG. Also, see operator_ex04, Chapter 6, where the Fortran 90 function EIG() has replaced the call to EVCRG.

      use lin_sol_gen_int
      use rand_gen_int
      use Numerical_Libraries
      implicit none
! This is Example 4 for LIN_SOL_GEN.
      integer, parameter :: n=32, k=128
      real(kind(1e0)), parameter :: one=1.0e0, t_max=1, &
              delta_t=t_max/(k-1)
      real(kind(1e0)) err, A(n,n), atemp(n,n), ytemp(n**2)
      real(kind(1e0)) t(k), y(n,k), y_prime(n,k)
      complex(kind(1e0)) EVAL(n), EVEC(n,n)
      complex(kind(1e0)) x(n,n), z_0(n,1), y_0(n,1), d(n)
      integer i
! Generate a random matrix in an F90 array.
      call rand_gen(ytemp)
      atemp = reshape(ytemp,(/n,n/))
! Assign data to an F77 array.
      A = atemp
! Use IMSL Numerical Libraries F77 subroutine for the
! eigenvalue-eigenvector calculation.
      CALL EVCRG(N, A, N, EVAL, EVEC, N)
! Generate a random initial value for the ODE system.
      call rand_gen(ytemp(1:n))
      y_0(1:n,1) = ytemp(1:n)
! Assign the eigenvalue-eigenvector data to F90 arrays.
      d = EVAL; x = EVEC
286. ich you jumped. Let's try it: click on the following green, underlined text: see error_post.

If you clicked on the green, underlined text in the example above, the section on error_post opened. To return to this page, click the button on the toolbar.

Visual Numerics, Inc.
Corporate Headquarters
1300 W. Sam Houston Pkwy., Ste. 150
Houston, Texas 77042-2444 USA
PHONE: 713-784-3131
FAX: 713-781-9260
e-mail: marketing@houston.vni.com

Visual Numerics S.A. de C.V.
Cerrada de Berna 3, Tercer Piso
Col. Juarez
Mexico D.F. C.P. 06600 MEXICO
PHONE: +52-5-514-9730 or 9628
FAX: +52-5-514-4873

Visual Numerics, Inc.
7/F, 510, Sect. 5
Chung Hsiao E. Road
Taipei, Taiwan 110 ROC
PHONE: +886-2-727-2255
FAX: +886-2-727-6798
e-mail: info@vni.com.tw

World Wide Web site: http://www.vni.com

Visual Numerics International Ltd.
New Tithe Court
23 Datchet Road
SLOUGH, Berkshire SL3 7LL
UNITED KINGDOM
PHONE: +44 (0) 1753 790600
FAX: +44 (0) 1753 790601
e-mail: info@vniuk.co.uk

Visual Numerics International GmbH
Zettachring 10
D-70567 Stuttgart GERMANY
PHONE: +49-711-13287-0
FAX: +49-711-13287-99
e-mail: vni@visual-numerics.de

Visual Numerics Korea, Inc.
HANSHIN BLDG., Room 801
136-1, MAPO-DONG, MAPO-GU
SEOUL, 121-050 KOREA SOUTH
PHONE: +82-2-3273-2632 or 2633
FAX: +82-2-3273-2634
e-mail: leevni@chollian.dacom.co.kr

Visual Numerics SARL
Tour Europe
33 Place des Corolles
F-92049 PARIS LA D
287. ide any useful information, it is necessary for |NTRIES| > 1. The value |NTRIES| = 1 is acceptable, but only one time sample and no standard deviation are obtained. Values of NTRIES > 0 result in the printing of results, as shown in Table A. The numbers in the table will vary depending on the machine and other factors that impact the performance of Fortran codes.

Benchmark of rand_gen (F90) and rnun (F77):
Date of benchmark, (Y, Mo, D, H, M, S): 1994 5 11 8 58 58
   1   3.6000E+00   3.2000E+00   Average
   2   4.8990E-01   4.0000E-01   St. Dev.
   3   1.8000E+01   1.6000E+01   Total Ticks
   4   1.0000E+04   1.0000E+04   Size
   5   5.0000E+00   5.0000E+00   Repeats
   6   5.0000E+01   5.0000E+01   Ticks per sec.

Benchmark of rand_gen (F90) and drnun (F77):
Date of benchmark, (Y, Mo, D, H, M, S): 1994 5 11 8 58 59
   1   2.8000E+00   3.2000E+00   Average
   2   4.0000E-01   4.0000E-01   St. Dev.
   3   1.4000E+01   1.6000E+01   Total Ticks
   4   1.0000E+04   1.0000E+04   Size
   5   5.0000E+00   5.0000E+00   Repeats
   6   5.0000E+01   5.0000E+01   Ticks per sec.

Table A. Benchmark Summary: rand_gen, rnun, drnun

If NTRIES < 0, the 6 x 2 functions return the tabular values shown, with |NTRIES| samples. No printing is performed with NTRIES < 0. To compute a related benchmark, such as the rate (random numbers per second) for single precision
288. ide the square.
      delta=two/(nvalues+1)
      x=(/(i*delta,i=1,nvalues)/); y=x

      values=exp(-spread(x**2,1,nvalues)-spread(y**2,2,nvalues))
      values=surface_values((/0,0/), x, y, knotsx, knotsy, coeff)- &
             values

! Compute the R.M.S. error:
      sizev=norm(pack(values, (values == values)))/nvalues

      if (sizev <= TOLERANCE) then
         write (*,*) 'Example 1 for SURFACE_FITTING is correct.'
      end if

      end

Optional Arguments

constraints = surface_constraints (Input)
   A rank-1 array of derived type ?_surface_constraints that defines constraints the tensor product spline is to satisfy.

covariance = G (Output)
   An assumed-shape rank-2 array of the same precision as the data. This output is the covariance matrix of the coefficients. It is optionally used to evaluate the square root of the variance function.

iopt = iopt(:) (Input/Output)
   Derived type array with the same precision as the input array; used for passing optional data to surface_fitting. The options are as follows:

Packaged Options for surface_fitting
   Prefix = None
   Option Name
   surface_fitting_smallness
   surface_fitting_flatness
   surface_fitting_tol_equal
   surface_fitting_tol_least
   surface_fitting_residuals
   surface_fitting_print
   surface_fitting_thinness

iopt(IO) = ?_options(surface_fitting_smallness, ?_value)
   This resets the square root of t
289. ilon(one))) then
         write (*,*) 'Example 2 for LIN_EIG_SELF (operators) is correct.'
      end if

      end

Operator_ex27

      use linear_operators

      implicit none

! This is Example 3 (using operators) for LIN_EIG_SELF.

      integer i
      integer, parameter :: n=64, k=08
      real(kind(1d0)), parameter :: one=1d0, zero=0d0
      real(kind(1d0)) err
      real(kind(1d0)), dimension(n,n) :: A, D(n), &
         res(n,k), v(n,k)

! Generate a random self-adjoint matrix.
      A = rand(A); A = A + .t.A

! Compute just the eigenvalues.
      D = EIG(A); V = rand(V)

! Ready options to skip error processing and reset
! tolerance for linear solver.
      allocate (d_invx_options(5))

      do i=1, k

! Use packaged option to reset the value of a small diagonal.
         d_invx_options(1) = skip_error_processing
         d_invx_options(2) = ix_options_for_lin_sol_gen
         d_invx_options(3) = 2
         d_invx_options(4) = d_options &
           (d_lin_sol_gen_set_small, epsilon(one)*abs(d(i)))
         d_invx_options(5) = d_lin_sol_gen_no_sing_mess

! Compute the eigenvectors with inverse iteration.
         V(1:,i) = (A - EYE(n)*d(i)) .ix. V(1:,i)
      end do

      deallocate (d_invx_options)

! Orthogonalize the eigenvectors.
      V = ORTH(V)

! Check the results for both orthogonality of vectors and small
! residuals.
      res(1:k,1:k) = (V .tx. V) - EYE(k)
      err = norm(res(1:k,1:k)); res = A .x. V
290. in this message. New messages added to the system-wide error message file should be placed at the end of the file. Message numbers 5000 through 10000 have been reserved for user-added messages. Currently, messages through 1400 are used by IMSL. Gaps in message number ranges are permitted; however, the message numbers must be in ascending order within the file. The message numbers used for each IMSL Fortran 90 MP Library subroutine are documented in this manual and in online help.

If existing messages are being edited or translated, make sure not to alter the message_number lines. This prevents conflicts with any new messages.gls file supplied with future versions of IMSL Fortran 90 MP Library.

Building a New Direct-access Message File

The prepmess executable must be available to complete the message changing process. For information on building the prepmess executable from prepmess.f, consult the installation guide for this product.

Once new messages have been placed in the messages.gls file, make a backup copy of the messages.daf file. Then remove messages.daf from the current directory. Now enter the following command:

      prepmess > prepmess_output

A new messages.daf file is created. Edit the prepmess_output file and look near the end of the file for the new error messages. The prepmess program processes each message through the error message system as a validity check.
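The numbering rules above (ascending order, gaps allowed, user-added messages in 5000 through 10000) can be checked before running prepmess. The following is a minimal sketch in standard Fortran 90, not part of the library. The program name, the helper functions is_ascending and in_user_range, and the sample numbers are all hypothetical, and the sketch assumes "ascending" means strictly ascending. It works on a list held in memory because the layout of the messages.gls file is not described here.

```fortran
program check_message_numbers
   implicit none

   ! Hypothetical message numbers, in the order they would appear in a
   ! messages file. These values are illustrative only.
   integer, parameter :: numbers(6) = (/ 670, 680, 720, 1400, 5000, 5001 /)

   ! Position of the first user-added entry in the list above (assumption).
   integer, parameter :: first_user = 5

   if (.not. is_ascending(numbers)) then
      write (*,*) 'Message numbers are not in ascending order.'
   else if (.not. in_user_range(numbers(first_user:))) then
      write (*,*) 'A user-added message lies outside 5000..10000.'
   else
      write (*,*) 'Message numbering looks consistent.'
   end if

contains

   ! True when every number is larger than its predecessor.
   ! Gaps between consecutive numbers are allowed.
   logical function is_ascending(list)
      integer, intent(in) :: list(:)
      integer :: n
      n = size(list)
      is_ascending = all(list(2:n) > list(1:n-1))
   end function is_ascending

   ! True when all numbers fall in the range reserved for user messages.
   logical function in_user_range(list)
      integer, intent(in) :: list(:)
      in_user_range = all(list >= 5000 .and. list <= 10000)
   end function in_user_range

end program check_message_numbers
```

With the sample list, both checks pass. Reordering two entries, or using a user number such as 4000, makes the program report the corresponding problem.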
291. in value one for points inside the circle with r=64.
      a = zero
      r = reshape((/(((i-n/2)**2 + (j-n/2)**2, i=1,n), &
                   j=1,n)/),(/n,n/))
      where (r <= (n/4)**2) a = one
      c = a

! Transform and then invert the sequence using the pre-computed
! working values.
      ido_value = 0
      do
         if (allocated(work)) deallocate(work)

! Allocate the space required for work(:).
         if (ido_value <= 0) allocate(work(-ido_value))

! Transform the image and then invert it back.
         call c_fast_2dft(forward_in=a, &
           forward_out=b, IDO=ido_value, work_array=work)
         if (ido_value == 1) exit
      end do
      call c_fast_2dft(inverse_in=b, &
        inverse_out=a, IDO=ido_value, work_array=work)

! Deallocate the space used for work(:).
      if (allocated(work)) deallocate(work)

! Check that inverse(transform(image)) = image.
      err = maxval(abs(c-a))/maxval(abs(c))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 3 for FAST_2DFT is correct.'
      end if

      end

Fatal and Terminal Messages

See the messages.gls file for error messages for fast_2dft. These error messages are numbered 670-680; 720-730.

fast_3dft

Computes the Discrete Fourier Transform (DFT) of a rank-3 complex array, x.

Required Arguments

No required arguments; pairs of optional arguments are required. These pairs are forward_in and forward_out or inverse_in and inverse_out.

Example 1: Transfor
292. inant
determinant of A
DFT (Discrete Fourier Transform)
Differential Algebraic Equations
differential-algebraic solver
diffusion equation
direct-access message file
discrete Fourier transform
   inverse

E

efficient solution method
eigenvalue
eigenvalue-eigenvector
   decomposition
   expansion (eigenexpansion)
eigenvalues, self-adjoint matrix
eigenvectors
epack argument
equality constraint, least squares
errors
   printing error messages
Euclidean length
evaluator function
   one-dimensional smoothing
   two-dimensional smoothing
EVASB routine
example
   least-squares, by rows distributed
   linear constraints distributed
   linear inequalities distributed
   linear system distributed, ScaLAPACK
   matrix product distributed, PBLAS
   Newton's Method distributed
   transposing matrix distributed
examples
   accuracy estimates of eigenvalues
   accurate least-squares solution with iterative refinement
   analysis and reduction of a generalized eigensystem
   complex polynomial equation Roots
   computing eigenvalues
   computing eigenvectors with inverse iteration
   computing generalized eigenvalues
   computing the SVD
   constraining a spline surface to be non-negative interpolation to data
   constraining points using spline surfa
293. ingular system $(A - \lambda I)x = b$ for an eigenvector, x. Following the computation of a normalized eigenvector $y = x/\|x\|$, the consistency condition $\lambda = y^T A y$ is checked. Since a singular system is expected, suppress the fatal error message that normally prints when the error post-processor routine error_post is called within the routine lin_sol_self. Also see operator_ex07, Chapter 6.

      use lin_sol_self_int
      use lin_eig_self_int
      use rand_gen_int
      use error_option_packet

      implicit none

! This is Example 3 for LIN_SOL_SELF.

      integer i, tries
      integer, parameter :: m=8, n=4, k=2
      integer ipivots(n+1)
      real(kind(1d0)), parameter :: one=1.0d0, zero=0.0d0
      real(kind(1d0)) err
      real(kind(1d0)) a(n,n), b(n,1), c(m,n), x(n,1), y(m*n), &
               e(n), atemp(n,n)
      type(d_options) :: iopti(4)

! Generate a random rectangular matrix.
      call rand_gen(y)
      c = reshape(y,(/m,n/))

! Generate a random right hand side for use in the inverse
! iteration.
      call rand_gen(y(1:n))
      b = reshape(y,(/n,1/))

! Compute the positive definite matrix.
      a = matmul(transpose(c),c)

! Obtain just the eigenvalues.
      call lin_eig_self(a, e)

! Use packaged option to reset the value of a small diagonal.
      iopti = d_options(0,zero)
      iopti(1) = d_options(d_lin_sol_self_set_small, &
                   epsilon(one)*abs(e(1)))

! Use packaged option to save the factorization.
      iopti(2) = d_options(d_lin_sol_self
294. interval (-1,1).
      delta_x = 2/real(m-1,kind(one))
      x = (/(-one + i*delta_x,i=0,m-1)/)

! Evaluate residuals using backward recurrence formulas.
      u = zero; v = zero
      do i=n, 0, -1
         w = 2*x*u - v + c(i)
         v = u
         u = w
      end do

! Compute residuals at the grid:
      y = exp(x) + cos(pi_over_2*x) - (u-x*v)

! Check that n+1 sign changes in the residual curve occur.
! (This test will fail when n is larger.)
      x = one
      x = sign(x,y)

      if (count(x(1:m-1) /= x(2:m)) >= n+1) then
         write (*,*) 'Example 1 for LIN_SOL_LSQ (operators) is correct.'
      end if

      end

Operator_ex10

      use linear_operators

      implicit none

! This is Example 2 for LIN_SOL_LSQ using operators and functions.

      integer i
      integer, parameter :: m=128, n=8
      real(kind(1d0)), parameter :: one=1d0, zero=0d0
      real(kind(1d0)) A(m,0:n), c(0:n), pi_over_2, x(m), y(m), &
               u(m), v(m), w(m), delta_x, inv(0:n, m)
      real(kind(1d0)), external :: DCONST

! Generate an array of equally spaced points on the interval (-1,1).
      delta_x = 2/real(m-1,kind(one))
      x = (/(-one + i*delta_x,i=0,m-1)/)

! Get the constant 'PI'/2 from IMSL Numerical Libraries.
      pi_over_2 = DCONST('PI')/2

! Compute data values on the grid.
      y = exp(x) + cos(pi_over_2*x)

! Fill in the least-squares matrix for the Chebyshev polynomials.
      A(:,0) = one; A(:,1) = x

      do i=2, n
         A(:,i) = 2*x*A
295. ion. For example, the name "rand_gen" is the suffix for the routine that generates a Fortran 90 rank-1 array of random numbers. The routine name has the prefix of the data type for the routine. These separate parts of the name are joined with the underscore character "_". Thus, the full prefix and suffix joined together form the complete name of the single-precision version of the random number generator, s_rand_gen. A generic name is also supported, in this case rand_gen. In most cases, the strings "s_", "d_", "c_", or "z_" can be deleted. The documentation for the routines omits the prefix, and hence the entire suite of routines for that subject is documented. Examples that appear in the documentation use the generic name.

To further illustrate this principle, note the lin_sol_gen documentation (see Chapter 1) for solving general systems of linear algebraic equations. A description is provided for just one data type. There are four documented routines in this subject area: s_lin_sol_gen, d_lin_sol_gen, c_lin_sol_gen, and z_lin_sol_gen. The appropriate routine is identified by the Fortran 90 compiler. Use of a module is required with the routines. The naming convention for modules joins the suffix "_int" to the generic routine name. Thus, the line "use lin_sol_gen_int" is inserted near the top of any routine that calls the subprogram "lin_sol_gen". These routines constitute single precision, double precision, comple
296. ion factorization of the matrix. This is communicated to the routine using an optional argument with optional data. For efficiency of iterative refinement, perform the factorization step once, then save the factored matrix in the array A and the pivoting information in the rank-1 integer array, ipivots. By default, the factorization is discarded. To enable the routine to be re-entered with a previously computed factorization of the matrix, optional data are used as array entries in the "iopt=" optional argument. The packaging of lin_sol_gen includes the definitions of the self-documenting integer parameters lin_sol_gen_save_LU and lin_sol_gen_solve_A. These parameters have the values 2 and 3, but the programmer usually does not need to be aware of this.

The following rules apply to the "iopt=iopt" optional argument:

1. Define a relative index, for example IO, for placing option numbers and data into the array argument iopt. Initially, set IO = 1. Before a call to the IMSL Library routine, follow Steps 2 through 4.

2. The data structure for the optional data array has the following form:

      iopt(IO) = ?_options(Option_number, Optional_data)
      [iopt(IO + 1) = ?_options(Option_number, Optional_data)]

   The length of the data set is specified by the documentation for an individual routine. (The Optional_data is output in some cases and may not be used in other cases.) The square braces [ ] denote
297. ional messages and data from each routine that uses the "epack=" optional argument, provided p is large enough to hold data for a new message. The value p = 8 is sufficient to hold the longest single terminal, fatal, or warning message that an IMSL Fortran 90 routine generates.

The location at entry epack(1)%idummy contains the number of data items for all messages. When the error_post routine exits, this value is set to zero. Locations in array positions (2:)%idummy contain groups of integers consisting of a message number, the error severity level, then the required integer data for the message. Floating-point data, if required in the message, is passed in locations (:)%rdummy matched with the starting point for integer data. The extent of the data for each message is determined by the requirements of the larger of each group of integer or floating-point values.

Optional Arguments

new_unit = nunit (Input)
   Unit number, of type integer, associated for reading the direct-access file of error messages for the IMSL Fortran 90 routines.
   Default: nunit = 4

new_path = path (Input)
   Pathname in the local file space, of type character*64, needed for reading the direct-access file of error messages. Default string for path is defined during the installation procedure for the IMSL Fortran 90 routines.

Description

A default direct-access error message file (.daf file) is supplied with t
298. is argument is not present, an unformatted or list-directed read is used.

iopt = iopt(:) (Input)
   Derived type array with the same precision as the array A; used for passing optional data to ScaLAPACK_READ. The options are as follows:

Packaged Options for ScaLAPACK_READ
   Option Name
   ScaLAPACK_READ_UNIT
   ScaLAPACK_READ_FROM_PROCESS
   ScaLAPACK_READ_BY_ROWS

iopt(IO) = ScaLAPACK_READ_UNIT
   Sets the unit number to the value in iopt(IO + 1)%idummy. The default unit number is the value 11.

iopt(IO) = ScaLAPACK_READ_FROM_PROCESS
   Sets the process number that reads the named file to the value in iopt(IO + 1)%idummy. The default process number is the value 0.

iopt(IO) = ScaLAPACK_READ_BY_ROWS
   Read the matrix by rows from the named file. By default, the matrix is read by columns.

Algorithm

Subroutine ScaLAPACK_READ reads columns or rows of a problem matrix so that it is usable by a ScaLAPACK routine. It uses the two-dimensional block-cyclic array descriptor for the matrix to place the data in the desired assumed-size arrays on the processors. The blocks of data are read, then transmitted and received. The block sizes, contained in the array descriptor, determine the data set size for each blocking send and receive pair. The number of these synchronization points is prop
299. is going on?

A: This will happen when no name was pushed on the stack. Before your call to E1MES, call E1PSH('ROUTINE_NAME'), where ROUTINE_NAME is any name you choose. Then, after the return from the routine, call E1POP('ROUTINE_NAME'). The message will print at this synchronization point.

Q 14: Please explain the difference between the function values N1RTY(0) and N1RTY(1).

A: The value N1RTY(1) is the maximum error type noted in any routine called by a user's code. More precisely, this is the maximum error type bracketed by a call to E1PSH and E1POP, which could be in a user's code. The value N1RTY(0) is the maximum error type noted before a call to E1POP. This allows a programmer to make a series of tests and possible calls to E1MES. Then the value N1RTY(0) is used to indicate what error condition occurred in the tests.

Support for Threads

Our design supports multiple threads at each node of a distributed machine. These features are not yet fully tested. We have used calls to routines that provide a simple interface to threaded computations. The routines are:

      CALL E1LOCK(LOCK_STATE)

If LOCK_STATE = 1, allow exactly one execution access from this point forward. If LOCK_STATE = 0, give up the exclusive execu
300. iscrete Fourier Transform (DFT) of a rank-1 complex array, x.

Required Arguments

No required arguments; pairs of optional arguments are required. These pairs are forward_in and forward_out or inverse_in and inverse_out.

Example 1: Transforming an Array of Random Complex Numbers

An array of random complex numbers is obtained. The transform of the numbers is inverted and the final results are compared with the input array.

      use fast_dft_int
      use rand_gen_int

      implicit none

! This is Example 1 for FAST_DFT.

      integer, parameter :: n=1024
      real(kind(1e0)), parameter :: one=1e0
      real(kind(1e0)) err, y(2*n)
      complex(kind(1e0)), dimension(n) :: a, b, c

! Generate a random complex sequence.
      call rand_gen(y)
      a = cmplx(y(1:n),y(n+1:2*n),kind(one)); c = a

! Transform and then invert the sequence back.
      call c_fast_dft(forward_in=a, &
           forward_out=b)
      call c_fast_dft(inverse_in=b, &
           inverse_out=a)

! Check that inverse(transform(sequence)) = sequence.
      err = maxval(abs(c-a))/maxval(abs(c))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 1 for FAST_DFT is correct.'
      end if

      end

Optional Arguments

forward_in = x (Input)
   Stores the input complex array of rank-1 to be transformed.

forward_out = y (Output)
   Stores the output complex array of rank-1 resulting from the transform.

inverse_in = y (Input)
   Stores the input complex array of
301. ix A: List of Subprograms and GAMS Classification

rand_gen
   Generates a rank-1 array of random numbers. The output array entries are positive and less than 1 in value.
ScaLAPACK_Read
   Read matrix data from a file and place in a two-dimensional block-cyclic form on a process grid.
ScaLAPACK_Write
   Write matrix data to a file, starting with a two-dimensional block-cyclic form on a process grid.
show
   Print rank-1 and rank-2 arrays with indexing and text.
sort_real
   Sorts a rank-1 array of real numbers x so the y results are algebraically nondecreasing, $y_1 \le y_2 \le \dots \le y_n$.
spline_fitting
   Solves constrained least-squares fitting of one-dimensional data by B-splines.
surface_fitting
   Solves constrained least-squares fitting of two-dimensional data by tensor products of B-splines.

balanc, cbalanc
   Balances a general matrix before computing the eigenvalue-eigenvector decomposition.
norm2, cnorm2, mnorm2, cmnorm2, nrm2, cnrm2
   Computes the Euclidean length of a vector or matrix, avoiding out-of-scale intermediate subexpressions.
build_error_structure
   Fills in flags, values and updates the data structure for error conditions that occur in Library routines. Prepares the structure so that calls to routine error_post will display the reason for the error.
pwk
tri_solve
french_curve
perfect_shift
   Computes eigenvectors using actual
302. k aU eel eee L Y

This is frequently a stiff differential-algebraic system. It is solved using the integrator DASPG and its subroutines, including D2SPG. These are documented in the IMSL Fortran Numerical Library, Chapter 5. Note that DASPG is restricted to use within PDE_1D_MG until the routine exits with the flag IDO = 3. If DASPG is needed during the evaluations of the differential equations or boundary conditions, use of a second processor and inter-process communication is required. The only options for DASPG set by PDE_1D_MG are the Maximum BDF Order, and the absolute and relative error values, ATOL and RTOL. Users may set other options using the Options Manager. This is described in Chapter 5 for DASPG and generally in Chapter 10 of the IMSL Fortran Numerical Library.

PDE_1D_MG_INT

Invoke a module, with the statement USE PDE_1D_MG_INT, near the second line of the program unit. The integrator is provided with single or double precision arithmetic, and a generic named interface is provided. We do not recommend using 32-bit floating point arithmetic here. The routine is called within the following loop, and is entered with each value of IDO. The loop continues until a value of IDO results in an exit.

      IDO=1
      DO
        CASE(IDO == 1) {Do required initialization steps}
        CASE(IDO == 2) {Save solution, update T0 and TOUT}
          IF{Finished with integration} IDO=3
        CASE(IDO == 3) EXIT {Normal}
        CASE
303. kpt(nbkpt-ndegree+1:nbkpt) = &
        (/(xdata(ndata)+i*delta_x, i=1,ndegree)/)

! Assign the degree of the polynomial and the knots.
      pointer_bkpt => bkpt
      break_points=s_spline_knots(ndegree, pointer_bkpt)

! These are the natural conditions for interpolating cubic
! splines. The derivatives match those of the interpolating
! function at the ends.
      constraints(1)=spline_constraints &
        (derivative=2, point=bkpt(nord), type='==', value=-one)
      constraints(2)=spline_constraints &
        (derivative=2, point=bkpt(nbkpt-ndegree), type='==', &
        value=(-one+xdata(ndata)**2)*ydata(ndata))

      coeff = spline_fitting(data=spline_data, knots=break_points, &
              constraints=constraints)
      yvalues=spline_values(0, xvalues, break_points, coeff)

      diff=norm(yvalues-ycheck,huge(1))/delta_x**nord

      if (diff <= one) then
         write (*,*) 'Example 1 for SPLINE_FITTING is correct.'
      end if

      end

Optional Arguments

constraints = spline_constraints (Input)
   A rank-1 array of derived type ?_spline_constraints that give constraints the output spline is to satisfy.

covariance = G (Output)
   An assumed-shape rank-2 array of the same precision as the data. This output is the covariance matrix of the coefficients. It is optionally used to evaluate the square root of the variance function.

iopt = iopt(:) (Input/Outpu
304. l Example 9 is correct.

! See to any error messages and quit MPI.
      mp_nprocs = mp_setup('Final')

      end

Parallel Example 10

This illustrates the computation of a box data type, least-squares polynomial data fitting problem. The problem is generated at the root node. The alternate nodes are used to solve the least-squares problems. Results are checked at the root node. Any number of nodes can be used.

      use linear_operators
      use mpi_setup_int
      use Numerical_Libraries, only : DCONST

      implicit none

! This is Parallel Example 10 for .ix.

      integer i, nrack
      integer, parameter :: m=128, n=8, nr=4
      real(kind(1d0)), parameter :: one=1d0, zero=0d0
      real(kind(1d0)) A(m,0:n,nr), c(0:n,1,nr), pi_over_2, &
        x(m,1,nr), y(m,1,nr), u(m,1,nr), v(m,1,nr), &
        w(m,1,nr), delta_x

! Setup for MPI:
      mp_nprocs = mp_setup()

! Generate a random grid of points and transform
! to the interval -1,1.
      if (mp_rank == 0) x = rand(x); x = x*2 - one

! Get the constant 'PI'/2 from IMSL Numerical Libraries.
      pi_over_2 = DCONST('PI')/2

! Generate function data on the grid.
      y = exp(x) + cos(pi_over_2*x)

! Fill in the least-squares matrix for the Chebyshev polynomials.
      A(:,0,:) = one; A(:,1,:) = x(:,1,:)

      do i=2, n
         A(:,i,:) = 2*x(:,1,:)*A(:,i-1,:) - A(:,i-2,:)
      end do

! Solve for the series coeffi
305. l reciprocals of the matrix R are saved in the diagonal entries of A when the Cholesky method is used.

iopt(IO) = ?_options(?_lin_sol_self_no_pivoting, ?_dummy)
   Does no row pivoting. The array pivots(:), if present, satisfies pivots(i) = i + 1 for i = 1, ..., n-1 when using Aasen's method. When using the Cholesky method, pivots(i) = i for i = 1, ..., n.

iopt(IO) = ?_options(?_lin_sol_self_use_Cholesky, ?_dummy)
   The Cholesky decomposition $PAP^T = R^T R$ is used instead of the Aasen method.

iopt(IO) = ?_options(?_lin_sol_self_solve_A, ?_dummy)
   Uses the factorization of A computed and saved to solve Ax = b.

iopt(IO) = ?_options(?_lin_sol_self_scan_for_NaN, ?_dummy)
   Examines each input array entry to find the first value such that
   isNaN(a(i,j)) .or. isNaN(b(i,j)) == .true.
   See the isNaN() function, Chapter 6.
   Default: Does not scan for NaNs.

iopt(IO) = ?_options(?_lin_sol_self_no_sing_mess, ?_dummy)
   Do not print an error message when the matrix A is singular.

Description

The lin_sol_self routine solves a system of linear algebraic equations with a nonsingular coefficient matrix A. By default, the routine computes the factorization of A using Aasen's method. This decomposition has the form $PAP^T = LTL^T$, where P is a permutation matrix, L is a unit lower-triangular matrix, and T is a tridiagonal self-adjoint matrix. The solution of the linear system Ax = b is found by solving simpler systems, $u = L^{-1}Pb$, $Tv = u$ and $x = P^T L^{-T} v$.
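The three simpler systems follow directly from the factorization. The derivation below is a short sketch, assuming a real symmetric A so that the transposes apply as written (for a complex self-adjoint A they would presumably be conjugate transposes), and using $P^T P = I$ for a permutation matrix:

```latex
\begin{align*}
PAP^{T} = LTL^{T} \;&\Longrightarrow\; A = P^{T} L\,T\,L^{T} P, \\
Ax = b \;&\Longleftrightarrow\; L\,T\,L^{T} P x = P b. \\
\text{With } u = L^{-1} P b \text{ and } v = L^{T} P x : \quad
T v &= u, \qquad x = P^{T} L^{-T} v .
\end{align*}
```

Each step is cheap: forming u and recovering x are unit-triangular solves combined with a permutation, and $Tv = u$ is a tridiagonal solve.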
306. les and values between calls to fast_dft. The value for size(w) must be at least as large as the value (-ido) for the value of ido < 0.

iopt = iopt(:) (Input/Output)
   Derived type array with the same precision as the input array; used for passing optional data to fast_dft. The options are as follows:

Packaged Options for fast_dft
   Option Prefix = ?
   Option Name                   Option Value
   fast_dft_scan_for_NaN         1
   fast_dft_near_power_of_2      2
   fast_dft_scale_forward        3
   fast_dft_scale_inverse        4

iopt(IO) = ?_options(?_fast_dft_scan_for_NaN, ?_dummy)
   Examines each input array entry to find the first value such that
   isNaN(x(i)) == .true.
   See the isNaN() function, Chapter 6.
   Default: Does not scan for NaNs.

iopt(IO) = ?_options(?_fast_dft_near_power_of_2, ?_dummy)
   Nearest power of 2 >= n is returned as an output in iopt(IO + 1)%idummy.

iopt(IO) = ?_options(?_fast_dft_scale_forward, real_part_of_scale)
iopt(IO + 1) = ?_options(?_dummy, imaginary_part_of_scale)
   Complex number defined by the factor cmplx(real_part_of_scale, imaginary_part_of_scale) is multiplied by the forward transformed array.
   Default value is 1.

iopt(IO) = ?_options(?_fast_dft_scale_inverse, real_part_of_scale)
iopt(IO + 1) = ?_options(?_dummy, imaginary_part_of_scale)
   Complex number defined by the factor cmplx(real_part_of_scale, imaginary_part_of_scale) is multiplied by the inv
307. level subroutines for each data type and array shape. Output is directed to the unit number IUNIT. That number is obtained with the subroutine UMACH, IMSL MATH/LIBRARY User's Manual (IMSL 1994, pp. 1204-1205). Thus, the user must open this unit in the calling program if it is desired to be different from the standard output unit. If the optional argument "IMAGE = buffer" is present, the output is not sent to a file but to a character string within buffer. These characters are available to output or be used in the application.

Additional Examples

Example 2: Writing an Array to a Character Variable

This example prepares a rank-1 array for further processing, in this case delayed writing to the standard output unit. The indices and the amount of precision are reset from their defaults, as in Example 1. An end-of-line sequence of the characters CR-NL (ASCII 10,13) is used in place of the standard ASCII 10. This is not required for writing this array, but is included for an illustration of the option.

      use show_int
      use rand_int

      implicit none

! This is Example 2 for SHOW.

      integer, parameter :: n=7
      real(kind(1e0)) s_x(-1:n)
      type (s_options) options(7)
      CHARACTER (LEN=(72+2)*4) BUFFER

! The data types printed are real(kind(1e0)) random numbers.
      s_x=rand(s_x)

! Show 7 digits per number and according to the
! natural or declared size of the array.
! Prepare the ou
308. lize MPI and then make an error.

! Finalize MPI and print any error messages that occurred since
! the last printing. All nodes report errors to the root node.
      MP_NPROCS = MP_SETUP('Final')
! After MPI is finalized, a single set of messages print.
! The other nodes are inoperative.
      CALL C_Name(-1)
      END PROGRAM

      SUBROUTINE C_Name(I)
      USE MPI_NODE_INT
! This routine generates an error message.
      IMPLICIT NONE
      INTEGER I, TYPE
      TYPE = 4
      IF (I <= 0) THEN
! Push the name onto the stack.
         CALL E1PSH('C_Name')
! Drop a value into the message.
         CALL E1STI(1, I)
! Prepare the message for printing.
         CALL E1MES(TYPE, 3, 'The agument should be positive. '// &
            'It now has value %(I1).')
! Pop the name off the stack.
         CALL E1POP('C_Name')
! Had an invalid argument, so RETURN.
         RETURN
      END IF
      END SUBROUTINE

Output for Example 3

SEATA ERROR 4 trons CoName now has value 2 FORWARD Calls C_Name 4 are PATA ERROR or rons C2 Name now has value 2 FORWARD Calls C_Name 4 Gi TEATAL ERROR 2 on rank 1 iGomsikeiwescd nam Sdn come raisons The agument should be positive Error Types and Codes 3 The agument should be positive Error Types and Codes 3 C_Name INE INE The agument
309. llocates and defines the array MPI_NODE_PRIORITY(:), the node priority order. By setting MPI_ROOT_WORKS = .false., the computation of the residual is off-loaded to the node with highest priority, wherein we expect the results to be computed the fastest. The remainder of the computation, including the factorization and solve step, is executed at the root node. This example requires two nodes to execute.

      use linear_operators
      use mpi_setup_int

      implicit none
      INCLUDE 'mpif.h'

! This is Parallel Example 9 showing iterative
! refinement with only one non-root node working.
! There is only one problem in this example.

      integer, parameter :: m=8, n=4, nr=1
      real(kind(1e0)) :: one=1e0, zero=0e0
      real(kind(1d0)) :: d_zero=0d0
      integer ipivots(n+m+1), nrack, ierror
      real(kind(1e0)) A(m,n,nr), b(m,1,nr), F(n+m,n+m,nr), &
         op idan il rewe 5 aa AL sate
      real(kind(1e0)) change_new(nr), change_old(nr)
      real Keatiael AkClO elim tere IO Gitar ae se Geral tl rele
      type(s_options) EOE IL 2

! Setup for MPI. Establish a node priority order.
! Restrict the root from significant computing.
! Illustrates the best performing non-root node
! computing a single task.
      mp_nprocs = mp_setup(m+n)

      MPI_ROOT_WORKS = .false.

! Generate a random matrix and right-hand side.
      A = rand(A); b = ra
310. llows more values of the parameters to be studied in a given time than with a single processor. This code is valuable as a study guide when an application needs to estimate timing and other output parameters. The simulation time is controlled at the root node. An integration is started, after receiving results, within the first SIM_TIME seconds. The elapsed time will be longer than SIM_TIME by the slowest processor's time for its last integration.

program FE Electrodynamics TIM SI pP D zal l D_MG de US BD EX09 parameter study ie US ALG MEANES il US US 1 HEE Pike CLUD O _mg_ Ue RAND_INT SHOW_ IT E Tr IN INT ONI WMqijoyaLIe aay N i 1 in I F GER PARAM ER NPD E 2 ih Gin i TO Alf i i T Il GER ALLOCA Define array space for TORU TOGU SIM_TIME i FOA E 60D TOUL Bi SIM_TIM DATA 5 S kind 1d0 kind 1d0 the numb IERROR ry ie ABLI the U Z solu PD RO 0DO ie je 0 kind 1d0 kind 1d0 kind 1d0 D_OPTIONS Pal If NP_NPRO BIL INO als AL 10 _NOD WO BTE S p ONTINUE OUNTS BEONE E 1 N N 21 STATUS MPI_STATUS_SIZ ry LO Aw 1D0 DELTA_T 10D0 run the simula
311. loped reminiscent of matrix algebra. This allows the Fortran 90 user to express mathematical formulas in terms of operators. Thus, important aspects of object-oriented programming are provided as a part of this chapter's design. A comprehensive Fortran 90 module, linear_operators, defines the operators and functions. Its use provides this simplification. Subroutine calls and the use of type-dependent procedure names are largely avoided. This makes a rapid development cycle possible, at least for the purposes of experiments and proof-of-concept. The goal is to provide the Fortran 90 programmer with an interface, operators, and functions that are useful and succinct. The modules can be used with existing Fortran programs, but the operators provide a more readable program. Frequently this approach requires more hidden working storage. The size of the executable program may be larger than alternatives using subroutines. There are applications wherein the operator and function interface does not have the functionality that is available using subroutine libraries. To retain greater flexibility, some users will continue to require the traditional techniques of calling subroutines.

A parallel computation for many of the defined operators and functions has been implemented. Most of the detailed communication is hidden from the user. Those functions having this data type computed in parallel are marked in bold type. The section Parallelism Using MPI
312. lution X The block size is chosen so that each participating processor receives the same number of columns except any remaining columns sent to the processor with largest rank This processor contains the right hand side before the broadcast This example illustrates connecting a BLACS context handle and the Fortran 90 MP Library MPI communicator MP_LIBRARY_WORLD described in Chapter 6 PROGRAM PNLSQ_EX2 Use Parallel_Nonnegative_LSQ to solve a least squares problem A x b with x gt 0 This algorithm uses a distributed version of NNLS found in the book Solving Least Squares Problems page 165 The data is read from a file by rows and sent to the processors as array columns USE PNLSQ_INT USE SCALAPACK_IO_INT USE BLACS_INT USE MPI_SETUP_INT USE RAND_INT USE ERROR_OPTION_PACKET IMPLICIT NONE INCLUDE mpif h INTEGER PARAMETER M 128 N 32 NP N 1 NIN 10 real kind 1d0 ALLOCATABLE DIMENSION amp Caan 3 AGr Br GC W X real kind 1d0 RNORM ERROR INTEGER ALLOCATABLE INDEX IPART INTEGER I J K L DN JSHIFT IERROR amp CONTXT NPROW MYROW MYCOL DESC_A 9 TYPE d_OPTIONS IOPT 1 Routines with the BLACS_ prefix are from the BLACS library CALL BLACS_PINFO MP_RANK MP_NPROCS Make initialization for BLACS CALL BLACS_GET 0 0 CONTX
313. m dimensions are n size D 1 2 and k size D 2 z Example 1 Solution of Multiple Tridiagonal Systems The upper main and lower diagonals of n systems of size n x n are generated randomly A scalar is added to the main diagonal so that the systems are positive definite A random vector x is generated and right hand sides y A y are computed The routine is used to compute the solution using the A and y The results should compare closely with the x used to generate the right hand sides Also see operator_ex17 Chapter 6 use lin_sol_tri_int use rand_gen_int use error_option_packet implicit none 34 Chapter 1 Linear Solvers IMSL Fortran 90 MP Library 4 0 This is Example 1 for LIN_SOL_TRI integer i integer parameter n 128 real kind 1d0 parameter one 1d0 zero 0d0 real kind 1d0 err real kind 1d0 dimension 2 n n d b c res n n amp t n X Y Generate the upper main and lower diagonals of the n matrices A i For each system a random vector x is used to construct the right hand side Ax y The lower part of each array remains zero as a result c zero d zero b zero x zero do i 1 n call rand_gen call rand_gen call rand_gen call rand_gen end do FEES PRR PR pps B aek Add scalars to the main diagonal of each system so that all systems are positive definite t sum ctdtb DIM 1 d 1 n 1 n d 1 n 1 n spread t DIM 1 NCOPIES n
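The idea behind Example 1 (build a tridiagonal system from a known x, solve it, compare) can be tried without the library. The following is a minimal sketch in standard Fortran, not the lin_sol_tri algorithm: it solves a single system with the textbook Thomas elimination, and it makes the matrix strictly diagonally dominant rather than positive definite so that elimination without pivoting is safe. The names tridiagonal_sketch, thomas, lower, diag, upper, and sol are assumptions introduced for illustration.

```fortran
program tridiagonal_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 8
  real(dp) :: lower(n), diag(n), upper(n), x(n), y(n), sol(n)
  real(dp) :: err
  integer :: i

  ! Random off-diagonals and a random "true" solution x.
  call random_number(lower)
  call random_number(upper)
  call random_number(x)
  lower(1) = 0.0_dp        ! no entry to the left of row 1
  upper(n) = 0.0_dp        ! no entry to the right of row n

  ! Assumption of this sketch: strict diagonal dominance by rows.
  diag = abs(lower) + abs(upper) + 1.0_dp

  ! Right-hand side y = A x.
  do i = 1, n
     y(i) = diag(i)*x(i)
     if (i > 1) y(i) = y(i) + lower(i)*x(i-1)
     if (i < n) y(i) = y(i) + upper(i)*x(i+1)
  end do

  call thomas(lower, diag, upper, y, sol)

  ! Same style of check as the library examples.
  err = sum(abs(sol - x)) / sum(abs(x))
  if (err <= sqrt(epsilon(1.0_dp))) then
     write (*,*) 'Tridiagonal sketch: solution agrees with x.'
  end if

contains

  ! Forward elimination followed by back substitution.
  subroutine thomas(a, b, c, rhs, u)
    real(dp), intent(in)  :: a(:), b(:), c(:), rhs(:)
    real(dp), intent(out) :: u(:)
    real(dp) :: cp(size(b)), dpv(size(b)), denom
    integer :: k, m
    m = size(b)
    cp(1)  = c(1)/b(1)
    dpv(1) = rhs(1)/b(1)
    do k = 2, m
       denom  = b(k) - a(k)*cp(k-1)
       cp(k)  = c(k)/denom
       dpv(k) = (rhs(k) - a(k)*dpv(k-1))/denom
    end do
    u(m) = dpv(m)
    do k = m-1, 1, -1
       u(k) = dpv(k) - cp(k)*u(k+1)
    end do
  end subroutine thomas
end program
```

The library routine handles many systems at once and offers factor-only and solve-only options; this sketch covers only the single-system case.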
314. m 1 ncopies i 1 end do
! Compute the solution. It should be the same as x, but will not be
! exact due to rounding errors. (The quantity real(z,kind(one)) is
! the real-valued answer when the Schur decomposition method is used.)
      z = matmul(w,z)
! Compute the solution by solving for x directly.
      do i=1, n
         a(i,i) = a(i,i) + h
      end do
      call lin_sol_gen(a, b, x)
! Check that x and z agree approximately.
      err = sum(abs(x-z))/sum(abs(x))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 3 for LIN_EIG_GEN is correct.'
      end if
      end

Example 4: Accuracy Estimates of Eigenvalues Using Adjoint and Ordinary Eigenvectors

A matrix A has entries that are subject to uncertainty. This is expressed as the realization that A can be replaced by the matrix $A + \eta B$, where the value $\eta$ is "small" but still significantly larger than machine precision. The matrix B satisfies $\|B\| \le \|A\|$. A variation in eigenvalues is estimated using analysis found in Golub and Van Loan (1989, Chapter 7, p. 344). Each eigenvalue and eigenvector is expanded in a power series in $\eta$. With $e_i(\eta) \approx e_i + \eta\,\dot{e}_i(\eta)$ and normalized eigenvectors, the bound

$$|\dot{e}_i| \le \frac{\|A\|}{|u_i^{*} v_i|}$$

is satisfied. The vectors $u_i$ and $v_i$ are the ordinary and adjoint eigenvectors associated respectively with $e_i$ and its complex conjugate. This gives an upper bound on t
315. m time_parallel_i is compiled and linked with the single and double precision timing functions s_parallel_i_bench and d_parallel_i_bench. This routine evaluates the time to compute 5 inverse matrices of size 50 by 50, using the defined operator .i. The Average is the mean of the individual elapsed times for 5 calls to the routines, obtaining 5 inverses in each call. The St. Dev. is the standard deviation for that Average. This value indicates the variability of the Average. In order for this value to provide any useful information, it is necessary for |NTRIES| > 1. The value |NTRIES| = 1 is acceptable, but only one time sample and no standard deviation is obtained. Values of NTRIES > 0 result in the printing of results as shown in Table C. The numbers in the table will vary, depending on the machine and other factors that impact performance of Fortran codes. If NTRIES < 0, the 7 x 2 functions return the tabular values shown, with |NTRIES| samples. No printing is performed with NTRIES < 0.

Single precision benchmark of parallel i and non parallel i
Date of benchmark Y Mo D H M S 1996 12 23
Root not working
Number of Processors 4
1 1 5815E 00 2 5031E 01 7 9077E 00 5 0000E 01 5 0000E 00 5 0000E 00
Dn mr A U N 4 0241E 00 1 8035E 02 2 0121E 01 5 0000E 01 5 0000E 00 5 0000E 00
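The Average and St. Dev. described above can be illustrated with a small timing loop. This is a sketch under stated assumptions, not the source of s_parallel_i_bench: the workload is a plain matmul stand-in instead of the library's inverse operator, the sample standard deviation (divisor ntries - 1) is an assumed convention, and the names timing_sketch, times, and st_dev are hypothetical.

```fortran
program timing_sketch
  implicit none
  integer, parameter :: ntries = 5, n = 50
  real :: times(ntries), average, st_dev
  real :: A(n,n), B(n,n)
  integer :: i, c0, c1, rate

  call random_number(A)
  call system_clock(count_rate=rate)
  if (rate <= 0) stop 'no system clock available'

  ! One elapsed-time sample per try (clock wrap-around is ignored here).
  do i = 1, ntries
     call system_clock(count=c0)
     B = matmul(A, A)                 ! stand-in workload
     call system_clock(count=c1)
     times(i) = real(c1 - c0)/real(rate)
  end do

  average = sum(times)/ntries
  if (ntries > 1) then
     ! A spread estimate needs at least two samples.
     st_dev = sqrt(sum((times - average)**2)/(ntries - 1))
  else
     st_dev = 0.0
  end if

  ! B(1,1) is printed only so the workload cannot be optimized away.
  write (*,*) 'Average =', average, ' St. Dev. =', st_dev, B(1,1)
end program
```

With a single try the spread is undefined, which is why the text asks for more than one sample before the St. Dev. is read as meaningful.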
316. Parallel Examples

Matrix Algebra Operations

Consider a Fortran 90 code fragment that solves a linear system of algebraic equations, Ay = b, then computes the residual r = b - Ay. A standard mathematical notation is often used to write the solution, $y = A^{-1} b$. A user thinks: "matrix and right-hand side yields solution." The code shows the computation of this mathematical solution using a defined Fortran operator ".ix.", and random data obtained with the function rand. This operator is read "inverse matrix times." The residuals are computed with another defined Fortran operator ".x.", read "matrix times vector." Once a user understands the equivalence of a mathematical formula with the corresponding Fortran operator, it is possible to write this program with little effort. The last line of the example before end is discussed below.

      USE linear_operators
      integer, parameter :: n=3; real A(n,n), y(n), b(n), r(n)
      A=rand(A); b=rand(b); y = A .ix. b
      r = b - (A .x. y)
      end

The IMSL Fortran 90 MP Library provides additional lower-level software that implements the operation ".ix.", the function rand, matrix multiply ".x.", and others not used in this example. Standard matrix products and inverse operations of mat
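For readers without the library at hand, the same solve-then-residual computation can be written with Fortran intrinsics alone. The sketch below is not how the defined operators are implemented: it assumes a small dense matrix and uses Gaussian elimination with partial pivoting in a hypothetical helper named solve, with matmul standing in for the matrix-times-vector operator.

```fortran
program solve_and_residual
  implicit none
  integer, parameter :: n = 3
  real :: A(n,n), b(n), y(n), r(n)

  call random_number(A)
  call random_number(b)

  y = solve(A, b)          ! plays the role of "inverse matrix times"
  r = b - matmul(A, y)     ! plays the role of "matrix times vector"

  write (*,*) 'largest residual magnitude:', maxval(abs(r))

contains

  ! Hypothetical helper: elimination on the augmented matrix [A | b].
  function solve(A_in, b_in) result(x)
    real, intent(in) :: A_in(:,:), b_in(:)
    real :: x(size(b_in))
    real :: W(size(b_in), size(b_in)+1), row(size(b_in)+1)
    integer :: m, k, p, i, ploc(1)
    m = size(b_in)
    W(:,1:m) = A_in
    W(:,m+1) = b_in
    do k = 1, m
       ! Partial pivoting: bring the largest entry of column k to row k.
       ploc = maxloc(abs(W(k:m,k)))
       p = k - 1 + ploc(1)
       row = W(k,:)
       W(k,:) = W(p,:)
       W(p,:) = row
       do i = k+1, m
          W(i,k:) = W(i,k:) - (W(i,k)/W(k,k))*W(k,k:)
       end do
    end do
    ! Back substitution on the upper triangular system.
    do k = m, 1, -1
       x(k) = (W(k,m+1) - dot_product(W(k,k+1:m), x(k+1:m)))/W(k,k)
    end do
  end function solve
end program
```

The operator form hides all of this bookkeeping in one expression, which is the readability gain the chapter describes; the explicit form keeps control over storage and algorithm choice.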
317. meter delta_x range ndata 1 delta_b range nbkptin 1 real kind le0 target xdata ndata ydata ndata ynoise ndata amp sddata ndata spline_data 3 ndata bkpt nbkpt amp values ndata derivatl ndata derivat2 ndata amp coeff ncoeff root_variance ndata diff real kind le0 dimension ncoeff ncoeff sigma_squared real kind le0 pointer pointer_bkpt type s_spline_knots break_points type s_spline_constraints constraints nbkptint2 xdata i 1 delta_x i 1 ndata ydata exp half xdata 2 ynoise ratio ydata rand ynoise half ydata ydatatynoise 104 Chapter 4 Curve and Surface Fitting with Splines IMSL Fortran 90 MP Library 4 0 sddata ynoise spline_data 1 xdata spline_data 2 ydata spline_data 3 sddata bkpt i nord delta_b i 1 nbkpt Assign the degr of the polynomial and the knots pointer_bkpt gt bkpt break_points s_spline_knots ndegree pointer_bkpt icurv int one delta_b 1 At first shape the curve to be convex down do i 1 icurv 1 constraints i spline_constraints amp derivative 2 point bkpt i ndegree type lt value zero end do Force a curvature change constraints icurv spline_constraints amp derivative 2 point bkpt icurvtndegree type value zero Finally shape the curve to be convex up do i icurv 1 nbkptin constraints i spline_constraints amp
318. ming an Array of Random Complex Numbers

An array of random complex numbers is obtained. The transform of the numbers is inverted and the final results are compared with the input array.

      use fast_3dft_int
      implicit none
! This is Example 1 for FAST_3DFT.
      integer i, j, k
      integer, parameter :: n=64
      real(kind(1e0)), parameter :: one=1e0, zero=0e0
      real(kind(1e0)) r(n,n,n), err
      complex(kind(1e0)) a(n,n,n), b(n,n,n), c(n,n,n)
! Fill in value one for points inside the sphere with radius=16.
      a = zero
      do i=1, n
         do j=1, n
            do k=1, n
               r(i,j,k) = (i-n/2)**2 + (j-n/2)**2 + (k-n/2)**2
            end do
         end do
      end do
      where (r <= (n/4)**2) a = one
      c = a
! Transform the image and then invert it back.
      call c_fast_3dft(forward_in=a, &
                       forward_out=b)
      call c_fast_3dft(inverse_in=b, &
                       inverse_out=a)
! Check that inverse(transform(image)) = image.
      err = maxval(abs(c-a))/maxval(abs(c))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 1 for FAST_3DFT is correct.'
      end if
      end

Optional Arguments

forward_in = x (Input)
   Stores the input complex array of rank-3 to be transformed.
forward_out = y (Output)
   Stores the output complex array of rank-3 resulting from the transform.
inverse_in = y (Input)
   Stores the input complex array of rank-3 to be inverted.
inverse_out = x (Output)
   Stores the output complex array of rank-3 resulting from the inverse
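The round-trip check used in this example (transform, invert, compare with the saved input) does not depend on the library. The following sketch applies the same check to a one-dimensional naive O(n^2) discrete Fourier transform written with intrinsics only. It is not the fast_3dft algorithm; the function dft is hypothetical, and placing the whole 1/n scaling on the inverse is an assumption of this sketch that may differ from the library's default scaling.

```fortran
program dft_round_trip
  implicit none
  integer, parameter :: n = 16
  real, parameter :: one = 1e0
  complex :: a(n), b(n), c(n)
  real :: re(n), im(n), err

  call random_number(re)
  call random_number(im)
  a = cmplx(re, im)
  c = a                      ! keep a copy of the input

  b = dft(a, -1)             ! forward transform
  a = dft(b, +1) / n         ! inverse transform, scaled by 1/n here

  ! Same relative-error test as in the example above.
  err = maxval(abs(c - a)) / maxval(abs(c))
  if (err <= sqrt(epsilon(one))) then
     write (*,*) 'DFT round-trip sketch: inverse(transform(x)) = x.'
  end if

contains

  ! Naive DFT: y(k) = sum_j x(j) * exp(sgn * 2*pi*i * (j-1)(k-1)/m).
  function dft(x, sgn) result(y)
    complex, intent(in) :: x(:)
    integer, intent(in) :: sgn
    complex :: y(size(x))
    real :: twopi, angle
    integer :: j, k, m
    m = size(x)
    twopi = 8.0*atan(1.0)
    do k = 1, m
       y(k) = (0.0, 0.0)
       do j = 1, m
          ! Reduce the index product modulo m to keep the angle small.
          angle = sgn*twopi*real(mod((j-1)*(k-1), m))/real(m)
          y(k) = y(k) + x(j)*cmplx(cos(angle), sin(angle))
       end do
    end do
  end function dft
end program
```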
319. n 1 r_off sval 1 a_off t_lower EOSHIFT t_upper SHIFT 1 DIM 1 cycle Integration_Loop case 7 Compute the factorization iopti 1l s_lin_sol_tri_factor_only call lin_sol_tri t_upper t_diag t_lower amp t_sol iopt iopti cycle Integration_Loop case 8 Solve the system iopti 1 s_lin_sol_tri_solve_only Move data from the assumed size to assumed shape arrays t_sol 1 n 1 wk ival 1 ival 1 n 1 call lin_sol_tri t_upper t_diag t_lower amp t_sol iopt iopti Move data from the assumed shape to assumed size arrays wk ival 1 ival 1 n 1 t_sol l1 n 1 cycle Integration_Loop case 2 Correct initial value to reach u_1 at t tend u_0 u_O u_O y n 2 u_1 u_0 y n 2 1 Finish up internally in the integrator ido 3 cycle Integration_Loop end select end do Integration_Loop write The equation u_t u_xx with u 0 t u0 write reaches the value u_l1 at time tend write Example 4 for LIN_SOL_TRI operators is correct end Operator_ex21 use linear_operators implicit none This is Example 1 using operators for LIN_SVD integer parameter n 32 real kind 1d0 parameter one 1d0 real kind 1d0 err real kind 1d0 dimension n n A U V S n Generate a random n by n matrix 192 e Chapter 6 Operators and Generic Functions The Parallel Option IM
320. n are returned as outputs in iopt(IO + 1)%idummy and iopt(IO + 2)%idummy.

iopt(IO) = ?_options(?_fast_2dft_scale_forward, real_part_of_scale)
iopt(IO + 1) = ?_options(?_dummy, imaginary_part_of_scale)
   Complex number defined by the factor cmplx(real_part_of_scale, imaginary_part_of_scale) is multiplied by the forward transformed array. Default value is 1.

iopt(IO) = ?_options(?_fast_2dft_scale_inverse, real_part_of_scale)
iopt(IO + 1) = ?_options(?_dummy, imaginary_part_of_scale)
   Complex number defined by the factor cmplx(real_part_of_scale, imaginary_part_of_scale) is multiplied by the inverse transformed array. Default value is 1.

Description

The fast_2dft routine is a Fortran 90 version of the FFT suite of IMSL (1994, pp. 772-776).

Additional Examples

Example 2: Cyclical 2D Data with a Linear Trend

This set of data is sampled from a function $x(s, t) = a + bs + ct + y(s, t)$, where $y(s, t)$ is an harmonic series. The independent variables are normalized as $-1 \le s \le 1$ and $-1 \le t \le 1$. Thus, the data is said to have cyclical components plus a linear trend. As a first step, the linear terms are effectively removed from the data using the least-squares system solver lin_sol_lsq, Chapter 1. Then the residuals are transformed and the resulting frequencies are analyzed.

      use fast_2dft_int
      use lin_sol_lsq_int
      use sort_real_int
      use rand_int
      implicit none
321. n values b exp sum p 2 dim 1 Compute the least squares solution An error message du to rank deficiency is ignored with the flags allocate d_invx_options 1 d_invx_options 1 skip_error_processing c A ix b Check the results if norm A tx b A x c norm A norm c amp lt sqrt epsilon one then write Example 3 for LIN_SOL_LSQ operators is correct end if Evaluate residuals known function approximation at a square grid of points This evaluation is only for k 2 delta one real n_eval 1 kind one w i delta i 0 n_eval 1 res exp spread w 1 n_eval 2 spread w 2 n_eval 2 do j l n res res c j sqrt spread w 1 n_eval q 1 j3 2 amp spread w 2 n_eval q 2 3j 2 delta_sqr end do Unload option type for good housekeeping deallocate d_invx_options end Operator_ex12 use linear_operators implicit none This is Example 4 for LIN_SOL_LSQ using operators and functions integer parameter m 64 n 32 real kind le0 one le0 A m 1 n b m 1 x n Generate a random matrix and right hand side A rand A b rand b Heavily weight desired constraint All variables sum to one A mt 1 one sqrt epsilon one b m 1 one sqrt epsilon one Compute the least squares solution with this heavy weight x A ix Db Check the constraint if abs sum x one no
322. n_svd_use_gauss_elim 6 E To A lin_svd_set_perf_ratio 7 iopt IO _options _lin_svd_set_small Small If a singular value is smaller than Small it is defined as zero for the purpose of computing the rank of A Default the smallest number that can be reciprocated safely iopt IO _options _lin_svd_overwrite_input _dummy Does not save the input array A iopt IO _options _lin_svd_scan_for_NaN _dummy Examines each input array entry to find the first value such that isNaN a i j true See the isNaN function Chapter 6 Default The array is not scanned for NaNs iopt IO _options _lin_svd_use_qr _dummy Uses a rational QR algorithm to compute eigenvalues Accumulate the IMSL Fortran 90 MP Library 4 0 Chapter 2 Singular Value and Eigenvalue Decomposition 49 singular vectors using this algorithm Default singular vectors computed using inverse iteration iopt IO _options _lin_svd_skip_Orth _dummy If the eigenvalues are computed using inverse iteration skips the final orthogonalization of the vectors This method results in a more efficient computation However the singular vectors while a complete set may not be orthogonal Default singular vectors are orthogonalized if obtained using inverse iteration iopt IO _options _lin_svd_use_gauss_elim _dummy If the eigenvalues are computed using inverse iteration uses standard elimination with partial pivoting to solve
323. nd 1d0 parameter one 1 0d0 zero 0 0d0 real kind 1d0 a n n b n n beta n y n n type d_options iopti 1 type d_error epack 1 complex kind 1d0 alpha n Generate random matrices for both A and B call rand_gen y a reshape y n n call rand_gen y b reshape y n n Make columns of A and B zero so both are singular a l n n 0 b 1 n n 0 Set internal tolerance for a small diagonal term iopti 1 d_options d_lin_geig_gen_set_small sqrt epsilon one Compute the generalized eigenvalues call lin_geig_gen a b alpha beta amp Llopt iopti epack epack See if singular DAE system is detected The size of epack is too small for the message so output is blocked with NaNs if isnan alpha then write Example 3 for LIN_GEIG_GEN is correct end if end 76 Chapter 2 Singular Value and Eigenvalue Decomposition IMSL Fortran 90 MP Library 4 0 IMSL Fortran 90 MP Library 4 0 Example 4 Larger Data Uncertainty than Working Precision Data values in both matrices A and B are assumed to have relative errors that can be as large as el where is the relative machine precision This example illustrates the use of an optional flag that resets the tolerance used in routine lin_sol_isq for determining a singularity of either matrix The tolerance is reset to the new value Bl and the generalized eigenvalue problem i
324. nd b Save double precision copies of the matrix and right hand side D A c b Fill in augmented matrix for accurately solving the least squares problem using iterative refinement F zero ago nrack Lent Pilem lim nrack BYE m end do 2 Lei merle 8 S AS a Gitels isi 8 ott JA l Scare SOlrrion aie weise y d_zero change_old huge one Use packaged option to save the factorization sigejoneal il ey lim sol selik Sava stexeic oer iopti 2 0 h zero ITERATIVE _REF INEMEN DO Gis 78 Cilem 3 8 wiles 843 UD oto yard Simro 8 8 G r sii 858 ID sibs wile 87 8 IF MP_RANK 0 THEN Galil dlatin soll seule F838 pinta CSAs pmue y e h nr pivots ipivots iopt iopti ary END IF change_new norm h All processors share the root s test for convergence 216 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 call mpi_bcast change_new nr mpi_real 0 mp_library_world ierror Exit when changes are no longer decreasing if ALL change_new gt change_old amp exit ITERATIVE_REFINEMENT change_old change_new Use option to re enter code with factorization saved solve only iopti 2 s_lin_sol_self_solve_A end do ITERATIVE_REFINEMENT if mp_rank 0 amp write Paralle
325. nditioning of the problem Matrix and Utility Functions Several decompositions and functions required for numerical linear algebra follow The convention of enclosing optional quantities in brackets J is used The functions that use MPI for parallel execution of the box data type are marked in bold Defined Array Functions Matrix Operation S SVD A U U V V A uUSvV E EIG A B B D D AV VE AVD BVE V V W W AW WE AWD BWE R CHOL A A R R Q ORTH A R R A OR 0 O UUNIT A u a lah F DET A det A determinant K RANK A rank A rank 144 e Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 Defined Array Functions P NORM A type i Matrix Operation aij p All max i l p Al s largest singular value n p Mlonge max lap j l C COND A 51 Srank A Z EYE N Z Iy A DIAG X A diag x X DIAGONALS A x ay Y FFT X WORK W X IFFT Y WORK W Discrete Fourier Transform Inverse Y FFT_BOX X WORK W X IFFT_BOX Y WORK W Discrete Fourier Transform for Boxes Inverse A RAND A random numbers 0 lt A lt 1 L isNan A test for NaN if L then In certain functions the optional arguments are inputs while other optional arguments are outputs To illustrate the example of
326. ne Determine sampling interval delta_t two n 82 Chapter 3 Fourier Transforms IMSL Fortran 90 MP Library 4 0 t one i delta_t i 0 n 1 Compute pi pi atan one 4E0 indx i pi i 1 k Make up data set as a linear trend plus harmonics x a b t amp matmul exp cmp1x zero spread t 2 k spread indx 1 n kind one c Define least squares matrix data for a linear trend a_trend 1 1 one a_trend 1 2 t b_trend 1 1 x Solve for a linear trend call lin_sol_lsq a_trend b_trend x_trend Compute harmonic residuals r x reshape matmul a_trend x_trend n Transform harmonic residuals call c_fast_dft forward_in r forward_out f ip i i 1 n The dominant frequencies should be 2 through k 1 Sort the magnitude of the transform first call s_sort_real abs f temp iperm ip The dominant frequencies are output in ip 1 k Sort these values to compare with 2 through k l call s_sort_real real ip 1 k temp ip 1 k 1 i1 2 k 1 Check the results if count int temp 1 k ip 1 k 0 then write Example 2 for FAST_DFT is correct end if end Example 3 Several Transforms with Initialization In this example the optional arguments ido and work_array are used to save working variables in the calling program unit This results in maximum efficiency of the transform and its inverse since the workin
327. near system of equations 2 solving parametric linear systems with scalar change 68 sort and final move with a permutation 136 sorting an array 134 splines model a random number generator 106 system solving with Cholesky method 13 system solving with the generalized inverse 1 22 tensor product spline fitting of data 113 test for a regular matrix pencil 75 transforming array of random complex numbers 79 86 91 tridiagonal matrix solving 42 two dimensional data fitting 24 using inverse iteration for an eigenvector 1 14 examples list error messages 310 operator 173 parallel 206 exclusive OR 128 F factorization LU 2 FFT Fast Fourier Transform 82 88 94 FORTRAN 77 40 combining with Fortran 90 ii viii interface 40 Fortran 90 IMSL Fortran 90 MP Library combining with FORTRAN 77 viii language ii rank 1 array ii rank 2 array vi real time clock 129 Fushimi 128 129 G Galerkin principle 43 generalized eigenvalue 47 61 71 158 feedback shift register GFSR 127 inverse matrix 18 20 22 generalized inverse system solving 1 22 generator 123 129 132 generic root name ii getting started iv GFSR algorithm 128 Golub 5 12 22 25 50 52 54 58 61 66 GSVD 52 H Hanson 58 harmonic series 82 88 Hessenberg matrix upper 62 67 High Performance Fortran HPF 231 histogram 123 130 Householder 74 IEEE 164 165 IMSL Fortran 90 MP Library generic root nam
328. nerate a random self adjoint matrix call rand_gen y a a IMSL Fortran reshape y n n a transpose a 90 MP Library 4 0 Chapter 2 Singular Value and Eigenvalue Decomposition 59 Compute just the eigenvalues call lin_eig_self a d do i l k Define a temporary array to hold the matrices A igenvalue l temp a do j 1 n temp j j temp j j d i end do Use packaged option to reset the value of a small diagonal iopti l d_options d_lin_sol_self_set_small amp epsilon one abs d i Use packaged option to skip singularity messages iopti 2 d_options d_lin_sol_self_no_sing_mess amp zero call rand_gen b 1 n 1 call lin_sol_self temp b v 1l i i amp Lopt iopti end do Orthogonalize the eigenvectors do i l k big maxval abs v 1 i v l i v 1 1 big v l i v 1 i sqrt sum v 1 i 2 if i k cycle v 1 i 1 k v 1 i 1 k amp spread meena v 1 1 v 1 it1l k 1 n amp spread v 1 1 2 k i end do do i k 1 1 1 v l itl k v 1 itl k amp spread matmul v 1 1 v 1 it1l k 1 n amp spread v 1 1 2 k i end do Check the results for both orthogonality of vectors and small residuals res 1 k 1 k matmul transpose v v do i 1 k res i i res i i on end do err sum abs res k 2 res matmul a v v spread d 1 k 1 n if err lt sqrt epsilon one then if
329. nes. Optionally, this information can be provided by reverse communication. These forms of the interface are explained below and illustrated with examples. Users may turn directly to the examples if they are comfortable with the description of the algorithm.

Algorithm Summary

The equation $u_t = f(u, x, t)$, $x_L < x < x_R$, $t > t_0$, is approximated at $N$ time-dependent grid values $x_L \equiv x_0 < \cdots < x_i(t) < x_{i+1}(t) < \cdots < x_N \equiv x_R$. Using the total differential

$$\frac{du}{dt} = u_t + u_x \frac{dx}{dt}$$

transforms the differential equation to

$$\frac{du}{dt} - u_x \frac{dx}{dt} = u_t = f(u, x, t).$$

Using central divided differences for the factor $u_x$ leads to the system of ordinary differential equations in implicit form

$$\frac{dU_i}{dt} - \frac{U_{i+1} - U_{i-1}}{x_{i+1} - x_{i-1}} \frac{dx_i}{dt} = F_i, \quad t > t_0, \quad i = 1, \ldots, N.$$

The terms $U_i$, $F_i$ respectively represent the approximate solution to the partial differential equation and the value of $f(u, x, t)$ at the point $(x, t) = (x_i(t), t)$. The truncation error is second-order in the space variable, $x$. The above ordinary differential equations are underdetermined, so additional equations are added for the variation of the time-dependent grid points. It is necessary to discuss these equations, since they contain parameters that can be adjusted by the user. Often it will be necessary to modify these parameters to solve a difficult problem. For this purpose the following quantities are defined: $\Delta x_i = x_{i+1} - x_i$,
330. nk 1 arrays the number of displayed digits is reset from 4 to the value 7 and the subscripts for the array are reset so they match their declared extent when printed The output is not shown use show_int use rand_int implicit none This is Example 1 for SHOW integer parameter real kind 1le0 real kind 1d0 complex kind le0 complex kind 1d0 integer i_x n type ee eae si d gi The data types printed are real kind le0 complex kind 1e0 PEER real kind 1d0 complex kind 1d0 and INTEG R Fill with randsom numbers and then print the contents s_x rand s_x s_m rand d_x rand d_x d_m rand c_x rand c_x c_m rand z_x rand z_x z_m rand z_1 i_x 100 rand s_ xl en 3 call show s_x Rank 1 call show s_m Rank 2 call show d_x Rank 1 call show d_m Rank 2 call show c_x Rank 1 call show c_m Rank 2 call show z_x Rank 1 call show z_m Rank 2 call show i_x Rank 1 call show i_m Rank 2 Show 7 digits per number and of th in each case with a label s_m d_m _m m 100 rand s_m REAL REAL DOUBL DOUBLI COMP LI COMPL DOUBL DOUBLE INTEG INTEG r r oN x1 w E a o mS COMPL COMPL ER ER according to the natural or declared siz IMSL Fortran 90 MP Library 4 0
331. nly 1 use_lin_sol_lsq_only skip_error_processing ix_options_for_lin_sol_gen 2 3 ix_options_for_lin_sol_lsq 4 5 Derived Type Name of Unallocated Array le options le anvont s_options s_invx_options_once d_options d_invx_options d_options d_invx_options_once s_options s_xinv_options s_options s_xinv_options_once d_options d_xinv_options d_options d_xinv_options_once IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 153 Examples Compute the matrix times vector y Alyx y A ix x Compute the vector times matrix y xA y x xi A Compute the matrix expression D B A C D B A ix C CHOL Compute the Cholesky factorization of a positive definite symmetric or self adjoint matrix A The factor is upper triangular R TR A Required Argument This function requires one argument This argument must be a rank 2 or rank 3 array that contains a positive definite symmetric or self adjoint matrix For rank 3 arrays each rank 2 array for fixed third subscript is a positive definite symmetric or self adjoint matrix In this case the output is a rank 3 array of Cholesky factors for the individual problems Modules Use the appropriate one of the modules chol_int or linear_operators Optional Variables Reserved Names This function uses lin_sol_self See Chapter 1 Linear Solvers
332. o_pi end do Obtain uniform random numbers in 0 1 call rand_gen rn Use Newton s method to solve the nonlinear equation accumulated_distribution_function random_number 0 x zero k 0 solve_equation do f sin x x two_pithalf rn fprime onetcos x two_pi dx f fprime x x dx k k 1 if maxval abs dx lt sqrt epsilon one amp or k gt COUNT exit solve_equation end do solve_equation Map the random numbers x array into the counts array do i 1 n_samples i_map int x i omegatoffset 1 counts i_map counts i_map tone end do Normalize the counts array counts counts n_samples Check that the generated random numbers are indeed based on the original distribution if maxval abs counts 1 probabilities 1 amp lt tolerance then write a Example 4 for RAND_GEN is correct end if Generate 30 random numbers in pi pi according to the probability density cos x 1 2 pi pi lt x lt pi call rand_gen rn_30 x_30 0 0 k 0 solve_equation_30 do f_30 sin x_30 x_30 two_pithalf rn_30 fprime_30 onetcos x_30 two_pi dx_30 f_30 fprime_30 IMSL Fortran 90 MP Library 4 0 Chapter 5 Utilities e 133 x_30 x_30 dx_30 if maxval abs dx_30 lt sqrt epsilon one amp or k gt COUNT exit solve_equation_30 end do solve_equation_30 write A Thirty random numbers generated amp according to t
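The sampling step in this example inverts the accumulated distribution function with Newton's method. For the density (cos x + 1)/(2 pi) on (-pi, pi), the accumulated distribution is F(x) = (x + sin x)/(2 pi) + 1/2, so each uniform number rn is mapped to the root of F(x) - rn = 0. The sketch below isolates that step with intrinsics only: random_number stands in for rand_gen, and the iteration cap max_iter and the stopping tolerance are assumed values, not the ones used in the library example.

```fortran
program newton_inverse_cdf
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 30, max_iter = 50
  real(dp), parameter :: one = 1d0, half = 0.5d0
  real(dp) :: two_pi, rn(n), x(n), f(n), fprime(n), dx(n)
  integer :: k

  two_pi = 8*atan(one)
  call random_number(rn)        ! uniform numbers in [0,1)

  ! Newton's method applied to all n equations at once, starting at 0.
  x = 0
  do k = 1, max_iter
     f      = (sin(x) + x)/two_pi + half - rn   ! F(x) - rn
     fprime = (one + cos(x))/two_pi             ! the density itself
     dx     = f/fprime
     x      = x - dx
     if (maxval(abs(dx)) <= sqrt(epsilon(one))) exit
  end do

  ! Report how well the equations are satisfied; no claim of convergence
  ! is made when the iteration cap is reached.
  f = (sin(x) + x)/two_pi + half - rn
  write (*,*) 'iterations used :', min(k, max_iter)
  write (*,*) 'max |F(x) - rn| :', maxval(abs(f))
end program
```

Because the derivative of F is the density, it vanishes at the ends of the interval, so uniform numbers very close to 0 or 1 converge slowly; that is the reason for keeping an iteration cap alongside the step-size test.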
333. ock cyclic matrix. The data type for A is any of five Fortran intrinsic types: integer, single precision real, double precision real, single precision complex, and double precision complex.

Optional Arguments

Format (Input)
   A character variable containing a format to be used for writing the file that receives matrix data. If this argument is not present, an unformatted, or list-directed, write is used.

iopt (Input)
   Derived type array with the same precision as the array A; used for passing optional data to ScaLAPACK_WRITE. Use single precision when A is type INTEGER. The options are as follows:

Packaged Options for ScaLAPACK_WRITE

Option Prefix   Option Name                     Option Value
s_, d_          ScaLAPACK_WRITE_UNIT            1
s_, d_          ScaLAPACK_WRITE_FROM_PROCESS    2
s_, d_          ScaLAPACK_WRITE_BY_ROWS         3

iopt(IO) = ScaLAPACK_WRITE_UNIT
   Sets the unit number to the integer component of iopt(IO + 1)%idummy. The default unit number is the value 11.

iopt(IO) = ScaLAPACK_WRITE_FROM_PROCESS
   Sets the process number that writes the named file to the integer component of iopt(IO + 1)%idummy. The default process number is the value 0.

iopt(IO) = ScaLAPACK_WRITE_BY_ROWS
   Write the matrix by rows to the named file. By default the matrix is written by columns.

Algorithm

Subroutine ScaLAPACK_WRITE writes columns or rows of a problem matrix output by a
334. of EXP A t y_0O y t y X x exp spread d nr 2 k spread t 1 n spread z_0 nr 2 k This is y derived by differentiating y t y_prime X x amp spread d nr 2 k exp spread d nr 2 k spread t 1 n amp spread z_0 nr 2 k Check iesulles ts WY Ay OF err norm y_prime A x y sizes norm y_prime norm A norm y if ALL err lt sqrt epsilon one sizes and MP_RANK 0 amp write Parallel Example 4 is correct See to any error messages and quit MPI MP_NPROCS MP_SETUP Final end Parallel Example 5 6 comments The computations performed in these examples are for linear least squares solutions There is use of the box data type and MPI Otherwise these are similar to Parallel Examples 1 2 except they use alternate operators and functions Any number of nodes can be used Parallel Example 5 use linear_operators use mpi_setup_int implicit none This is Parallel Example 5 using box data types operators GUM IEVUACTE LOIS 5 integer parameter m 64 n 32 nr 4 real kind le0 one le0 err nr realki e0 e elatiansiousialoyot Ganny 8a IN Ion Se real kind le0 dimension m n nr Cy cl Setup for MPI mp_nprocs mp_setup Generate two rectangular random matrices only at the root node if mp_rank 0 then 210 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0
335. olving a large least squares system with simple bounds, using parallel computing.
Move data from a file to Block-Cyclic form, for use in ScaLAPACK.
Move data from Block-Cyclic form, following use in ScaLAPACK, to a file.
Routine for integrating an initial-value PDE problem with one space variable.

See Chapter 7   See Chapter 7   See Chapter 7   See Chapter 7   See Chapter 8
E1a E2b K1a1b K1a2 D9a1 K1a2a K1a2a N4 N4 Ral I2a2

Remarks

The GAMS classification scheme is detailed in Boisvert et al. (1985). Other categories for mathematical software are available on the Internet through the World Wide Web. The current address is http://gams.nist.gov/.

Appendix B: List of Examples

Example
lin_sol_gen_ex1    lin_sol_gen_ex2    lin_sol_gen_ex3    lin_sol_gen_ex4
lin_sol_self_ex1   lin_sol_self_ex2   lin_sol_self_ex3   lin_sol_self_ex4
lin_sol_lsq_ex1    lin_sol_lsq_ex2    lin_sol_lsq_ex3    lin_sol_lsq_ex4
lin_sol_svd_ex1    lin_sol_svd_ex2

Readers can locate a sample program that will help them when using IMSL Fortran 90 MP Library within their application codes. Not all examples are listed here. Note the "Operator Examples" section in Chapter 6. The 37 programs in this suite use defined operations and generic functions to implement many of the examples shown below. The Parallel Examples
336. on Use the function exp x 2 y 2 on the square 0 2 x 0 2 for samples The spline order is nord and the number of cells is ngrid 1 2 There are ndata data values in the square integer i integer parameter ngrid 9 nord 4 ndegree nord l amp nbkpt ngrid 2 ndegree ndata 2000 nvalues 100 real kind 1d0 parameter zero 0d0 one 1d0 two 2d0 real kind 1d0 parameter TOLERANCE 1d 3 real kind 1d0 target spline_data 4 ndata bkpt nbkpt amp coeff ngridtndegree 1 ngridtndegr 1 delta sizev amp x nvalues y nvalues values nvalues nvalues real kind 1d0 pointer pointer_bkpt type d_spline_knots knotsx knotsy Generate random x y pairs and evaluate the exampl xponential function at these values spline_data 1 2 two rand spline_data 1 2 spline_data 3 exp sum spline_data 1 2 2 dim 1 spline_data 4 one Define the knots for the tensor product data fitting problem delta two ngrid 1 bkpt l ndegree zero bkpt nbkpt ndegreet l nbkpt two bkpt nord nbkpt ndegree i delta i 0 ngrid 1 Assign the degr of the polynomial and the knots pointer_bkpt gt bkpt knotsx d_spline_knots ndegree pointer_bkpt knotsy knotsx Fit the data and obtain the coefficients coeff surface_fitting spline_data knotsx knotsy Evaluate the residual spline function at a grid of points ins
337. on on writing a more compact and readable code, see Chapter 6.

Important Note: Please refer to the Table of Contents for locations of chapter references, example references, and function references.

User Background
To use this product you should be familiar with the Fortran 90 language as well as the FORTRAN 77 language, which is, in practice, a subset of Fortran 90. A summary of the ISO and ANSI standard language is found in Metcalf and Reid (1990). A more comprehensive illustration is given in Adams et al. (1992).
Those routines implemented in the IMSL Fortran 90 MP Library provide a simpler, more reliable user interface than is possible with FORTRAN 77 IMSL Numerical Libraries products. Features of the IMSL Fortran 90 MP Library include the use of descriptive names, short required argument lists, packaged user-interface blocks for the Fortran 90 routines, interface blocks for the entire FORTRAN 77 Numerical Libraries, a suite of testing and benchmark software, and a collection of examples. Source code is provided for the benchmark software and examples.
The IMSL Fortran 90 MP Library routines are designed with considerable flexibility. The design also lets users ignore these extras when they are not needed.

Using Library Subprograms
Each routine in the IMSL Library has a generic root name that abbreviates its funct
338. one/real(n, kind(one))
      do j=1, n+1
        t(j) = (j-1)*delta_t
      end do
! Compute collocation points.
      s = m
      solve_equations: do
        s = s - (exp(-s) - (one - s*g))/(g - exp(-s))
        if (sum(abs((one - exp(-s))/s - g)) <= &
            epsilon(one)*sum(g)) &
          exit solve_equations
      end do solve_equations
! Evaluate the integrals over the quadrature points.
      a = (exp(-spread(t(1:n),1,m)*spread(s,2,n)) &
         - exp(-spread(t(2:n+1),1,m)*spread(s,2,n))) / &
           spread(s,2,n)
      b(1:,1) = g
! Compute the singular value decomposition.
      call lin_sol_svd(a, b, f, nrhs=0, &
        rank=k, u=U_S, v=V_S, s=S_S)
! Singular values that are larger than epsilon determine the rank=k.
      k = count(S_S > epsilon(one))
      oldrms = huge(one)
      g = matmul(transpose(U_S), b(1:m,1))
! Find the minimum number of singular values that gives a good
! approximation to f(t) = 1.
      do i=1, k
        f(1:n,1) = matmul(V_S(1:,1:i), g(1:i)/S_S(1:i))
        f = f - one
        rms = sum(f**2)/n
        if (rms > oldrms) exit
        oldrms = rms
      end do
      write (*,"( ' Using this number of singular values, ', &
        &i4, ' the approximate R.M.S. error is ', 1pe12.4)") &
        i-1, oldrms
      if (sqrt(oldrms) <= delta_t**2) then
        write (*,*) 'Example 4 for LIN_SOL_SVD is correct.'
      end if
      end

Fatal, Terminal, and Warning Error Messages
See the messages.gls file for error messages for lin_sol_svd. These err
339. onger than it needs to be due to the unused MPI subprograms now in the linked executable. If the extra size of the executable is a problem, then link with the older version.
The user is cautioned about manipulating these routines beyond specification. Disabling the printing of messages or the subprogram stack handler can mask serious error conditions. Modification or replacement of functionality by the user within Fortran 90 MP Library can cause problems that are elusive and is definitely not recommended. The routines described in this chapter are an integral part of the IMSL Fortran 90 MP Library and are subject to change by Visual Numerics, Inc.

Error Classes
The routines in the IMSL FORTRAN Libraries give rise to three classes of error conditions: informational, terminal, and global. The correct processing of an error condition depends on its class. The classes are defined as follows:
• Information Class. During processing, certain exceptional conditions arise which may be interpreted as errors. The detection of singularity by a linear equation solver is an example. It is appropriate for the routine detecting a condition to inform the calling routine of the existence of the exception by setting an appropriate error state and then returning. The calling routine is then able to interpret the information and decide on the appropriate action. If r
340. oning This form of the decomposition assumes that the matrix A D B has all its singular values strictly positive For alternate problems where some singular values of D are zero the GSVD becomes UTA diag c W and V B diag s 5 W The matrix W has the same singular values as the matrix D Also see operator _ex23 Chapter 6 use lin_svd_int use rand_gen_int implicit none This is Example 3 for LIN_SVD integer parameter n 32 integer i real kind 1d0 parameter one 1 0d0 real kind 1d0 a n n b n n dA 2 n n x n n u_d 2 n 2 n amp 52 Chapter 2 Singular Value and Eigenvalue Decomposition IMSL Fortran 90 MP Library 4 0 v_d n n vic n n u_c n n v_s n n u_s n n amp y n n s_d n c n s n sc_c n sc_s n amp errl err2 Generate random square matrices for both A and B call rand_gen y a reshape y n n call rand_gen y b reshape y n n Construct D A is on the top B is on the bottom d 1 n 1 n a d n 1 2 n 1 n b Compute the singular value decompositions used for the GSVD call lin_svd d call lin_svd u_ call lin_svd u_ F S_O U_O a Len len eine en By esp Vas Rearrange c so it is non increasing Move singular vectors accordingly The use of temporary objects sc_c and x is required Sc c nsli 1 s e x vethen ns is 1 Gc x yvlet Lin nrlt l1 wie amp The columns o
341. ons it is necessary to evaluate the integrals, which are computed with the values of u(x,t) on the grid. The integrals are approximated using the trapezoid rule, commensurate with the truncation error in the integrator.

Rationale
This is a non-linear integro-differential problem involving non-local conditions for the differential equation and boundary conditions. Access to evaluation of these conditions is provided using reverse communication. It is not possible to solve this problem with forward communication, given the current subroutine interface. Optional changes are made to use an absolute error tolerance and non-zero time-smoothing. The time-smoothing value τ = 1 prevents grid lines from crossing.

      program PDE_1D_MG_EX03
! Population Dynamics Model.
      USE PDE_1d_mg_int
      USE ERROR_OPTION_PACKET
      IMPLICIT NONE
      INTEGER, PARAMETER :: NPDE=1, N=101
      INTEGER IDO, I, NFRAME
! Define array space for the solution.
      real(kind(1d0)) U(NPDE+1,N), MID(N-1), T0, TOUT, V_1, V_2
      real(kind(1d0)) :: ZERO=0D0, HALF=5D-1, ONE=1D0, &
        TWO=2D0, FOUR=4D0, DELTA_T=1D-1, TEND=5D0, A=5D0
      TYPE(D_OPTIONS) IOPT(3)
! Start loop to integrate and record solution values.
      IDO=1
      DO
        SELECT CASE (IDO)
! Define values that determine limits.
        CASE (1)
          O ZE OUT U NPD OPEN NFRA WRI
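The trapezoid rule mentioned above can be sketched in a few lines of standard Fortran 90. This is a minimal illustration, not the manual's example code: the program name, the sample integrand u = x**2, and the uniform grid are assumptions chosen so the result can be compared with the exact value 1/3. The summation formula itself also applies to a non-uniform grid such as the moving grid used by the integrator.

```fortran
program trapezoid_on_grid
  implicit none
  integer, parameter :: dp = kind(1d0), n = 101
  real(dp) :: x(n), u(n), integral
  integer :: i

  ! Uniform grid on [0, 1] for illustration (an assumption); the
  ! formula below does not require uniform spacing.
  do i = 1, n
     x(i) = real(i-1, dp) / real(n-1, dp)
  end do

  ! Sample integrand (hypothetical); its exact integral on [0, 1] is 1/3.
  u = x**2

  ! Composite trapezoid rule:
  !   sum over cells of 0.5 * (x(i+1) - x(i)) * (u(i+1) + u(i))
  integral = 0.5_dp * sum((x(2:n) - x(1:n-1)) * (u(2:n) + u(1:n-1)))

  print *, 'trapezoid estimate =', integral
  print *, 'error              =', abs(integral - 1.0_dp/3.0_dp)
end program trapezoid_on_grid
```

The error of this rule is second order in the cell width, which is the sense in which it is commensurate with piece-wise linear interpolation on the grid.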
342. onstraining Some Points using a Spline Surface
Example 4: Constraining a Spline Surface to be non-Negative

Introduction
The following describes routines for fitting or smoothing sets of discrete data by a sum of B-splines, in one dimension, or a tensor-product of B-splines, in two dimensions. First-time users are advised to see IMSL (1994, pp. 413-414) and de Boor (1978) for the basics about B-splines. The sense of the approximation is weighted least-squares data fitting. We have included the capability of enforcing constraints on the resulting function. For the two-dimensional problem, we provide regularization of the least-squares surface fitting problem, and we allow users to change the default values of the parameters.
We provide controls for users to shape resulting curves or surfaces based on other information about the problem that cannot be easily expressed as least-squares data fitting. For instance, a user may want the fitted curve to be monotone decreasing, everywhere non-negative, and with a specified sign for the second derivative in sub-intervals. Example 2 for the routine spline_fitting presents a curve fitting problem with these constraints. Example 4 for the routine surface_fitting gives an example of constraining a surface to be non-negative.

One-Dimensional Smoothing Check List
For data fitting or smoothing
343. options(z_lin_eig_gen_in_Hess_form, zero)
! Compute complex eigenvalues of the companion matrix.
      call lin_eig_gen(a, e, iopt=iopti)
      f = one; fg = one
! Use Horner's method for evaluation of the complex polynomial
! and size gauge at all roots.
      do i=1, n
        f = f*e + b(i)
        fg = fg*abs(e) + abs(b(i))
      end do
! Check for small errors at all roots.
      err = sum(abs(f)/fg)/n
      if (err <= sqrt(epsilon(one))) then
        write (*,*) 'Example 2 for LIN_EIG_GEN is correct.'
      end if
      end

Example 3: Solving Parametric Linear Systems with a Scalar Change
The efficient solution of a family of linear algebraic equations is required. These systems are $(A + hI)x = b$. Here $A$ is an $n \times n$ real matrix, $I$ is the identity matrix, and $b$ is the right-hand side matrix. The scalar $h$ is such that the coefficient matrix is nonsingular. The method is based on the Schur form for matrix $A$: $AW = WT$, where $W$ is unitary and $T$ is upper triangular. This provides an efficient solution method for several values of $h$, once the Schur form is computed. The solution steps solve, for $y$, the upper triangular linear system $(T + hI)y = W^{*}b$. Then, $x = x(h) = Wy$. This is an efficient and accurate method for such parametric systems provided the expense of computing the Schur form has a pay-off in later efficiency. Using the Schur form in this way, it is not required to compute an LU factorization of $A + hI$ with
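The two solution steps follow from the Schur form by a short calculation. The derivation below is a sketch that assumes the shift enters as $A + hI$ (the sign is not legible in this copy of the text) and that $W^{*}$ denotes the conjugate transpose of $W$.

```latex
\begin{aligned}
AW = WT,\quad W^{*}W = I
  \quad&\Longrightarrow\quad A + hI = W\,(T + hI)\,W^{*},\\[2pt]
(A + hI)\,x = b
  \quad&\Longrightarrow\quad (T + hI)\,(W^{*}x) = W^{*}b,\\[2pt]
y := W^{*}x
  \quad&\Longrightarrow\quad (T + hI)\,y = W^{*}b, \qquad x = Wy .
\end{aligned}
```

Since $T + hI$ is upper triangular for every $h$, each new value of $h$ costs only a back-substitution and two matrix products, while the Schur factors $W$ and $T$ are computed once.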
344. or messages are numbered 401-412; 421-432; 441-452; 461-472.

lin_sol_tri
Solves multiple systems of linear equations $A_j x_j = y_j$, $j = 1, \ldots, k$. Each matrix $A_j$ is tridiagonal with the same dimension, n. The default solution method is based on LU factorization computed using cyclic reduction or, optionally, Gaussian elimination with partial pivoting.

Required Arguments
C (Input/Output)
Array of size 2n x k containing the upper diagonals of the matrices $A_j$. Each upper diagonal is entered in array locations C(1:n-1, j). The data C(n, 1:k) are not used.
D (Input/Output)
Array of size 2n x k containing the diagonals of the matrices $A_j$. Each diagonal is entered in array locations D(1:n, j).
B (Input/Output)
Array of size 2n x k containing the lower diagonals of the matrices $A_j$. Each lower diagonal is entered in array locations B(2:n, j). The data B(1, 1:k) are not used.
Y (Input/Output)
Array of size 2n x k containing the right-hand sides, $y_j$. Each right-hand side is entered in array locations Y(1:n, j). The computed solution $x_j$ is returned in locations Y(1:n, j).
Note: The required arguments have the Input data overwritten. If these quantities are used later, they must be saved in user-defined arrays. The routine uses each array's locations (n+1:2*n, 1:k) for scratch storage and intermediate data in the LU factorization. The default values for proble
345. or error message can be changed without changing the others An actual argument value of 1 for the error type or error code causes the particular item to retain its current setting e The next reference changes the message and retains the type and code settings E1M CALL ES 1 1 new message e The next reference changes the error code and retains the type and message settings 1M CALL Bo tale eer erode S Y e The next reference removes the error state E1M IMSL Fortran 90 MP Library 4 0 Chapter 9 Error Handling and Messages The Parallel Option 307 e Values can be inserted into messages by the use of one of the following subroutines CALL E1STL ii literalstring CALL E1STA ai characterarray CATO Gal ainetllauyS CALL E1STR ri rvalue CALL E1STD di dvalue CALL ELSTC ci cvalue CALL BUSTA t2i 2value The current values of the parameters are expanded and placed in the text of the message This happens at the respective places indicated with the symbols Li Ai Ii S Ri S Di Ci and Zi Case of the letters L A I R D C and Z is not important The trailing indices i are integers between 1 and 9 with one exception Use of a negative value for ii in a call to E1STL keeps trailing blanks in literalstring To improve readability of messages we have provided that when the
346. or_lin_eig_self options_for_lin_geig_gen options_for_lin_eig_gen skip_error_processing

Derived Type    Name of Unallocated Array
s_options       s_eig_options(:)
s_options       s_eig_options_once(:)
d_options       d_eig_options(:)
d_options       d_eig_options_once(:)

Example
Compute the maximum magnitude eigenvalue of a square matrix A. The values are sorted by EIG() to be non-increasing in magnitude.
      E = EIG(A); max_magnitude = abs(E(1))
Compute the eigenexpansion of a square matrix B.
      E = EIG(B, W = W); B = W .x. diag(E) .xi. W

EYE
Create a rank-2 square array whose diagonals are all the value one. The off-diagonals all have value zero.

Required Argument
This function requires one integer argument, the dimension of the rank-2 array. The output array is of type and kind REAL(KIND(1E0)).

Modules
Use the appropriate module: eye_int or linear_operators.

Optional Variables, Reserved Names
This function has neither packaged optional variables nor reserved names.

Example
Check the orthogonality of a set of n vectors, Q.
      e = norm(EYE(n) - (Q .hx. Q))

FFT
The Discrete Fourier Transform of a complex sequence and its inverse transform.

Required Argument
The function requires one argument, x. If x is an assumed
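For readers without the operator module at hand, the orthogonality check in the EYE example can be imitated with Fortran 90 intrinsics only. The sketch below is an assumption-laden stand-in, not library code: the identity matrix is built by hand in place of EYE(n), `matmul(transpose(q), q)` replaces `Q .hx. Q` for a real matrix, and the Frobenius norm is used where the library's `norm` may compute a different matrix norm.

```fortran
program orthogonality_check
  implicit none
  integer, parameter :: n = 3
  real :: q(n,n), eye(n,n), e, theta
  integer :: i

  ! Identity matrix built by hand (stands in for EYE(n)).
  eye = 0.0
  do i = 1, n
     eye(i,i) = 1.0
  end do

  ! Test matrix (hypothetical): a plane rotation embedded in the
  ! identity, which is orthogonal by construction.
  theta = 0.3
  q = eye
  q(1,1) = cos(theta)
  q(1,2) = -sin(theta)
  q(2,1) = sin(theta)
  q(2,2) = cos(theta)

  ! Frobenius norm of I - Q^T Q.
  e = sqrt(sum((eye - matmul(transpose(q), q))**2))

  if (e <= sqrt(epsilon(e))) then
     print *, 'Columns of Q are orthonormal to working precision; e =', e
  else
     print *, 'Orthogonality check failed; e =', e
  end if
end program orthogonality_check
```

A value of e near the machine epsilon indicates orthonormal columns; the tolerance sqrt(epsilon) used here mirrors the safety margin the manual applies in its own example checks.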
347. ortional to M x N MBx NB A temporary local buffer is allocated for staging the matrix data It is of size M by NB when reading by columns or N by MB when reading by rows ScaLAPACK_WRITE This routine writes the matrix data to a file The data is transmitted from the two dimensional block cyclic form used by ScaLAPACK This routine contains a call to a barrier routine so that if one process is writing the file and an alternate process is to read it the results will be synchronized All processors in the BLACS context call the routine Required Arguments File _Name Input A character variable naming the file to receive the matrix data This file is opened with STATUS UNKNOWN If any access violation happens a type terminal error message will occur If the file already exists it will be overwritten After the contents are written the file is closed This file is written with a loop logically equivalent to groups of writes WRITE BUFFER I J I 1 M J 1 NB or optionally WRITE BUFFER I J J 1 N I 1 MB DESC_A Unput IMSL Fortran 90 MP Library 4 0 Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers 235 The nine integer parameters associated with the ScaLAPACK matrix descriptor Values for NB MB LDA are contained in this array A LDA Input This is an assumed size array with leading dimension LDA containing this processor s piece of the bl
348. ot will achieve convergence which controls program flow out of the loop Therefore the nodes must share the root s view of convergence and that is the reason for the broadcast of the update from root to the nodes Note that when writing an explicit call to an MPI routine there must be the line INCLUDE mpif h placed just after the IMPLICIT NONE statement Any number of nodes can be used use linear_operators use mpi_setup_int implicit none IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 207 UNCTUDE EMOR This is Parallel Example 3 for i and iterative refinement with box date types operators and functions integer parameter n 32 nr 4 integer IERROR real kind 1le0 one le0 zero 0e0 real kind 1le0 INGoipiap iia Joey ik awe xe Gary I praise real kind le0 change_old nr change_new nr real kind 1d0 CL wSico UCl Clint ik inte 5 ID Gat pie wre y als Lp inti l SS ror MIB IL MP_NPROCS MP_SETUP Generate a random matrix and right hand side A rand A b rand b Save double precision copies of the matrix and right hand side D A iS b Get single precision inverse to compute the iterative refinement A ily IA Start solution at zero Update it to a more accurate solution with each iteration y d_zero change_old huge one ITERATIVE_REFINEMENT DO Compute the
349. ould be positive It now has value 0 FORWARD Calls MP_SETUP B_Name Error Types and Codes 0 0 4 2 Example 3 This example shows an error when the program unit is in three states The STOP conditions for all error types are changed to NO using the call to routine E1POS program errpex3 In the first state MPI has not been initialized Thus each node writes its own identical copy of the error message The lines may be jumbled in some environments even though that is not the case here There is no indication about the node where the message occurred In the second state MPI is initialized Error messages are gathered and printed as shown in Example 1 In the third state MPI has been used and finalized The executable running on the alternate node is gone and further calls to MPI routines are invalid One error message from the remaining executable prints USE MESES Ey U aaNet IMPLICIT NONE Make calls to the and after using M VNI error processor before whil PARS An example is a call to a routine that expects a positive value for t Set SEOP at he INT CALL E1POS 0 1 EGER argument tribute to NO 0 Before MPI is initialized each node prints the error Lines may be jumbled IMSL Fortran 90 MP Library 4 0 Chapter 9 Error Handling and Messages The Parallel Option 313 CALL C_name 2 Pae MP_NPROCS MP_SETUP CALL C_Name 0 ia
350. pack. Generally the assignments and logical operations refer only to component idummy. The assignment s_epack(1) = 0 is equivalent to s_epack(1) = s_error(0, 0E0). Thus, the floating-point component rdummy is assigned the value 0E0. The assignment statement I = s_epack(1), for I an integer type, is equivalent to I = s_epack(1)%idummy. The value of component rdummy is ignored in this assignment. For the logical operators, a single element of any of the IMSL Fortran 90 MP Library derived types can be in either the first or second operand.

Derived Type    Overloaded Assignments and Tests
s_options       I = s_options(1); s_options(1) = I      ==  /=  <  <=  >  >=
d_options       I = d_options(1); d_options(1) = I      ==  /=  <  <=  >  >=
s_epack         I = s_epack(1); s_epack(1) = I          ==  /=  <  <=  >  >=
d_epack         I = d_epack(1); d_epack(1) = I          ==  /=  <  <=  >  >=

In the examples operator_ex01, ..., _ex37, the overloaded assignments and tests have been used whenever they improve the readability of the code.

Operator Examples
This section presents an equivalent implementation of the examples in Linear Solvers, Singular Value and Eigenvalue Decomposition, and a single example from Fourier Transforms (Chapters 1 and 2, and a single example from Chapte
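The mechanism behind such overloaded assignments is the standard Fortran 90 `interface assignment(=)`. The sketch below shows the idea with a hypothetical type `toy_error`; it is not the library's definition of s_epack or s_error, only an illustration of how an integer can be assigned to, and read from, a derived type with an integer and a floating-point component.

```fortran
module toy_error_type
  implicit none

  ! Hypothetical stand-in for a library derived type that carries an
  ! integer component and a floating-point component.
  type toy_error
     integer :: idummy
     real    :: rdummy
  end type toy_error

  interface assignment(=)
     module procedure toy_from_int, int_from_toy
  end interface

contains

  ! toy = I : sets idummy to I and rdummy to zero.
  subroutine toy_from_int(lhs, rhs)
    type(toy_error), intent(out) :: lhs
    integer, intent(in)          :: rhs
    lhs%idummy = rhs
    lhs%rdummy = 0.0
  end subroutine toy_from_int

  ! I = toy : copies idummy and ignores rdummy.
  subroutine int_from_toy(lhs, rhs)
    integer, intent(out)        :: lhs
    type(toy_error), intent(in) :: rhs
    lhs = rhs%idummy
  end subroutine int_from_toy

end module toy_error_type

program demo_assignment
  use toy_error_type
  implicit none
  type(toy_error) :: e(2)
  integer :: i

  e(1) = 7      ! same effect as e(1) = toy_error(7, 0.0)
  i = e(1)      ! same effect as i = e(1)%idummy
  print *, i, e(1)%rdummy
end program demo_assignment
```

Overloaded comparison operators follow the same pattern with `interface operator(==)` and similar, each implemented as a function that compares the integer components.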
351. parallel_hx_bench f90 d_parallel_hx_bench f90 8 time_parallel_xh f90 s_parallel_xh_bench f90 d_parallel_xh_bench f90 9 time_parallel_chol f90 s_parallel_chol_bench f90 d_parallel_chol_bench f90 10 time_parallel_cond f90 s_parallel_cond_bench f90 d_parallel_cond_bench f90 11 time_parallel_rank f90 s_parallel_rank_bench f90 d_parallel_rank_bench f90 Table D Parallel and non Parallel Box Comparisons D 8 Appendix D Benchmarking or Timing Programs IMSL Fortran 90 MP Library 4 0 s_parall d_parall s_parall d_parall d_parall d_parall d_parall d_parall et et 13 time_paral lel_ lel_ 14 time_paral s_parallel_ el_ 15 time_paral s_parallel_ eis 16 time_paral lel_ lel 17 time_paral s_parallel el Number Program Units Function Timed 12 time_paral s_parall lel_det f 90 _det_bench f90 det_bench f90 DET A lel_orth f90 orth_bench f90 orht_bench f90 lel_svd 90 SVD A U U V V ORTH A R R svd_bench f90 svd_bench f90 lel_norm f90 NORM A TYPE 1 norm_bench f90 norm_bench f90 lel_eig f90 EIG A W W eig_bench f90 eig_bench f90 lel ff t 90 FFT_BOX A _fft_bench f90 _fft_bench f90 IFFT_BOX A Table D continued Parallel and non Parallel Box Comparisons IMSL Fortran 90 MP Library 4 0 Appendix D Benchmarking or Timing Programs
352. pe of constraint the tensor product spline function or its partial derivatives is to satisfy at the point where_applied. The choices are the character strings '==', '<=', '>=', '.=.', and '.=-'. They respectively indicate that the spline value or its derivatives will be equal to, not greater than, not less than, equal to the value of the spline at another point, or equal to the negative of the spline value at another point. These last two constraints are called periodic and negative-periodic, respectively.

Optional Arguments
derivative = derivative_index(1:2) (Input)
These are the number of the partial derivatives for the tensor product spline to apply the constraint. The array (/0,0/) corresponds to the function, the value (/1,0/) to the first partial derivative with respect to x, etc. If this argument is not present in the list, the value (/0,0/) is substituted automatically. Thus a constraint without the derivatives listed applies to the tensor product spline function.
periodic = periodic_point(1:2)
This optional argument improves readability by identifying the second pair of independent variable values for periodic constraints.

surface_values
This rank-2 array function returns a tensor product array result, given two arrays of independent variable values. Use the optional input argument for the covariance matrix when the square root of
353. pivoting.
Default: Use cyclic reduction to compute the factorization.

Description
The routine lin_sol_tri solves k systems of tridiagonal linear algebraic equations, each problem of dimension n x n. No relation between k and n is required. See Kershaw, pages 86-88 in Rodrigue (1982) for further details. To deal with poorly conditioned or singular systems, a specific regularizing term is added to each reciprocated value. This technique keeps the factorization process efficient and avoids exceptions from overflow or division by zero. Each occurrence of an array reciprocal $a^{-1}$ is replaced by the expression $(a + t)^{-1}$, where the array temporary $t$ has the value 0 whenever the corresponding entry satisfies $|a| > Small$. Alternately, $t$ has the value $2 \times jolt$. Every small denominator gives rise to a finite jolt. Since this tridiagonal solver is used in the routines lin_svd and lin_eig_self for inverse iteration, regularization is required. Users can reset the values of Small and jolt for their own needs. Using the default values for these parameters, it is generally necessary to scale the tridiagonal matrix so that the maximum magnitude has value approximately one. This is normally not an issue when the systems are nonsingular.
The routine is designed to use cyclic reduction as the default method for computing the LU factorization. Using an optional parameter, standard elimination and partial pivoting will be used to compute the factoriz
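The regularized reciprocal described above maps naturally onto a masked array operation. The following is a minimal sketch of that rule in standard Fortran 90; the threshold values assigned to `small` and `jolt` are hypothetical, since the library defaults are not stated in this passage, and the program is not taken from the library source.

```fortran
program regularized_reciprocal
  implicit none
  integer, parameter :: dp = kind(1d0)
  ! Hypothetical thresholds chosen for illustration only.
  real(dp), parameter :: small = 1.0e-8_dp, jolt = 1.0e-4_dp
  real(dp) :: a(4), t(4), r(4)

  ! Sample pivots, including an exact zero and a tiny value.
  a = (/ 2.0_dp, 0.0_dp, -0.5_dp, 1.0e-12_dp /)

  ! t is 0 where |a| > small, and 2*jolt everywhere else.
  t = merge(0.0_dp, 2.0_dp*jolt, abs(a) > small)

  ! Regularized reciprocal: entries with |a| <= small are shifted by
  ! 2*jolt before inversion, so no division by zero occurs here.
  r = 1.0_dp / (a + t)

  print '(4es12.4)', r
end program regularized_reciprocal
```

Entries with magnitude above the threshold are inverted exactly, while the zero and tiny entries yield a large but finite reciprocal, which is what keeps inverse iteration from raising overflow or divide-by-zero exceptions.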
354. presented with several examples. Many of the program features are exercised. The problems complete without any change to the optional arguments, except where these changes are required to describe or to solve the problem.
In many applications the solution to a PDE is used as an auxiliary variable, perhaps as part of a larger design or simulation process. The truncation error of the approximate solution is commensurate with piece-wise linear interpolation on the grid of values, at each output point. To show that the solution is reasonable, a graphical display is revealing and helpful. We have not provided graphical output as part of our documentation, but users may already have the Visual Numerics, Inc. product, PV-WAVE, not included with Fortran 90 MP Library. Examples 1-8 write results in files "PDE_ex0?.out" that can be visualized with PV-WAVE. We provide a script of commands, "pde_1d_mg_plot.pro", for viewing the solutions (see example below). The grid of values and each consecutive solution component is displayed in separate plotting windows. The script and data files written by examples 1-8 on a SUN-SPARC system are in the directory for Fortran 90 MP Library examples. When inside PV-WAVE, execute the command line "pde_1d_mg_plot, filename='PDE_ex0?.out'" to view the output of a particular example.

Code for PV-WAVE Plotting (Examples Directory)
355. ps Newton s method h Se s is applied to the array function h s e sg 1 where the following is true Pelee Note the coefficient matrix for the solution values takes whose entry at the intersection of row i and column j is equal to the value IMSL Fortran 90 MP Library 4 0 Chapter 1 Linear Solvers 31 t j l fea fj is explicitly integrated and evaluated as an array operation The solution analysis of the resulting linear least squares system Af g is obtained by computing the singular value decomposition A USV An approximate solution is computed with the transformed right hand side b U To followed by using as few of the largest singular values as possible to minimize the following squared error residual z 2 LU j j 1 This determines an optimal value k to use in the approximate solution k w ae f i jl Also see operator_ex16 Chapter 6 use lin_sol_svd_int use rand_gen_int use error_option_packet implicit none This is Example 4 for LIN_SOL_SVD integer i j k integer parameter m 64 n 16 real kind le0 parameter one le0 zero 0 0e0 real kind le0 g m s m t n 1 a m n b m 1 amp f n 1 U_S m m V_S n n S_S n amp rms oldrms real kind le0 delta_g delta_t delta_g one real m 1 kind one Compute which collocation equations to solve do i 1 m g i i delta_g end do Compute equally spaced quadrature points delta_t
356. putes a factorization of a random matrix using single-precision arithmetic. The double-precision solution is corrected using iterative refinement. The corrections are added to the developing solution until they are no longer decreasing in size. The initialization of the derived type array iopti(1:2) = s_options(0, 0.0e0) leaves the integer part of the second element of iopti(:) at the value zero. This stops the internal processing of options inside lin_sol_gen. It results in the LU factorization being saved after exit. The next time the routine is entered, the integer entry of the second element of iopt(:) results in a solve step only. Since the LU factorization is saved in arrays A(:,:) and ipivots(:), at the final step, solve-only steps can occur in subsequent entries to lin_sol_gen. Also, see operator_ex03, Chapter 6.

      use lin_sol_gen_int
      use rand_gen_int
      implicit none
! This is Example 3 for LIN_SOL_GEN.
      integer, parameter :: n=32
      real(kind(1e0)), parameter :: one=1.0e0, zero=0.0e0
      real(kind(1d0)), parameter :: d_zero=0.0d0
      integer ipivots(n)
      real(kind(1e0)) a(n,n), b(n,1), x(n,1), w(n**2)
      real(kind(1e0)) change_new, change_old
      real(kind(1d0)) c(n,1), d(n,n), y(n,1)
      type(s_options) :: iopti(2)=s_options(0,zero)
! Generate a random matrix.
      call rand_gen(w)
      a = reshape(w, (/n,n/))
! Generate a random right hand side.
      call rand_gen(b(1:n,1))
! Save double pre
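The refinement loop described above can be tried without the library. The sketch below is self-contained standard Fortran 90 and makes several assumptions that differ from the manual's example: a small diagonally dominant test matrix replaces the random one, a hand-written Gauss-Jordan inverse without pivoting (the hypothetical routine `invert`) stands in for the saved single-precision LU factorization, and the 2-norm of the correction is used as the size measure.

```fortran
program refine_sketch
  implicit none
  integer, parameter :: sp = kind(1e0), dp = kind(1d0), n = 4
  real(sp) :: a_inv(n,n), x(n), r(n), change_new, change_old
  real(dp) :: d(n,n), c(n), y(n)
  integer :: i, j, iter

  ! Diagonally dominant test matrix (an assumption), so elimination
  ! without pivoting is safe in this illustration.
  do j = 1, n
     do i = 1, n
        d(i,j) = 1.0_dp / real(i + j, dp)
     end do
     d(j,j) = d(j,j) + real(n, dp)
  end do
  c = 1.0_dp

  ! Single-precision approximate inverse of the double-precision data.
  call invert(real(d, sp), a_inv)

  y = 0.0_dp
  change_old = huge(1.0_sp)
  do iter = 1, 20
     r = real(c - matmul(d, y), sp)      ! residual formed in double precision
     x = matmul(a_inv, r)                ! correction computed in single precision
     y = y + real(x, dp)                 ! accumulate the solution in double
     change_new = sqrt(sum(x**2))
     if (change_new >= change_old) exit  ! corrections stopped shrinking
     change_old = change_new
  end do

  print *, 'max residual:', maxval(abs(c - matmul(d, y)))

contains

  ! Gauss-Jordan inversion without pivoting; adequate only for
  ! well-conditioned, diagonally dominant matrices.
  subroutine invert(a_in, a_out)
    real(sp), intent(in)  :: a_in(n,n)
    real(sp), intent(out) :: a_out(n,n)
    real(sp) :: w(n,2*n), p
    integer :: k, m
    w = 0.0_sp
    w(:,1:n) = a_in
    do k = 1, n
       w(k,n+k) = 1.0_sp
    end do
    do k = 1, n
       p = w(k,k)
       w(k,:) = w(k,:) / p
       do m = 1, n
          if (m /= k) w(m,:) = w(m,:) - w(m,k) * w(k,:)
       end do
    end do
    a_out = w(:,n+1:2*n)
  end subroutine invert

end program refine_sketch
```

The design point is that only the residual needs the higher precision: the factorization (here, the inverse) is computed once in single precision and reused, while each pass through the loop recovers roughly another single-precision's worth of correct digits until the corrections stop decreasing.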
357. r: -1 for no change, 0 assign NO, 1 assign YES, 2 assign default settings.
In IMSL routines, the routine E1PSH, defined in the Error Control section, sets the default PRINT and STOP attributes, and E1POS is usually not needed. This routine provides the flexibility to handle special cases.
The Library user can set PRINT and STOP attributes by calling ERSET as follows:
      CALL ERSET (i, pattr, sattr)
where the change only applies to a single type i error, 1 ≤ i ≤ 5, corresponding to severity Note, Alert, Warning, Fatal, and Terminal. Calls to ERSET are defined only after at least one routine name has been pushed onto the subprogram stack. There is no restriction for calls to E1POS.
• If i = 0, the change applies to all error types.
• As input values, pattr or sattr: -1 for no change, 0 assign NO, 1 assign YES, 2 assign default settings.
The routine ERSET is specifically designed to be an easy-to-use interface to the PRINT and STOP tables for Library users. If i = 3, then the specified attributes are set for error types 3 and 6. Similarly, if i = 4, then the specified attributes are set for error types 4 and 7. The PRINT and STOP attribute settings, user default values, and values used by IMSL routines are listed below. In an IMSL routine, error types 5, 6, and 7 are
358. r 3). In all cases, the examples have been tested for correctness using equivalent mathematical criteria. On the other hand, these criteria are not identical to the corresponding examples in all cases. In Example 1 for lin_sol_gen, err = maxval(abs(res))/sum(abs(A) + abs(b)) is computed. In the operator revision of this example, operator_ex01, err = norm(b - (A .x. x))/(norm(A)*norm(x) + norm(b)) is computed. Both formulas for err yield values that are about epsilon(A). To be safe, the larger value sqrt(epsilon(A)) is used as the tolerance.
The operator versions of the examples are shorter and intended to be easier to read.
To match the corresponding examples in Chapters 1, 2, and 3 to those using the operators, consult the following table:

Chapters 1, 2 and 3 Examples               Corresponding Operators
lin_sol_gen_ex1, _ex2, _ex3, _ex4          operator_ex01, _ex02, _ex03, _ex04
lin_sol_self_ex1, _ex2, _ex3, _ex4         operator_ex05, _ex06, _ex07, _ex08
lin_sol_lsq_ex1, _ex2, _ex3, _ex4          operator_ex09, _ex10, _ex11, _ex12
lin_sol_svd_ex1, _ex2, _ex3, _ex4          operator_ex13, _ex14, _ex15, _ex16
lin_sol_tri_ex1, _ex2, _ex3, _ex4          operator_ex17, _ex18, _ex19, _ex20
lin_svd_ex1, _ex2, _ex3, _ex4              operator_ex21, _ex22, _ex23, _ex24
lin_eig_self_ex1, _ex2, _ex3, _ex4         operator_ex25, _ex26, _ex27, _ex28
lin_eig_gen_ex1, _ex2, _ex3, _ex4          operator_ex29, _ex30, _ex31, _ex32
lin_geig_gen_ex1, _ex2, _ex3, _ex4         operator_ex33, _ex34, _ex35, _ex36
fast_dft_ex4                               operator_ex
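Written as formulas, the two correctness measures quoted above for Example 1 of lin_sol_gen and for operator_ex01 read as follows. The notation is an assumption made for this sketch: $r_{ij}$ denotes an entry of the residual array res, $\varepsilon$ stands for epsilon(A), and the norms are whatever the library function norm computes by default.

```latex
\mathrm{err}_{\text{ex1}}
  = \frac{\max_{i,j}\,|r_{ij}|}{\sum_{i,j}\bigl(|a_{ij}| + |b_{ij}|\bigr)},
\qquad
\mathrm{err}_{\text{op}}
  = \frac{\lVert b - A x \rVert}{\lVert A \rVert\,\lVert x \rVert + \lVert b \rVert},
\qquad
\text{accept if } \mathrm{err} \le \sqrt{\varepsilon}.
```

Both quotients are scale-invariant: multiplying $A$ and $b$ by a common factor leaves them unchanged, which is why a single tolerance tied to the arithmetic precision can serve for either test.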
359. r 12 for more details In the Example 2 code Newton s method is used to solve for each reg ularizing parameter of the k systems The solution is then computed and its length is checked Also see operator_ex22 Chapter 6 50 Chapter 2 Singular Value and Eigenvalue Decomposition IMSL Fortran 90 MP Library 4 0 use lin_svd_int use rand_gen_int implicit none This is Example 2 for LIN_SVD integer parameter m 64 n 32 k 4 real kind 1d0 parameter one 1d0 zero 0d0 real kind 1d0 a m n s n u m m v n n y m max n k amp b m k x n k g m k alpha k lamda k amp delta_lamda k t_g n k s_sq n phi n k amp phi_dot n k rand k err Generate a random matrix for both A and B call rand_gen y a reshape y m n call rand_gen y b reshape y m k Compute the singular value decomposition cadi linsvd ay Sy Up Y Choose alpha so that the lengths of the regularized solutions are 0 25 times lengths of the non regularized solutions g matmul transpose u b x matmul v spread one s dim 2 ncopies k g 1l in 1 k alpha 0 25 sqrt sum x 2 dim 1 t_g g l n 1 k spread s dim 2 ncopies k s_sq s 2 lamda zero solve_for_lamda do x one spread s_sq dim 2 ncopies k amp spread lamda dim 1 ncopies n phi t_g x 2 phi_dot 2 phi x delta_lamda sum phi dim 1 alpha 2 sum phi_dot dim 1 Make Newton met
360. r matrix T associated with the reduction of the matrix A to Schur form Optionally a unitary matrix W is returned in array V such that the residuals Z AW WT are small iopt iopt Input Derived type array with the same precision as the input matrix Used for passing optional data to the routine The options are as follows Packaged Options for 1in_eig_gen Option Prefix Option Name Option Value s d c z_ lin_eig_gen_set_small 1 s_ d_ c_ z_ lin_eig_gen_overwrite_input 2 lie deez lin_eig_gen_scan_for_NaN 3 s_ d_ c_ z_ lin_eig_gen_no_balance 4 Cac ek ome ae lin_eig_gen_set_iterations 5 EES Mae ore Ame lin_eig_gen_in_Hess_form 6 le dj c z_ lin_eig_gen_out_Hess_form 7 Ee Cy Ze lin_eig_gen_out_block_form 8 acc d res Ze lin_eig_gen_out_tri_form 9 lin_eig_gen_continue_with_V 10 scadea lin_eig_gen_no_sorting 11 SE S ME S tA iopt IO _options _lin_eig_gen_set_small Small This is the tolerance used to declare off diagonal values effectively zero compared with the size of the numbers involved in the computation of a shift Default Small epsilonQ the relative accuracy of arithmetic 64 Chapter 2 Singular Value and Eigenvalue Decomposition IMSL Fortran 90 MP Library 4 0 iopt IO _options _lin_eig_gen_overwrite_input _dummy Does not save the input array A Default The array is saved iopt IO _options _lin
361. ra (1996), MPI: The Complete Reference, MIT Press, Cambridge, MA.

Struik
Struik, Dirk J. (1961), Lectures on Classical Differential Geometry, Second Edition, Addison-Wesley, Reading, MA.

Verwer et al.
Verwer, J. G., Blom, J. G., Furzeland, R. M., and Zegeling, P. A. (1989), A Moving Grid Method for One-Dimensional PDEs Based on the Method of Lines, Adaptive Methods for Partial Differential Equations, Flaherty, J. E., et al. (Eds.), SIAM Publications, Philadelphia, PA.

Visual Numerics Products
IMSL Math Library Special Functions (1994), Part Number 5111A, Visual Numerics, Inc., Houston, TX.
PV-WAVE Reference Manual, Version 6.0 (1996), Part Numbers 3566, 3567, Visual Numerics, Inc., Houston, TX.

Wahba
Wahba, Grace (1990), Spline Models for Observational Data, SIAM Publications, Philadelphia, PA.

Wilmott et al.
Wilmott, P., Howison, S., Dewynne, J. (1995), The Mathematics of Financial Derivatives, Cambridge University Press, New York, NY.

Appendix D: Benchmarking or Timing Programs

Scalar Program Descriptions
An important question for users concerns the performance of Fortran 90 subprograms compared to equivalent subprograms from the FORTRAN 77 IMSL MATH/LIBRARY. We have provided a set of main programs, shown in Table B. These main programs call Fortran 90 array functions, in single and double precision, that compare
362. rank 1 to be inverted. inverse_out = x (Output) Stores the output complex array of rank 1 resulting from the inverse transform. ndata = n (Input) Uses the sub-array of size n for the numbers. Default value: n = size(x). ido = ido (Input/Output) Integer flag that directs user action. Normally, this argument is used only when the working variables required for the transform and its inverse are saved in the calling program unit. Computing the working variables and saving them in internal arrays within fast_dft is the default. This initialization step is expensive. There is a two-step process to compute the working variables just once. Example 3 illustrates this usage. The general algorithm for this usage is to enter fast_dft with ido = 0. A return occurs thereafter with ido < 0. The optional rank-1 complex array w(:) with size(w) >= -ido must be re-allocated. Then, re-enter fast_dft. The next return from fast_dft has the output value ido = 1. The variables required for the transform and its inverse are saved in w(:). Thereafter, when the routine is entered with ido = 1 and for the same value of n, the contents of w(:) will be used for the working variables. The expensive initialization step is avoided. The optional arguments ido= and work_array= must be used together. work_array = w(:) (Output/Input) Complex array of rank 1 used to store working variab
363. rators) is correct.'
      end
Operator_ex03
      use linear_operators
      implicit none
! This is Example 3 for LIN_SOL_GEN (using operators).
      integer, parameter :: n=32
      real(kind(1e0)) :: one=1e0, zero=0e0, A(n,n), b(n), x(n)
      real(kind(1e0)) change_new, change_old
      real(kind(1d0)) :: d_zero=0d0, c(n), d(n,n), y(n)
! Generate a random matrix and right-hand side.
      A = rand(A); b = rand(b)
! Save double precision copies of the matrix and right-hand side.
      D = A
      c = b
! Compute single precision inverse to compute the iterative refinement.
      A = .i. A
! Start solution at zero. Update it to an accurate solution
! with each iteration.
      y = d_zero
      change_old = huge(one)
      iterative_refinement: do
! Compute the residual with higher accuracy than the data.
         b = c - (D .x. y)
! Compute the update in single precision.
         x = A .x. b
         y = x + y
         change_new = norm(x)
! Exit when changes are no longer decreasing.
         if (change_new >= change_old) exit iterative_refinement
         change_old = change_new
      end do iterative_refinement
      write (*,*) 'Example 3 for LIN_SOL_GEN (operators) is correct.'
      end
Operator_ex04
      use linear_operators
      implicit none
! This is Example 4 for LIN_SOL_GEN (using operators).
      integer, parameter :: n=32, k=128
      integer i
      real(kind(1e0)), parameter :: one=1e0, t_max=1, delta_t=t_max/(k-1)
      r
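The refinement loop of Example 3 can be sketched with Fortran intrinsics alone. The block below is a minimal illustration, not the library's implementation: it assumes a small, diagonally dominant test matrix so the loop converges, and it replaces the .i. operator with a hand-written Gauss-Jordan inverse (the helper name `inverse` is hypothetical).

```fortran
program refine_sketch
  ! Sketch only: mixed-precision iterative refinement without IMSL.
  implicit none
  integer, parameter :: sp = kind(1e0), dp = kind(1d0), n = 8
  real(sp) :: a(n,n), ainv(n,n), b(n), x(n)
  real(sp) :: change_new, change_old
  real(dp) :: d(n,n), c(n), y(n)
  integer  :: i

  call random_number(a)
  call random_number(b)
  do i = 1, n
     a(i,i) = a(i,i) + real(n, sp)   ! assumption: force diagonal dominance
  end do

  ! Double precision copies of the data.
  d = real(a, dp)
  c = real(b, dp)

  ! Approximate inverse, computed in single precision.
  ainv = inverse(a)

  y = 0.0_dp
  change_old = huge(1.0_sp)
  do
     b = real(c - matmul(d, y), sp)  ! residual formed in double precision
     x = matmul(ainv, b)             ! update computed in single precision
     y = y + real(x, dp)
     change_new = sqrt(sum(x**2))
     if (change_new >= change_old) exit
     change_old = change_new
  end do
  print *, 'final residual norm:', sqrt(sum((c - matmul(d, y))**2))

contains

  function inverse(m) result(minv)
    ! Gauss-Jordan elimination with partial pivoting (hypothetical helper).
    real(sp), intent(in) :: m(:,:)
    real(sp) :: minv(size(m,1), size(m,1))
    real(sp) :: w(size(m,1), 2*size(m,1)), row(2*size(m,1))
    integer  :: j, k, p, nn
    nn = size(m,1)
    w(:, 1:nn)  = m
    w(:, nn+1:) = 0.0_sp
    do k = 1, nn
       w(k, nn+k) = 1.0_sp
    end do
    do k = 1, nn
       p = k - 1 + maxloc(abs(w(k:nn, k)), dim=1)
       row = w(k,:); w(k,:) = w(p,:); w(p,:) = row
       w(k,:) = w(k,:) / w(k,k)
       do j = 1, nn
          if (j /= k) w(j,:) = w(j,:) - w(j,k) * w(k,:)
       end do
    end do
    minv = w(:, nn+1:)
  end function inverse

end program refine_sketch
```

The residual is the only quantity that needs the higher precision; the inverse and the update stay in single precision, which is the point of the technique.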
364. re given in the following tables. Option Name for COND = Option Value: s_cond_set_small = 1; s_cond_for_lin_sol_svd = 2; d_cond_set_small = 1; d_cond_for_lin_sol_svd = 2; c_cond_set_small = 1; c_cond_for_lin_sol_svd = 2; z_cond_set_small = 1; z_cond_for_lin_sol_svd = 2. Derived Type: Name of Unallocated Array: s_options: s_cond_options; s_options: s_cond_options_once; d_options: d_cond_options; d_options: d_cond_options_once. Example: Compute the condition number: B = A .tx. A; c = COND(B); c = COND(A)**2. DET: Compute the determinant of a rectangular matrix, A. The evaluation is based on the QR decomposition, QAP = [R_(k x k), 0; 0, 0], and k = rank(A). Thus det(A) = s x det(R), where s = det(Q) x det(P) = ±1. Required Argument: This function requires one argument. This argument must be a rank-2 or rank-3 array that contains a rectangular matrix. For rank-3 arrays, each rank-2 array (for fixed third subscript) is a separate matrix. In this case, the output is a rank-1 array of determinant values for each problem. Even well-conditioned matrices can have determinants with values that have very large or very tiny magnitudes. The values may overflow or underflow. For this class of problems, the use of the logarithmic representation of the determinant found in lin_sol_gen or lin_sol_lsq is requir
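To illustrate why a logarithmic representation helps, the sketch below accumulates the sign and log10 of the magnitude of a determinant from the diagonal of a triangular factor. The diagonal values are made up for the illustration, and the sign/log pair is an assumed representation, not necessarily the one used by lin_sol_gen or lin_sol_lsq.

```fortran
program log_det_sketch
  ! Sketch only: represent det(R) for triangular R by a sign and log10|det|.
  implicit none
  integer, parameter :: n = 50
  real :: r(n)              ! diagonal of a hypothetical triangular factor R
  real :: log10_abs_det, det_sign

  r = 1.0e3
  r(1) = -1.0e3

  ! The plain product of these n diagonal terms has magnitude 1e150,
  ! far outside single precision, so it is never formed here.
  log10_abs_det = sum(log10(abs(r)))
  det_sign      = product(sign(1.0, r))

  print *, 'sign of det           :', det_sign
  print *, 'log10 of |det|        :', log10_abs_det
  print *, 'decimal exponent range:', range(r)
end program log_det_sketch
```

Comparing the printed log10 value with the decimal exponent range of the kind shows at a glance whether the plain determinant would be representable.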
365. re numbered 1151, 1152, 1161, 1162, 1370, 1393. Chapter 5: Utilities. Contents: error_post. rand_gen: Example 1: Running Mean and Variance; Example 2: Seeding, Using, and Restoring the Generator; Example 3: Generating Strategy with a Histogram; Example 4: Generating with a Cosine Distribution. sort_real: Example 1: Sorting an Array; Example 2: Sort and Final Move with a Permutation. show: Example 1: Printing an Array; Example 2: Writing an Array to a Character Variable. error_post: Prints error messages that are generated by IMSL Fortran 90 routines. Required Argument: epack (Input/Output) Derived type array of size p containing the array of message numbers and associated data for the messages. The definition of this derived type is packaged within the modules used as interfaces for each suite of routines. The declaration is: type ?_error; integer idummy; real(kind(?_)) rdummy; end type. The choice of "?_" is either "s_" or "d_", depending on the accuracy of the data. This array gets addit
366. residual with higher accuracy than the data.
         B = C - (D .x. Y)
! Compute the update in single precision.
         X = A .x. B
         Y = X + Y
         change_new = norm(X)
! All processors must share the root's test of convergence.
         CALL MPI_BCAST(change_new, nr, MPI_REAL, 0, &
           MP_LIBRARY_WORLD, IERROR)
! Exit when changes are no longer decreasing.
         if (ALL(change_new >= change_old)) exit iterative_refinement
         change_old = change_new
      end DO ITERATIVE_REFINEMENT
      IF (MP_RANK == 0) write (*,*) 'Parallel Example 3 is correct.'
! See to any error messages and quit MPI.
      MP_NPROCS = MP_SETUP('Final')
      end
Parallel Example 4
Here an alternate node is used to compute the majority of a single application, and the user does not need to make any explicit calls to MPI routines. The time-consuming parts are the evaluation of the eigenvalue-eigenvector expansion, the solving step, and the residuals. To do this, the rank-2 arrays are changed to a box data type with a unit third dimension. This uses parallel computing. The node priority order is established by the initial function call, MP_SETUP(n). The root is restricted from working on the box data type by assigning MPI_ROOT_WORKS = .false. This example anticipates that the most efficient node, other than the root, will perform the heavy computing
367. rguments data data 1 4 Input Output An assumed shape array with size data 1 4 The data are placed in the array data 1l i x data 2 i yj data 3 i Zj data 4 i 06 i 1 ndata If the variances are not known but are proportional to an unknown value use data 4 i 1 i 1 ndata knotsx knotsx Input A derived type _spline_knots that defines the degree of the spline and the breakpoints for the data fitting domain in the first dimension knotsy knotsy Input A derived type _spline_knots that defines the degree of the spline and the breakpoints for the data fitting domain in the second dimension Example 1 Tensor Product Spline Fitting of Data The function g x y exp x 3 is least squares fit by a tensor product of cubic splines on the square 0 2 0 2 There are ndata random pairs of values for the independent variables Each datum is given unit uncertainty The grid of knots in both x and y dimensions are equally spaced in the interior cells and identical to each other After the coefficients are computed a check is made that the surface approximately agrees with g x y at a tensor product grid of equally spaced values USE surface_fitting_int USE rand_int USE norm_int implicit none ERAS acs Example 1 for SURFACE_FITTING tensor product IMSL Fortran 90 MP Library 4 0 Chapter 4 Curve and Surface Fitting with Splines 113 B splines approximati
368. ribes routines for solving systems of linear algebraic equations by direct matrix factorization methods, for computing only the matrix factorizations, and for computing linear least-squares solutions. Contents: lin_sol_gen: Example 1: Solving a Linear System of Equations; Example 2: Matrix Inversion and Determinant; Example 3: Solving a System with Iterative Refinement; Example 4: Evaluating the Matrix Exponential. lin_sol_self: Example 1: Solving a Linear Least-squares System; Example 2: System Solving with Cholesky Method; Example 3: Using Inverse Iteration for an Eigenvector; Example 4: Accurate Least-squares Solution with Iterative Refinement. lin_sol_lsq: Example 1: Solving a Linear Least-squares System; Example 2: System Solving with the Generalized Inverse; Example 3: Two-Dimensional Data Fitting; Example 4: Least-squares with an Equality Constraint. lin_sol_svd: Example 1: Least-squares solution of a Rectangular System; Example 2: Polar Decomposition of a Square Matrix; Example 3: Reduction of an Array of Black and White
369. rint all types of rank 1 and rank 2 intrinsic arrays Reset precision and subscripts for one type Prepare output in a CHARACTER array Reset precision subscripts and end of line sequence for one type Natural B spline interpolation to the function f x exp x 2 x20 Shape the B spline curve that least squares fits f x exp x 2 x 2 0 with function and derivative constraints matching f x Appendix B List of Examples B 3 spline _fitting_ex3 spline_fitting_ex4 surface_fitting_exl surface_fitting_ex2 surface_fitting_ex3 surface_fitting_ex4 scpk_ex1 scpk_ex2 scpk_ex3 pnlsq_ ex1 pnlsq_ex2 pblsq_ex1 pblsq_ex2 pde_exl1 pde_ex2 pde_ex3 pde_ex4 pde_ex5 pde_ex6 pde_ex7 pde_ex8 pde_ex9 B 4 e Appendix B List of Examples Use B spline interpolation Gauss Legendre quadrature and uniform random numbers to generate random numbers according to the distribution f x exp x 2 I lt x lt l Use piece wise linear B splines to fit a periodic curve the perimeter of a box in two dimensions Use tensor product B splines to least squares fit f x y exp x y x20 y20 Use tensor product B splines to least squares fit the standard spherical coordinate parametric representation of a sphere Remove regularization Use tensor product B splines to least squares fit f x y exp x y x 20 y20 Constraints are F 0 0 1 00 0 and 0 0 0 ox oy Use tensor
370. rite Example 4 for LIN_GEIG_GEN operators is correct end if Clean up the allocated array This is good housekeeping deallocate d_eig_options end Operator_ex37 use rand_gen_int use fft_int use ifft_int use linear_operators 204 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 implicit none This is integer j integer parameter real kind le0 real kind le0 complex kind 1e0 Generate a rand a b rand b Compute the convolution yy 1 1 b do j 2 n yy 2 yy 1 3 end do oka c yy a Example 4 for FAST_DFT err dimension n two random periodic sequences using operators n 40 one 1e0 b fe ar c yy n n dimension n a and b eh of Sat and bh Compute f inverse transform a transform b f ifft fft a fft b Check the Convolution Theorem inverse transform a transform b norm c f norm c err lt sqrt epsilon one err if Gay aaa ees write end if end IMSL Fortran 90 MP Library 4 0 Example 4 for FAST_DFT convolution a b then operators is correct Chapter 6 Operators and Generic Functions The Parallel Option 205 Parallel Examples This section presents a variation of key examples listed above or in other parts MPI REQUIRED of the document In all cases the examples appear to be simple use p
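The Convolution Theorem check in Example 4 for FAST_DFT can be reproduced without the library. The sketch below is an illustration under stated assumptions: it uses a naive O(n^2) DFT (the internal function `dft` is hypothetical) with an unscaled forward transform and a 1/n factor on the inverse, which differs from whatever scaling fft and ifft apply, and it compares the result with a directly summed periodic convolution.

```fortran
program conv_sketch
  ! Sketch only: verify the Convolution Theorem with a naive DFT.
  implicit none
  integer, parameter :: dp = kind(1d0), n = 8
  real(dp), parameter :: twopi = 6.283185307179586_dp
  real(dp) :: a(n), b(n), c(n), f(n), err
  integer  :: j, k

  call random_number(a)
  call random_number(b)

  ! Direct periodic convolution: c(j) = sum over k of a(k)*b(j-k mod n).
  do j = 0, n - 1
     c(j+1) = 0.0_dp
     do k = 0, n - 1
        c(j+1) = c(j+1) + a(k+1) * b(modulo(j - k, n) + 1)
     end do
  end do

  ! f = inverse transform of (transform(a) * transform(b)).
  f = real(dft(dft(cmplx(a, kind=dp), -1) * dft(cmplx(b, kind=dp), -1), 1), dp) / n

  err = sqrt(sum((c - f)**2)) / sqrt(sum(c**2))
  if (err <= sqrt(epsilon(1.0_dp))) then
     print *, 'Convolution Theorem check passed, relative error', err
  else
     print *, 'Convolution Theorem check FAILED, relative error', err
  end if

contains

  function dft(x, sgn) result(y)
    ! Unscaled DFT; sgn = -1 is the forward transform, sgn = +1 the inverse.
    complex(dp), intent(in) :: x(:)
    integer, intent(in)     :: sgn
    complex(dp) :: y(size(x))
    integer :: p, q, m
    m = size(x)
    do p = 0, m - 1
       y(p+1) = (0.0_dp, 0.0_dp)
       do q = 0, m - 1
          y(p+1) = y(p+1) + x(q+1) * exp(cmplx(0.0_dp, sgn*twopi*p*q/m, kind=dp))
       end do
    end do
  end function dft

end program conv_sketch
```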
371. rix pencil. iopt = iopt(:) (Input) Derived type array with the same precision as the input matrix. Used for passing optional data to the routine. The options are as follows. Packaged Options for lin_geig_gen (Option Prefix = s_, d_, c_, z_ for every entry), listed as Option Name = Option Value: lin_geig_gen_set_small = 1; lin_geig_gen_overwrite_input = 2; lin_geig_gen_scan_for_NaN = 3; lin_geig_gen_self_adj_pos = 4; lin_geig_gen_for_lin_sol_self = 5; lin_geig_gen_for_lin_eig_self = 6; lin_geig_gen_for_lin_sol_lsq = 7; lin_geig_gen_for_lin_eig_gen = 8. iopt(IO) = ?_options(?_lin_geig_gen_set_small, Small) This tolerance, multiplied by the sum of absolute value of the matrix B, is used to define a small diagonal term in the routines lin_sol_lsq and lin_sol_self. That value can be replaced using the option flags lin_geig_gen_for_lin_sol_lsq and lin_geig_gen_for_lin_sol_self. Default: Small = epsilon(), the relative accuracy of arithmetic. iopt(IO) = ?_options(?_lin_geig_gen_overwrite_input, ?_dummy) Does not save the input arrays A and B. Default: The array is saved. iopt(IO) = ?_options(?_lin_geig_gen_scan_for_NaN, ?_dummy) Examines each input array entry to find the first value such that isNaN(a(i,j)) or is
372. rix algebra are shown in the following table, with columns Defined Array Operation | Matrix Operation | Alternative in Fortran 90: A .x. B | AB | matmul(A, B); .i. A | A^-1 | lin_sol_gen, lin_sol_lsq; .t. A, .h. A | A^T, A^H | transpose(A), conjg(transpose(A)); A .ix. B | A^-1 B | lin_sol_gen, lin_sol_lsq; B .xi. A | B A^-1 | lin_sol_gen, lin_sol_lsq; A .tx. B or (.t. A) .x. B, A .hx. B or (.h. A) .x. B | A^T B, A^H B | matmul(transpose(A), B), matmul(conjg(transpose(A)), B); B .xt. A or B .x. (.t. A), B .xh. A or B .x. (.h. A) | B A^T, B A^H | matmul(B, transpose(A)), matmul(B, conjg(transpose(A))). Operators apply generically to all precisions and floating-point data types and to objects that are broader in scope than arrays. For example, the matrix product ".x." applies to matrix times vector and matrix times matrix represented as Fortran 90 arrays. It also applies to independent matrix products. For this, use the notion of a box of problems to refer to independent linear algebra computations of the same kind and dimension, but different data. The racks of the box are the distinct problems. In terms of Fortran 90 arrays, a rank-3 assumed-shape array is the data structure used for a box. The first two dimensions are the data for a matrix problem; the third dimension is the rack number. Each problem is independent of
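The box data type described above can be pictured with plain Fortran arrays. In this minimal sketch (array names and sizes are made up, and the library's own operators are not used), each rack of a rank-3 array holds one independent matrix product:

```fortran
program box_sketch
  ! Sketch only: a "box" is a rank-3 array; the third index is the rack number.
  implicit none
  integer, parameter :: n = 3, nr = 4
  real :: a(n,n,nr), b(n,n,nr), c(n,n,nr)
  integer :: j

  call random_number(a)
  call random_number(b)

  ! Each rack is an independent problem of the same kind and dimension.
  do j = 1, nr
     c(:,:,j) = matmul(a(:,:,j), b(:,:,j))
  end do

  print *, 'shape of the result box:', shape(c)
end program box_sketch
```

Because the racks do not depend on one another, the loop iterations can be handed to different nodes, which is what makes the box data type a natural unit for parallel work.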
373. rm x lt sqrt epsilon one then write Example 4 for LIN_SOL_LSQ operators is correct 182 e Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 end if end Operator_ex13 use linear_operators implicit none This is Example 1 for LIN_SOL_SVD using operators and functions integer parameter m 128 n 32 real kind 1d0 one 1d0 err real kind 1d0 A m n b m x n U m m V n n S n g m Generate a random matrix and right hand side A rand A b rand b Compute the least squares solution matrix of Ax b S SVD A U U V V g U tx bj x V x diag one S x g 1 n Check the results err norm A tx b A x x norm A norm x if err lt sqrt epsilon one then write Example 1 for LIN_SOL_SVD operators is correct end if end Operator_ex14 use linear_operators implicit none This is Example 2 for LIN_SOL_SVD using operators and functions integer parameter n 32 real kind 1d0 one 1d0 zero 0d0 real kind 1ld0 A n n P n n Q n n amp S_D n U_D n n V_D n n Generate a random matrix A rand A Compute the singular value decomposition S_D SVD A U U_D V V_D Compute the left orthogonal factor P UD xt VD Compute the right self adjoint factor Q V_D x diag S_D xt VD Check the results if norm EYE n P
374. rocessing routine error_post. Every call to a separate routine that includes the argument epack may increase the number of pending error messages. When several fatal or terminal error messages are pending, reset the level of PRINT and STOP associated with error message printing and stopping (see Chapter 9). The value of s_error(1)%idummy or d_error(1)%idummy indicates the size of the list containing error message numbers and data. Call error_post (see Chapter 5) any time the array value s_error(1)%idummy or d_error(1)%idummy is positive. You may follow calls to any IMSL Library routine with a call to the error post-processor. Optional Subprogram Arguments: IMSL Fortran 90 MP Library routines have required and optional arguments. All arguments are documented for each routine. For example, consider the routine lin_sol_gen that solves the linear algebraic matrix equation Ax = b. The required arguments are three rank-2 Fortran 90 arrays: A, b, and x. The input data for the problem are the A and b arrays; the solution output is the x array. Often there are other arguments for this linear solver that are closely connected with the computation but are not as compelling as the primary problem. The inverse matrix A^-1 may be needed as part of a larger application. To output this parameter, use the optional argument given by the ainv= keyword. The rank-2 output array argument used on t
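The required-plus-optional calling style can be sketched in standard Fortran. The routine `toy_solve` below is hypothetical and handles only 2 x 2 systems; it is not lin_sol_gen and shares only the idea of an optional ainv= keyword argument tested with the intrinsic present().

```fortran
program optional_sketch
  ! Sketch only: a solver with required arguments and one optional keyword.
  implicit none
  real :: a(2,2), b(2,1), x(2,1), inv(2,2)

  a = reshape((/ 2.0, 1.0, 1.0, 3.0 /), (/ 2, 2 /))
  b = reshape((/ 3.0, 4.0 /), (/ 2, 1 /))

  ! Required arguments only.
  call toy_solve(a, b, x)
  print *, 'solution           :', x

  ! Same call, also requesting the inverse through the keyword.
  call toy_solve(a, b, x, ainv=inv)
  print *, 'a times its inverse:', matmul(a, inv)

contains

  subroutine toy_solve(a, b, x, ainv)
    ! Hypothetical 2 x 2 solver using the closed-form inverse.
    real, intent(in)            :: a(2,2), b(:,:)
    real, intent(out)           :: x(:,:)
    real, intent(out), optional :: ainv(2,2)
    real :: w(2,2), det
    det = a(1,1)*a(2,2) - a(1,2)*a(2,1)
    w = reshape((/ a(2,2), -a(2,1), -a(1,2), a(1,1) /), (/ 2, 2 /)) / det
    x = matmul(w, b)
    if (present(ainv)) ainv = w   ! touched only when the caller supplies it
  end subroutine toy_solve

end program optional_sketch
```

An internal procedure gives the caller an explicit interface, which Fortran requires before optional and keyword arguments can be used; library routines achieve the same through their interface modules.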
375. rrays of input values for both of the independent variables yield an array output spline values of the size of the product of the sizes of the input. Users can also evaluate the same surface quantities at a single point. The Array Function surface_fitting: The coefficients of the tensor product B-spline are the output values of this generic function. The precision of the coefficients is determined through the generic interface by the precision of the arguments. The data array and the derived type ?_spline_knots for the x and y coordinates are required arguments. The array of derived type ?_surface_constraints is an optional argument. spline_constraints: This function returns the derived type array result, ?_spline_constraints, given optional input. There are optional arguments for the derivative index, the value applied to the spline, and the periodic point for any periodic constraint. The function is used, for entry number j: ?_spline_constraints(j) = & spline_constraints([derivative = derivative_index,] & point = where_applied, [value = value_applied,] & type = constraint_indicator, & [periodic_point = value_applied]) The square brackets enclose optional arguments. For each constraint, either (but not both) the "value =" or the "periodic_point =" optional arguments must be present. Required Arguments: point = wher
376. rrors delta pi nvalues 1 228 e Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 x pi twoti delta i 1 nvalues y two x values zero do j 1 3 values values surface_values 0 0 x y knotsx knotsy amp Coen IC Bp ay ee end do values values A 2 Compute the R M S error sizev norm pack values values values nvalues if sizev lt TOLERANCE then if mp_rank amp write Parallel Example 18 is correct end if EXIT BLOCK END DO BLOCK See to any error messages and exit MPI mp_nprocs mp_setup Final end IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 229 Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers Introduction This chapter describes the use of ScaLAPACK a suite of dense linear algebra solvers applicable when a single problem size is large We have integrated usage of Fortran 90 MP Library with this library However the ScaLAPACK library including libraries for BLACS and PBLAS are not part of Fortran 90 MP Library To use ScaLAPACK software the required libraries must be installed on the user s computer system We adhered to the specification of Blackford et al 1997 but use only MPI for communication The ScaLAPACK library includes certain LAPACK routines Anderson et al 19
377. rs The quantity real z kind one is the real valued answer when the Schur decomposition method is used Z W X Z Compute the solution by solving for x directly x A EYE n h ix b Check that x and z agree approximately if norm x z norm z lt sqrt epsilon one then write Example 3 for LIN_EIG_GEN operators is correct end if end Operator_ex32 use linear_operators implicit none This is Example 4 using operators for LIN_EIG_GEN integer parameter n 17 real kind 1d0 parameter one 1d0 real kind 1d0 dimension n n A C real kind 1d0 variation n eta complex kind 1d0 dimension n n U V e n d n Generate a random matrix A rand A Compute th igenvalues left and right eigenvectors D EIG A W V E EIG t A W U Compute condition numbers and variations of eigenvalues variation norm A abs diagonals U hx V Now perturb the data in the matrix by the relative factors eta sqrt epsilon and solve for values again Check the differences compared to th stimates They should not exceed the bounds eta sqrt epsilon one C A eta 2 rand A 1 A D EIG C IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 201 Looking at the differences of absolute values accounts for switching signs
378. rwritten by lin_sol_tri Check for any errors This is not recessary but illustrates control returning to the calling program unit call lin_sol_tri c d b y amp epack d_lin_sol_tri_epack call error_post d_lin_sol_tri_epack Check the size of the residuals y x They should be small relative to the size of values in x err norm x l in l in y l n 1 n 1 norm x 1 n 1 n 1 if err lt sqrt epsilon one then write Example 1 for LIN_SOL_TRI operators is correct end if end Operator_ex18 use linear_operators use lin_sol_tri_int implicit none This is Example 2 using operators for LIN_SOL_TRI integer nopt integer parameter n 128 real kind le0 parameter s_one le0 s_zero 0e0 186 e Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 real kind 1d0 parameter d_one 1d0 d_zero 0d0 real kind le0 dimension 2 n n d b C x Y real kind le0 change_new change_old err type s_options iopt 2 s_options 0 s_zero real kind 1d0 dimension n n d_save b_save c_save amp x_save y_save x_sol logical solve_only c s_zero d s_zero b s_zero x S_zero Generate the upper main and lower diagonals of the matrices A A random vector x is used to construct the right hand sides y A x e 1 n rand c 1 n d lin rand d l in d l in rand c l n x 1 n rand
379. s a Fortran 90 routine with a FORTRAN 77 counterpart. The main program reads single lines of input of the form "NSIZE NTRIES PREC Description", one test per line, ending with a line that contains QUIT. The parameters NSIZE and NTRIES appear in the summary tables. The parameter PREC has values 1, 2, or 3. The choice depends on whether the user wants precision of single, double, or both versions timed. The array functions return a 6 x 2 summary table of values, with one column for the F90 Version and one for the F77 Equivalent. In both columns the rows are: 1 Average time; 2 Standard deviation; 3 Total time; 4 nsize; 5 ntries; 6 Time Units Sec. As an example, the program time_rand_gen is compiled and linked with the single and double precision timing functions s_rand_gen_bench and d_rand_gen_bench. The two lines of input are "100000 5 3 Random Number Benchmarks" and "QUIT". This routine evaluates the elapsed time to compute 100,000 random numbers obtained with rand_gen from the Fortran 90 MP Library and rnun/drnun from the IMSL MATH LIBRARY. The Average is the mean of the individual elapsed times for 5 calls to the routines, obtaining 100,000 random numbers in each call. The St Dev is the standard deviation for that Average. This value indicates the variability of the Average. In order for this value to prov
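The first three rows of such a summary can be sketched with intrinsics only. The program below times the intrinsic random_number rather than rand_gen, rnun, or drnun, and it uses the sample standard deviation of the individual times; both choices are assumptions made for the illustration, not a description of the benchmark programs.

```fortran
program timing_sketch
  ! Sketch only: average, standard deviation and total of repeated timings.
  implicit none
  integer, parameter :: nsize = 100000, ntries = 5
  real             :: x(nsize), checksum
  double precision :: t(ntries), average, stdev
  integer          :: i, c0, c1, rate

  checksum = 0.0
  do i = 1, ntries
     call system_clock(c0, rate)
     call random_number(x)          ! stand-in for the routine being timed
     call system_clock(c1)
     if (rate <= 0) stop 'no system clock available'
     t(i) = dble(c1 - c0) / dble(rate)
     checksum = checksum + sum(x)   ! keeps the timed work from being skipped
  end do

  average = sum(t) / ntries
  stdev   = sqrt(sum((t - average)**2) / (ntries - 1))

  print *, 'Average time       :', average
  print *, 'Standard deviation :', stdev
  print *, 'Total time         :', sum(t)
  print *, 'nsize, ntries      :', nsize, ntries
  print *, 'checksum           :', checksum
end program timing_sketch
```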
380. s correct 0 amp See to any error messages and quit MPI MP_NPROCS MP_SETUP Final end 206 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 Parallel Example 2 use linear_operators use mpi_setup_int implicit none This is Parallel Example 2 for i and det with box data types operators and functions integer parameter n 32 nr 4 integer J real kind le0 one 1le0 real kind le0 dimension nr rr det_A det_i he ale kane O SF ccaemensitoras mem 10s esa pelea eee Setup for MPI MP_NPROCS MP_ SETUP Generate a random matrix A rand A Compute the matrix inverse and its determinant inv i A det_A det A Compute the determinant for the inverse matrix det_i det inv Check the quality of both left and right inverses See DO Jel nr Res PHBE CN YS END DO S R R R A x inv S S inv x A err norm R norm S cond A if ALL err lt sqrt epsilon one and amp abs det_A det_i one lt sqrt epsilon one amp and MP_RANK 0 amp write Parallel Example 2 is correct to any error messages and quit MPI MP_NPROCS MP_SETUP Final end Parallel Example 3 This example shows the box data type used while obtaining an accurate solution of several systems Important in this example is the fact that only the ro
381. s solved We anticipate that B might be singular and detect this fact Also see operator _ex36 Chapter 6 use lin_geig_gen_int use lin_sol_lsq_int use rand_gen_int use isNaN_int implicit none This is Example 4 for LIN_GEIG_GEN integer parameter n 32 real kind 1d0 parameter one 1d0 real kind 1d0 a n n b n n beta n y type d_options iopti 4 type d_error epack 1 complex kind 1d0 alpha n v n n Generate random matrices for both A and B call rand_gen y a reshape y n n call rand_gen y b reshape y n n Set the option a larger tolerance than iopti 1l d_options d_lin_geig_gen zero 0d0 n n err default for lin_sol_lsq for_lin_sol_lsq zero Number of secondary optional data items iopti 2 d_options 2 zero iopti 3 d_options d_lin_sol_lsq_set_small sqrt epsilon one amp sqrt sum b 2 n iopti 4 d_options d_lin_sol_lsq_no_sing_mess zero Compute the generalized eigenvalues call lin_geig_gen A B alpha beta ilopt iopti epack epack if not isNaN alpha then Check the residuals err sum abs matmul A V spread be matmul B V spread alp sum abs a tabs b if err lt sqrt epsilon one then V V amp ta dim 1 ncopies n amp ha dim 1 ncopies n amp write Example 4 for LIN_GEIG_G EN is correct Chapt
382. s to solve delta_g one real m 1 kind one 184 e Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 g i delta_g i 1 m Compute equally spaced quadrature points delta_t one real n kind one t j 1 delta_t j 1 n 1 Compute collocation points with an array form of Newton s method s m SOLVE_EQUATIONS do s s exp s one s g s g lt amp exit SOLVE_EQUATIONS g exp s if sum abs on xp s epsilon one sum g end do SOLVE_EQUATIONS Evaluate the integrals over the quadrature points A exp spread t 1 n 1 m spread s 2 n amp exp spread t 2 nt 1 1 m spread s 2 n amp spread s 2 n Compute the singular value decomposition S_S SVD A U U_S V V_S Singular values larger than epsilon determine the rank k k count S_S gt epsilon one Compute U_S T times right hand side g g US tx g Use the minimum number of singular values that give a good approximation to f t 1 oldrms huge one do i 1 k f V_S 1 i x g 1 1 S_S 1 i rms sum f one 2 n if rms gt oldrms exit oldrms rms end do write Using this number of singular values amp amp i4 the approximate R M S error is lpel2 4 amp i 1 oldrms if sqrt oldrms lt delta_t 2 then write Example 4
383. section of Chapter 6 lists 18 programs that use IMSL defined operations and generic functions applied to the box data type The final two examples show how to choreograph printed output from each parallel process and a surface fitting problem which uses four processes Description Chapter Solve a system with random data 1 Invert a random matrix evaluate its determinant 1 Solve a random system with iterative refinement 1 Evaluate a random matrix exponential 1 Solve a symmetric system of normal equations with random data 1 Solve normal equations using Cholesky method compute 1 covariance uses random data Inverse iteration for an eigenvector of a symmetric matrix with 1 random data Solve a least squares problem using iterative refinement with 1 random data Solve a least squares problem of data fitting a Chebyshev series to 1 a given function with random independent variable values Solve a data fitting problem as in lin_sol_lsq_ex1 using 1 the generalized inverse for computing the coefficients Two dimensional least squares data fitting of radial basis functions 1 to a given function Uses random data Least squares fitting with an equality constraint by heavy 1 weighting uses random data Solve a least squares system with random data 1 Compute the polar decomposition of a square matrix with random 1 data Appendix B List of Examples B 1 lin_sol_svd_ex3 lin_sol_svd_ex4 lin_sol_tri_exl lin_sol_tri_ex2 lin
384. sers can change the handle of MP_LIBRARY_WORLD as required by their application code. Often this issue can be ignored. • The integers MP_RANK and MP_NPROCS are respectively the node's rank and the number of nodes in the communicator, MP_LIBRARY_WORLD. Their values require the routines MPI_Comm_size and MPI_Comm_rank. The default values are important when MPI is not initialized and a box data type is computed. In this case the root node is the only node and it will do all the work. No calls to MPI communication routines are made when MP_NPROCS = 1 when computing the box data type functions. A program can temporarily assign this value to force box data type computation entirely at the root node. This is desirable for problems where using many nodes would be less efficient than using the root node exclusively. • The array MPI_NODE_PRIORITY(:) is unallocated unless the user allocates it. The Fortran 90 MP Library codes use this array for assigning tasks to processors, if it is allocated. If it is not allocated, the default priority of the nodes is (0, 1, ..., MP_NPROCS-1). Use of the function call MP_SETUP(N) allocates the array, as explained below. Once the array is allocated its size is MP_NPROCS. The contents of the array is a permutation of the integers 0, ..., MP_NPROCS-1. Nodes appearing at the start of the list are used first for parallel computing. A node other than the root can avoid any computing, except receiving the schedule, by settin
385. ses random data Compute the FFT of a linear function plus harmonic terms Remove the linear trend and transform the residuals Uses random data Compute the FFT of a complex vector Precompute the multipliers and internal data for later efficiency Uses random data Compute the convolution of two periodic sequences Uses random data Compute FFT of a complex array Transform forward then backwards Uses random data Compute the FFT of a linear function plus harmonic terms Remove the linear trend and transform the residuals Uses random data Compute the FFT of a complex vector Precompute the multipliers and internal data for later efficiency Uses random data Compute FFT of a complex array Transform forward then backwards Uses random data Compute the running mean and variance of a sequence of random numbers Start the random number generation with a known seed Reset the generator after obtaining some numbers Generate integers with the same frequency as a given histogram Executes until the results are steady state and then lists twenty samples Generate random numbers using the PDF function 1 cos x 2m m lt x lt 7 listing thirty samples Sort an array of random numbers so they are non decreasing Sort any array so it is nonincreasing Move columns of a matrix using the output permutation Generate arrays of single and double precision NaNs Uses the function isNaN to detect the NaNs P
386. shape complex array of rank 1, 2 or 3, the result is the complex array of the same shape and rank consisting of the DFT. Modules: Use the appropriate module: fft_int or linear_operators. Optional Variables, Reserved Names: The optional argument is WORK(:), a COMPLEX array of the same precision as the data. For rank-1 transforms the size of WORK is n + 15. To define this array for each problem, set WORK(1) = 0. Each additional rank adds the dimension of the transform plus 15. Using the optional argument WORK increases the efficiency of the transform. This function uses fast_dft, fast_2dft, and fast_3dft from Chapter 3. The option and derived type names are given in the following tables. Option Name for FFT = Option Value: options_for_fast_dft = 1. Derived Type: Name of Unallocated Array: s_options: s_fft_options; s_options: s_fft_options_once; d_options: d_fft_options; d_options: d_fft_options_once. Example: Compute the DFT of a random complex array: x = rand(x); y = fft(x). FFT_BOX: The Discrete Fourier Transform of several complex or real sequences. Required Argument: The function requires one argument, x. If x is an assumed shape complex array of rank 2, 3 or 4, the result is the complex array of the same shape and rank consisting of the DFT for each of the last rank's indices. Modules: Use t
387. [Sample error-handler output, garbled in extraction. Recoverable fragments: "FATAL ERROR 2 from C_Name"; "The argument should be positive"; "It now has value 0" and "now has value 1"; "FORWARD Calls: MP_SETUP 0, C_Name 4"; "Error Types and Codes 0 3"; a sequence of CALL E1S and CALL E1M statements.] Questions and Answers. Q 1: When do I need to use E1PSH and E1POP? A: They are not needed in every routine. They should be used in every subprogram that calls E1MES, either directly or indirectly. This is important during application debugging. To ignore further calls, the user can call E1PSH with the special name nullify_stack. The name stacking is restored with a call to E1POP using the same special name. Q 2: How can I tell if an error condition has occurred in a lower-level routine? A: When an error state has been set, the error type may be retrieved by referencing the INTEGER function N1RTY(1). The corresponding error code is retrieved by referencing the function N1RCD(1). The purpose of the error code is to allow the programmer to
388. sing optional data to the routine. The options are as follows:
    Packaged Options for lin_sol_self
    Option Prefix     Option Name                   Option Value
    s_, d_, c_, z_    lin_sol_self_set_small        1
    s_, d_, c_, z_    lin_sol_self_save_factors     2
    s_, d_, c_, z_    lin_sol_self_no_pivoting      3
    s_, d_, c_, z_    lin_sol_self_use_Cholesky     4
    s_, d_, c_, z_    lin_sol_self_solve_A          5
    s_, d_, c_, z_    lin_sol_self_scan_for_NaN     6
    s_, d_, c_, z_    lin_sol_self_no_sing_mess     7
    iopt(IO) = ?_options(?_lin_sol_self_set_small, Small): When Aasen's method is used, the tridiagonal system Tu = v is solved using LU factorization with partial pivoting. If a diagonal term of the matrix U is smaller in magnitude than the value Small, it is replaced by Small. The system is declared singular. When the Cholesky method is used, the upper-triangular matrix R (see Description) is obtained. If a diagonal term of the matrix R is smaller in magnitude than the value Small, it is replaced by Small. A solution is approximated based on this replacement in either case. Default: the smallest number that can be reciprocated safely.
    iopt(IO) = ?_options(?_lin_sol_self_save_factors, ?_dummy): Saves the factorization of A. Requires the optional argument pivots if the routine will be used for solving further systems with the same matrix. This is the only case where the input arrays A and b are not saved. For solving efficiency, the diagona
389. ssed in this chapter are designed to work well for these three classes of errors. IMSL Code and User Code: In this chapter, user code refers to routines written by the user and referenced by IMSL routines. Designating these sections as user code allows the error handler to use error-handling attributes set by the user. See the discussion of E1USR in the Error Control section of this chapter.
    Type, Class and Severity (Information Class):
    1. Informational/note. A note is issued to indicate the possibility of a trivial error or simply to provide information about the computations. Default attributes: PRINT=NO, STOP=NO.
    2. Informational/alert. This error type indicates that a function value has been set to zero due to underflow. Default attributes: PRINT=NO, STOP=NO.
    3. Informational/warning. This error type indicates the existence of a condition that may require corrective action by the user or calling routine. Usually the condition can be ignored. Default attributes (user code): PRINT=YES, STOP=NO. Default attributes (IMSL code): PRINT=NO, STOP=NO.
    4. Informational/fatal. This error type indicates the existence of a condition that may be a serious error. In most cases the user or calling routine must take corrective action to recover. Default attributes (user code): PRINT=YES, STOP=NO. Default attributes (I
390. ssignment C SURFACE_FITTING defined below The values M size C 1 and N size C 2 satisfies the respective identities N 1 spline_degree size _knotsx and M 1 spline_degree size _knotsy where the two right most quantities in both equations refer to components of the arguments knot sx and knotsy The same value of spline_degree must be used for both knotsx and knotsy Optional Arguments covariance G Input This argument when present results in the evaluation of the square root of the variance function elx y b x y Gb x y where b x y B x B y By x By y and G is the covariance matrix associated with the coefficients of the spline T C Cissy The argument G is an optional output from surface_fitting described below When the square root of the variance function is computed the arguments DERIVATIVE and C are not used iopt iopt Input This optional argument of derived type _options is not used in this release surface_fitting Weighted least squares fitting by tensor product B splines to discrete two dimensional data is performed Constraints on the spline or its partial derivatives are optional The spline function 112 Chapter 4 Curve and Surface Fitting with Splines IMSL Fortran 90 MP Library 4 0 Fy YoY eyBi 0 B 2 j l i l its derivatives or the square root of its variance function are evaluated after the fitting Required A
391. st_3dft_scale_inverse, real_part_of_scale); iopt(IO+1) = ?_options(?_dummy, imaginary_part_of_scale): The complex number defined by the factor cmplx(real_part_of_scale, imaginary_part_of_scale) is multiplied by the inverse transformed array. Default value is 1. Description: The fast_3dft routine is a Fortran 90 version of the FFT suite of IMSL (1994, pp. 772-776). Fatal and Terminal Messages: See the messages.gls file for error messages for fast_3dft. These error messages are numbered 685-695; 740-750.
    Chapter 4: Curve and Surface Fitting with Splines
    Contents
    spline_constraints
    spline_values
    spline_fitting
        Example 1: Natural Cubic Spline Interpolation to Data
        Example 2: Shaping a Curve and its Derivatives
        Example 3: Splines Model a Random Number Generator
        Example 4: Represent a Periodic Curve
    surface_constraints
    surface_values
    surface_fitting
        Example 1: Tensor Product Spline Fitting of Data
        Example 2: Parametric Representation of a Sphere
        Example 3: C
392. sum(abs(res))/abs(d(1)) <= sqrt(epsilon(one))) then write (*,*) 'Example 3 for LIN_EIG_SELF is correct.' end if end if end
    Example 4: Analysis and Reduction of a Generalized Eigensystem
    A generalized eigenvalue problem is $Ax = \lambda Bx$, where A and B are $n \times n$ self-adjoint matrices. The matrix B is positive definite. This problem is reduced to an ordinary self-adjoint eigenvalue problem $Cy = \lambda y$ by changing the variables of the generalized problem to an equivalent form. The eigenvalue-eigenvector decomposition $B = VSV^T$ is first computed, labeling an eigenvalue too small if it is less than epsilon(1.d0). The ordinary self-adjoint eigenvalue problem is $Cy = \lambda y$, provided that the rank of B, based on this definition of Small, has the value n. In that case $C = DV^TAVD$, where $D = S^{-1/2}$. The relationship between x and y is summarized as $X = VDY$, computed after the ordinary eigenvalue problem is solved for the eigenvectors Y of C. The matrix X is normalized so that each column has Euclidean length of value one. This solution method is nonstandard for any but the most ill-conditioned matrices B. The standard approach is to compute an ordinary self-adjoint problem following computation of the Cholesky decomposition $B = R^TR$, where R is upper triangular. The computation of C can also be completed efficiently by exploiting its self-adjoint property. See Golub and V
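The change of variables described in Example 4 can be verified in a few lines. What follows is a sketch for the real symmetric case, under the assumption that the scaling matrix is $D = S^{-1/2}$ (for complex self-adjoint matrices, read $V^T$ as the conjugate transpose):

```latex
\begin{aligned}
B &= V S V^{T}, \qquad V^{T} V = I, \qquad D = S^{-1/2},\\
A x &= \lambda B x, \qquad x = V D y,\\
D V^{T} A V D \, y &= \lambda \, D V^{T} \left( V S V^{T} \right) V D \, y
                    = \lambda \, D S D \, y = \lambda y,\\
C &= D V^{T} A V D \quad \Longrightarrow \quad C y = \lambda y .
\end{aligned}
```

The step $DSD = I$ needs every diagonal entry of S to be positive, which is why the reduction applies only when B has full rank n.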
393. t type s_spline_knots break_points type s_spline_constraints constraints 1 tdata i 1 delta_t i 1 ndata xspline_data 1 tdata yspline_data 1l tdata xspline_data 2 xdata yspline_data 2 ydata xspline_data 3 sddata yspline_data 3 sddata bkpt nord nbkpt ndegree i nord delta_b amp i nord nbkpt ndegree Collapse the outside knots bkpt 1 ndegree bkpt nord bkpt nbkpt ndegree 1 nbkpt bkpt nbkpt ndegree Assign the degr of the polynomial and the knots pointer_bkpt gt bkpt break_points s_spline_knots ndegree pointer_bkpt Make the two parametric curves also periodic constraints 1l spline_constraints amp derivative 0 point bkpt nord type amp value bkpt nbkpt ndegree xcoeff spline_fitting data xspline_data knots break_points amp constraints constraints ycoeff spline_fitting data yspline_data knots break_points amp constraints constraints Use the splines to compute the coordinates of points along the perimeter Compare them with the coordinates of the edge points tvalues i 1 delta_v i 1 nvalues xvalues spline_values 0 tvalues break_points xcoeff yvalues spline_values 0 tvalues break_points ycoeff do i l nvalues j 1 1 ngridt l delta_x xdata jt 1 xdata j ngrid delta_y ydata jt 1 ydata j ngrid IMSL Fortran 90 MP Library 4 0 Chapter 4 Curve and Surface
394. t BKPT I LEFT_OF LEFT_OF 1 end do Use Newton s method to solve the nonlinear equation accumulated_distribution_function random_number 0 alpha_x x bkpt LEFT_OF half beta_x xtbkpt LEFT_OF half FN QUAD LEFT_OF NORD RN DO I 1 NQUAD FN FN QW I spline_values 0 alpha_x QX I beta_x amp break_points coeff alpha_x END DO This is the Newton method update step x x fn spline_values 0 x break_points coeff niterat niterattl Constrain the values so they fall back into the interval Newton s method may give approximates outside the interval where x lt one or X gt one x zero if norm fn 1 lt sqrt epsilon one norm x 1 amp exit solve_equation end do solve_equation Check that Newton s method converges if niterat lt limit then write Example 3 for SPLINE_FITTING is correct end if T end Example 4 Represent a Periodic Curve The curve tracing the edge of a rectangular box traversed in a counter clockwise direction is parameterized with a spline representation for each coordinate function x t y t The functions are constrained to be periodic at the ends of the parameter interval Since the perimeter arcs are piece wise linear functions the degree of the splines is the value one Some breakpoints are chosen so they correspond to corners of the box where the derivatives of the
395. t Derived type array with the same precision as the input array used for passing optional data to spline_fitting The options are as follows Packaged Options for spline_fitting Prefix None Option Name Option Value spline_fitting_tol_equal spline_fitting_tol_least iopt IO _options spline_fitting_tol_equal _value This resets the value for determining that equality constraint equations are rank deficient The default is 2_value 10 iopt IO _options spline_fitting_tol_least _value This resets the value for determining that least squares equations are rank deficient The default is _value 1074 Description This routine has similar scope to CONFT DCONFT found in IMSL 1994 pp 551 560 We provide the square root of the variance function but we do not provide for constraints on the integral of the spline The least squares matrix problem for the coefficients is banded with band width equal to the spline order This fact is used to obtain an efficient solution algorithm when there are no constraints When constraints are present the routine solves a linear least squares problem with equality and inequality constraints The processed least squares equations result in a banded and upper triangular matrix following accumulation of the spline fitting equations The algorithm used for solving the constrained least IMSL Fortran 90 MP Library 4 0 Chapter 4 Curve and Surface Fitting with Splines 103 squar
396. tain random numbers call rand_gen x Calculate each partial mean do i l1 n mean_1 i sum x 1 1i i end do Calculate each partial variance do i l n s_1 i sum x 1 i mean_1 i 2 i end do mean_2 0 zero mean_2 1 x 1 s_2 0 1 zero Alternately calculate each running mean and variance handling the random numbers once do i 2 n mean_2 i i 1 mean_2 i 1 x i i s_2 i i 1 s_2 i 1 it mean_2 i x i 2 i 1 end do Check that the two sets of means and variances agree if maxval abs mean_1 1 mean_2 1 mean_1 1 lt amp sqrt epsilon one then if maxval abs s_1 2 s_2 2 s_1 2 lt amp sqrt epsilon one then write Example 1 for RAND_GEN is correct end if end if end Optional Arguments irnd irnd Output Rank 1 integer array These integers are the internal results of the Generalized Feedback Shift Register GFSR algorithm The values are scaled to yield the floating point array x The output array entries are between and 2 _ 1 in value istate_in istate_in Input Rank 1 integer array of size 3p 2 where p 521 that defines the ensuing state of the GFSR generator It is used to reset the internal tables to a previously defined state It is the result of a previous use of the optional argument 6 istate_out istate_out istate_out Output Rank 1 integer array of size 3p 2 that describes the current sta
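The one-pass update of the mean and variance used in this example can be tried without the library. The program below is a minimal sketch, assuming the intrinsic random_number as a stand-in for rand_gen; the program name and the kind parameter dp are illustrative and do not come from the guide. It computes the partial means and (population) variances both directly and by the running recurrences, then compares them.

```fortran
program running_stats_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 50
  real(dp) :: x(n), mean_1(n), s_1(n), mean_2(n), s_2(n)
  integer :: i

  ! Stand-in for rand_gen: uniform random numbers in [0,1).
  call random_number(x)

  ! Direct evaluation: each partial mean and variance rescans x(1:i).
  do i = 1, n
    mean_1(i) = sum(x(1:i)) / i
    s_1(i) = sum((x(1:i) - mean_1(i))**2) / i
  end do

  ! Running evaluation: each x(i) is handled once.
  mean_2(1) = x(1)
  s_2(1) = 0
  do i = 2, n
    mean_2(i) = ((i-1)*mean_2(i-1) + x(i)) / i
    s_2(i) = (i-1)*s_2(i-1)/i + (mean_2(i) - x(i))**2/(i-1)
  end do

  ! The two sets of results should agree to rounding error.
  if (maxval(abs(mean_1 - mean_2)) <= sqrt(epsilon(1d0)) .and. &
      maxval(abs(s_1 - s_2)) <= sqrt(epsilon(1d0))) then
    print *, 'running mean and variance agree with direct sums'
  else
    print *, 'mismatch between running and direct results'
  end if
end program running_stats_sketch
```

The running form costs O(n) operations in total, whereas the direct form costs O(n**2) because every partial sum is recomputed.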
397. taining the right-hand side matrix. x (Output): Array of size n × nb containing the solution matrix.
    Example 1: Solving a Linear Least-squares System
    This example solves a linear least-squares system $Cx \cong d$, where C is a real $m \times n$ matrix with $m \ge n$. The least-squares solution is computed using the self-adjoint matrix $A = C^TC$ and the right-hand side $b = C^Td$. The $n \times n$ self-adjoint system $Ax = b$ is solved for x. This solution method is not as satisfactory, in terms of numerical accuracy, as solving the system $Cx \cong d$ directly by using the routine lin_sol_lsq. Also see operator_ex05, Chapter 6.
          use lin_sol_self_int
          use rand_gen_int
          implicit none
    ! This is Example 1 for LIN_SOL_SELF.
          integer, parameter :: m=64, n=32
          real(kind(1e0)), parameter :: one=1e0
          real(kind(1e0)) err
          real(kind(1e0)), dimension(n,n) :: A, b, x, res, y(m*n), &
             C(m,n), d(m,n)
    ! Generate two rectangular random matrices.
          call rand_gen(y)
          C = reshape(y,(/m,n/))
          call rand_gen(y)
          d = reshape(y,(/m,n/))
    ! Form the normal equations for the rectangular system.
          A = matmul(transpose(C),C)
          b = matmul(transpose(C),d)
    ! Compute the solution for Ax = b.
          call lin_sol_self(A, b, x)
    ! Check the results for small residuals.
          res = b - matmul(A,x)
          err = maxval(abs(res))/sum(abs(A)+abs(b))
          if (err <= sqrt(epsilon(one))) then
             write (*,*) 'Example 1 for
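For readers without the library at hand, the normal-equations approach of this example can be sketched with Fortran intrinsics only. The hand-written Cholesky solve below is a stand-in chosen for this sketch; it is not the method lin_sol_self uses, and the program name, problem sizes and variable names are illustrative assumptions. The check relies on the fact that at a least-squares solution the gradient $C^T(d - Cx)$ vanishes.

```fortran
program normal_equations_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: m = 8, n = 3
  real(dp) :: C(m,n), d(m), A(n,n), b(n), L(n,n), y(n), x(n), g(n)
  integer :: i, j

  ! Random rectangular system C x ~ d (stand-in for rand_gen).
  call random_number(C)
  call random_number(d)

  ! Form the normal equations A x = b with A = C^T C and b = C^T d.
  A = matmul(transpose(C), C)
  b = matmul(transpose(C), d)

  ! Cholesky factorization A = L L^T, with L lower triangular.
  L = 0
  do j = 1, n
    L(j,j) = sqrt(A(j,j) - sum(L(j,1:j-1)**2))
    do i = j+1, n
      L(i,j) = (A(i,j) - sum(L(i,1:j-1)*L(j,1:j-1))) / L(j,j)
    end do
  end do

  ! Forward substitution L y = b, then back substitution L^T x = y.
  do i = 1, n
    y(i) = (b(i) - sum(L(i,1:i-1)*y(1:i-1))) / L(i,i)
  end do
  do i = n, 1, -1
    x(i) = (y(i) - sum(L(i+1:n,i)*x(i+1:n))) / L(i,i)
  end do

  ! The gradient C^T (d - C x) should be zero to rounding error.
  g = matmul(transpose(C), d - matmul(C, x))
  if (maxval(abs(g)) <= sqrt(epsilon(1d0))) then
    print *, 'normal-equations solution verified'
  else
    print *, 'residual gradient too large'
  end if
end program normal_equations_sketch
```

Forming $C^TC$ squares the condition number of C, which is the accuracy concern the text raises when it recommends solving the rectangular system directly.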
398. te For purposes of maximum efficiency an option is passed to routine 1in_sol_self so that pivoting is not used in the computation of the Cholesky decomposition of matrix B This example does not require that secondary option Also see operator _ex34 Chapter 6 74 Chapter 2 Singular Value and Eigenvalue Decomposition IMSL Fortran 90 MP Library 4 0 use lin_geig_gen_int use lin_sol_self_int use rand_gen_int implicit none This is Example 2 for LIN_GEIG_GEN integer i integer parameter n 32 real kind 1d0 parameter one 1 0d0 zero 0 0d0 real kind 1d0 beta n temp_c n n temp_d n n err type d_options iopti 4 d_options 0 zero complex kind 1d0 dimension n n A B C D V alpha n Generate random matrices for both A and B do i l n call rand_gen temp_c 1 n i call rand_gen temp_d 1 n i end do c temp_c d temp_c do i l n call rand_gen temp_c 1 n i call rand_gen temp_d 1 n i end do c cmplx real c temp_c kind one d cmplx real d temp_d kind one a conjg transpose c c b matmul conjg transpose d d Set option so that the generalized eigenvalue solver uses an efficient method for well posed self adjoint problems iopti 1 d_options z_lin_geig_gen_self_adj_pos zero iopti 2 d_options z_lin_geig_gen_for_lin_sol_self zero Number of secondary optional data items and the options iopti 3 d_options 1 zero iopt
399. te of the GFSR generator. It is normally used to later reset the internal tables to the state defined following a return from the GFSR generator. It is the result of a use of the generator without a user initialization, or it is the result of a previous use of the optional argument istate_in, followed by updates to the internal tables from newly generated values. Example 2 illustrates use of istate_in and istate_out for setting and then resetting rand_gen so that the sequence of integers, irnd, is repeatable. iopt = iopt(:) (Input/Output): Derived type array with the same precision as the array x; used for passing optional data to rand_gen. The options are as follows:
    Packaged Options for rand_gen
    Option Prefix     Option Name                    Option Value
    s_, d_            rand_gen_generator_seed        1
    s_, d_            rand_gen_LCM_modulus           2
    s_, d_            rand_gen_use_Fushimi_start     3
    iopt(IO) = ?_options(?_rand_gen_generator_seed, ?_dummy): Sets the initial values for the GFSR. The present value of the seed, obtained by default from the real-time clock as described below, swaps places with iopt(IO+1)%idummy. If the seed is set before any current usage of rand_gen, the exchanged value will be zero.
    iopt(IO) = ?_options(?_rand_gen_LCM_modulus, ?_dummy); iopt(IO+1) = ?_options(modulus, ?_dummy): Sets the initial values for the GFSR. The present value of the LCM, with default value k = 16807, s
400. ted and a Fortran 90 STOP is executed. Users may want to change this rule. This is illustrated by continuing and not printing the error message. The following is an additional source to accomplish this for all following invocations of the operator .i.: allocate (inverse_options(1)); inverse_options(1) = skip_error_processing; B = .i. A. There are additional self-documenting integer parameters, packaged in the module linear_operators, that allow users other choices, such as changing the value of the tolerance, as noted above. Included will be the ability to have the option apply for just the next invocation of the operator. Options are available that allow optional data to be passed to supporting Fortran 90 subroutines. This is illustrated with an example in operator_ex36 in this chapter.
    Operators: .x., .tx., .xt., .hx., .xh.
    Compute matrix-vector and matrix-matrix products. The results are in a precision and data type that ascends to the most accurate or complex operand. The operators apply when one or both operands are rank-1, rank-2 or rank-3 arrays. Required Operands: Each of these operators requires two operands. Mixing of intrinsic floating-point data type arrays is permitted. There is no distinction made between a rank-1 array, considered a slim matrix, and the transpose of this matrix. Defined operations have lower precedence
401. than any intrinsic operation, so the liberal use of parentheses is suggested when mixing them. Modules: Use the appropriate one of the modules operation_x, operation_tx, operation_xt, operation_hx, operation_xh, or linear_operators. Optional Variables, Reserved Names: These operators have neither packaged optional variables nor reserved names. Examples: Compute the matrix times vector $y = Ax$: y = A .x. x. Compute the vector times matrix $y = x^TA$: y = x .x. A; y = A .tx. x. Compute the matrix expression $D = B - AC$: D = B - (A .x. C).
    Operators: .t., .h.
    Compute transpose and conjugate transpose of a matrix. The operation may be read transpose or adjoint, and the results are the mathematical objects in a precision and data type that matches the operand. The operators apply when the single operand is a rank-2 or rank-3 array. Required Operand: Each of these operators requires a single operand. Since these are unary operations, they have higher Fortran 90 precedence than any other intrinsic unary array operation. Modules: Use the appropriate one of the modules operation_t, operation_h, or linear_operators. Optional Variables, Reserved Names: These operators have neither packaged optional variables nor reserved names. Examples: Compute the matrix times vector $y = A^Tx$: y = .t. A .x. x; y = A .tx. x. Compute the vector times matrix
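As a point of comparison, the products that the defined operators express can be written with standard Fortran intrinsics. The sketch below uses only matmul and transpose; the correspondence to .x. and .tx. noted in the comments is our reading of the operator descriptions, and the program and variable names are illustrative.

```fortran
program operator_analogs_sketch
  implicit none
  real :: A(2,2), x(2), y1(2), y2(2), y3(2)

  ! reshape fills column by column, so A = [1 2; 3 4].
  A = reshape((/1.0, 3.0, 2.0, 4.0/), (/2, 2/))
  x = (/1.0, 1.0/)

  y1 = matmul(A, x)               ! analog of y = A .x. x  : gives (3, 7)
  y2 = matmul(x, A)               ! analog of y = x .x. A  : gives (4, 6)
  y3 = matmul(transpose(A), x)    ! analog of y = A .tx. x : gives (4, 6)

  print *, y1
  print *, y2
  print *, y3
  if (all(y2 == y3)) print *, 'vector-times-matrix and transpose-times-vector agree'
end program operator_analogs_sketch
```

The last check reflects the remark in the text that a rank-1 array and its transpose are not distinguished: the vector-times-matrix product equals the transposed matrix times the vector.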
402. the inverse iteration problems. Default: singular vectors computed using cyclic reduction. iopt(IO) = ?_options(?_lin_svd_set_perf_ratio, perf_ratio): Uses residuals for approximate normalized singular vectors if they have a performance index no larger than perf_ratio. Otherwise an alternate approach is taken and the singular vectors are computed again: standard elimination is used instead of cyclic reduction, or the standard QR algorithm is used as a backup procedure to inverse iteration. Larger values of perf_ratio are less likely to cause these exceptions. Default: perf_ratio = 4. Description: Routine lin_svd is an implementation of the QR algorithm for computing the SVD of rectangular matrices. An orthogonal reduction of the input matrix to upper bidiagonal form is performed. Then the SVD of a real bidiagonal matrix is calculated. The orthogonal decomposition $AV = US$ results from products of intermediate matrix factors. See Golub and Van Loan (1989, Chapter 8) for details.
    Additional Examples
    Example 2: Linear Least Squares with a Quadratic Constraint
    An $m \times n$ matrix equation $Ax \cong b$, $m > n$, is approximated in a least-squares sense. The matrix b is size $m \times k$. Each of the k solution vectors of the matrix x is constrained to have Euclidean length of value $\alpha_j > 0$. The value of $\alpha_j$ is chosen so that the constrained solution is 0.25 the length of the nonregularized or standard least-squares equation. See Golub and Van Loan (1989, Chapte
403. the matrix times vector 1 2 3 4 5 y Als y i A x x jy A ix x Compute the vector times matrix y x Al y ux x i A y x Xi A Compute the matrix expression D B A C D B iA x C D B Option Value Name of Unallocated Array d_inv_options_once IMSL Fortran 90 MP Library 4 0 Operators ix xi Compute the inverse matrix times a vector or matrix for square non singular matrices or the corresponding Moore Penrose generalized inverse matrix for singular square matrices or rectangular matrices The operation may be read generalized inverse times or times generalized inverse The results are in a precision and data type that matches the most accurate or complex operand Required Operand This operator requires two operands In the template for usage y A ix b the first operand A can be rank 2 or rank 3 The second operand b can be rank 1 rank 2 or rank 3 For the alternate usage template y b xi A the first operand b can be rank 1 rank 2 or rank 3 The second operand A can be rank 2 or rank 3 Modules Use the appropriate one of the modules operation_ix operation_xi or linear_operators Optional Variables Reserved Names This operator uses the routines lin_sol_gen or lin_sol_lsq See Chapter 1 Linear Solvers lin_sol_gen and lin_sol_1sq The option and derived type names are given in the following tables Xi xi_ Option Names for ix xi Option Value use_lin_sol_gen_o
404. the same shape as the histogram This is not required to generate samples The program next generates a summary set of integers according to the histogram These are not repeatable and are representative of the histogram in the sense of looking at 20 integers during generation of a large number of samples use rand_gen_int use show_int implicit none 130 Chapter 5 Utilities IMSL Fortran 90 MP Library 4 0 This is Example 3 for RAND_GEN integer i i_bin i_map i_left i_right integer parameter n_work 1000 integer parameter n_bins 10 integer parameter scale 1000 integer parameter total_counts 100 integer parameter n_samples total_counts scale integer dimension n_bins histogram amp J r GB May 1207 Ep Lap 9 hp B37 integer dimension n_work working 0 integer dimension n_bins distribution 0 integer break_points 0 n_bins real kind le0 rn n_samples real kind le0O parameter tolerance 0 005 integer parameter n_samples_20 20 integer rand_num_20 n_samples_20 real kind le0 rn_20 n_samples_20 Compute the normalized cumulative distribution break_points 0 0 do i 1 n_bins break_points i break_points i 1 histogram i end do break_points break_points n_work total_counts Obtain uniform random numbers call rand_gen rn Set up the secondary mapping array do i_bin 1 n_bins i_left break_points i_bin 1 1 i_right break_points i_bin do
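The idea of generating integers with the same frequency as a given histogram can be illustrated with intrinsics alone. The program below is a minimal sketch: it builds the normalized cumulative distribution and maps each uniform random number to the first bin whose cumulative value exceeds it. It searches the cumulative array directly instead of using the secondary mapping array of the example, random_number stands in for rand_gen, and the histogram counts are hypothetical values chosen for this sketch.

```fortran
program histogram_sampling_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n_bins = 10, n_samples = 100000
  ! Hypothetical histogram counts for this sketch (they sum to 100).
  integer, parameter :: histogram(n_bins) = &
      (/4, 6, 8, 14, 20, 17, 12, 9, 7, 3/)
  integer :: break_points(0:n_bins), counts(n_bins)
  real(dp) :: u, cdf(n_bins), err
  integer :: i, k

  ! Cumulative counts, then the normalized cumulative distribution.
  break_points(0) = 0
  do i = 1, n_bins
    break_points(i) = break_points(i-1) + histogram(i)
  end do
  cdf = real(break_points(1:n_bins), dp) / break_points(n_bins)

  ! Map each uniform number u in [0,1) to the first bin with u < cdf(i).
  ! cdf(n_bins) is exactly 1, so the search always exits inside the loop.
  counts = 0
  do k = 1, n_samples
    call random_number(u)
    do i = 1, n_bins
      if (u < cdf(i)) exit
    end do
    counts(i) = counts(i) + 1
  end do

  ! The sample frequencies should approximate the histogram frequencies.
  err = maxval(abs(real(counts, dp)/n_samples &
                   - real(histogram, dp)/break_points(n_bins)))
  if (err < 0.01_dp) then
    print *, 'sample frequencies match the histogram'
  else
    print *, 'sample frequencies deviate, max error =', err
  end if
end program histogram_sampling_sketch
```

A precomputed mapping array, as in the example, replaces the inner search with a single table lookup per sample, which matters when many samples are drawn.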
405. the variance function is evaluated. The result will be a scalar value when the input independent variable is scalar. Required Arguments:
    derivative = derivative(1:2) (Input): The indices of the partial derivative evaluated. Use non-negative integer values. For the function itself use the array (/0,0/).
    variablesx = variablesx (Input): The independent variable values in the first or x dimension where the spline or its derivatives are evaluated. Either a rank-1 array or a scalar can be used as this argument.
    variablesy = variablesy (Input): The independent variable values in the second or y dimension where the spline or its derivatives are evaluated. Either a rank-1 array or a scalar can be used as this argument.
    knotsx = knotsx (Input): The derived type ?_spline_knots, used when the array coeffs was obtained with the function SURFACE_FITTING. This contains the polynomial spline degree, the number of knots, and the knots themselves, in the x dimension.
    knotsy = knotsy (Input): The derived type ?_spline_knots, used when the array coeffs was obtained with the function SURFACE_FITTING. This contains the polynomial spline degree, the number of knots, and the knots themselves, in the y dimension.
    coeffs = c (Input): The coefficients in the representation for the spline function. These result from the fitting process or array a
406. tion ON TEND 4D0 p Ear ET 28 A Z TWO 2D0 THREE 3D0 I ES M IM IM EL TIM Teh SIM V_MIN ALLOCA IOPT 4 SHOW_INTOPT 2 ABLI ry L PROCS MP_S PRIOR the RKS TRUE ry S_OPTIONS I ah COSl MPI ROOT_WORKS NOT _ROOT_WO UP 1 TNS AGES I 1 program stops PENRROCS LE M FALSE IXS caeh CATE pP COUNTS MP_NP Get time s p ROCS Ler E TI IF ULATE Bie SE D ry MPI_WTIM P RANK AV_TIME for simula MP_NPROCS SHOW_IOPT 2 t AV IM aay a MAX_TIME MP_NPROCS Change Le MP_NPROCS MAX_TIME COR MP_NPROCS tion timing OPEN FILE P D ex09 out UNIT 7 Pick random parameter values iB p EPS 1D 1 ON E rand EPS 1D 1 ONI E rand P 296 Chapter 8 Partial Differential Equations IMSL Fortran 90 MP Library 4 0 ETA 10D0 ONE rand ETA Start loop to integrate and communicate solution times IDO 1 Get time start for each new problem DO IF NOT MPI_ROOT_WORKS and MP_RANK 0 EXIT
407. tion about correcting the error condition. Avoid calls to E1PSH and E1POP if this routine is at the most forward level of the call chain and there is no error condition. Use of these routines is expensive. Consider using the nullify_stack argument for operational use. When this special name is an argument to E1PSH, the package ceases stacking names. When it is an argument to E1POP, the package resumes stacking names and prints any error messages at Level 1.
    Error Message Formats and Examples
    Error messages are developed from arguments in the program unit that calls E1MES. Additional information is inserted into the text, including drop-in values used to clarify the error, type meaning, subprogram name where the error occurred, and node names and ranks where the application is executing. The message is printed by lines, breaking on a blank if possible. The number of columns in a line has the default value SCREEN_SIZE = 72. This can be reset using the routine E1HDR as follows: CALL E1HDR(NEW_SCREEN_SIZE). The sign of the INTEGER variable NEW_SCREEN_SIZE determines the action. If its value is non-positive, then the argument is an output, assigned the current value of SCREEN_SIZE. For positive values of the argument, the value of SCREEN_SIZE is s
408. tion access from this point forward. HANDLE = N1THRD(): This INTEGER function gives a handle for purposes of identifying the execution thread. The default routine returns HANDLE = 1. TEST = N1MTCH(HANDLE_1, HANDLE_2): This INTEGER function compares two thread handles for equality. The default routine returns the bit-wise exclusive-or value, N1MTCH = ieor(HANDLE_1, HANDLE_2).
    Appendix A: List of Subprograms and GAMS Classification
    The routines listed below are the generic names typically called by users in their codes. In fact, there is no external library name in IMSL F90 MP Library that matches these generic names. The generic name is associated at compile time with a specific external name that is appropriate for that data type. The specific external names are not listed below. Note that appearing in the Chapter column means that the routine is not intended to be user-callable.
    Routine      Purpose                                                                    Chapter          GAMS
    error_post   Prints error messages that are generated by IMSL Library routines.       See Chapter 5    R3
    fast_dft     Computes the Discrete Fourier Transform (DFT) of a rank-1 complex array, x.   See Chapter 3    J1a2
    fast_2dft    Computes the Discrete Fourier Transform (DFT) of a rank-2 complex array  See Chapter 3    J1b
    fast_3dft
    isNaN
    lin_eig_gen
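The exclusive-or comparison described for the default handle-matching routine can be demonstrated with the intrinsic ieor. The sketch below does not call any library routine; the program name and handle values are illustrative. Two integer handles are equal exactly when their bit-wise exclusive or is zero.

```fortran
program handle_match_sketch
  implicit none
  integer :: handle_1, handle_2

  ! Equal handles: every bit cancels, so the exclusive or is zero.
  handle_1 = 1
  handle_2 = 1
  if (ieor(handle_1, handle_2) == 0) print *, 'handles match'

  ! Different handles: at least one bit differs, so the result is nonzero.
  handle_2 = 5
  if (ieor(handle_1, handle_2) /= 0) print *, 'handles differ'
end program handle_match_sketch
```

A caller would therefore treat a zero result as "same thread" and any nonzero result as "different thread".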
409. tion is desired the program ends here hat are about rounding ames Mehemsuzeom sehnomconsitieadnate equation and right hand side They are replaced DY Ox acl Zero WHERE W ZERO X ZERO W X Each group of residuals is disjoint per pr We add all the pieces together for the tota constraint LF CA M S P_NP LM WCS Ss l e PI_REDUCE X W N MET BE UTS _NPROC SUM iy P_LIBRARY_ WORLD IER P_RANK and PRINT amp HOW W Residuals for the constraints errors and shut down MPI S MP_SETUP Final P_ RANK THEN COUNT W lt ZI Example 1 for PARALL ocessor 1 set of DOUBLE __PRECISION amp ERO 0 WRITE ROR amp EL_NONN EGATIV TISO als CODISE Y Example 2 Distributed Non negative Least Squares The program PNLSQ_EX2 illustrates the computation of the solution to a system of linear least squares equations with simple constraints T a x b i 1 m subject to x 20 In this example we write the row vectors a b on a file This illustrates reading the data by rows and arranging the Chapter 7 ScaLAPACK Utilities and Large Scale Parallel Solvers e 249 data by columns as required by PARALLEL_NONNEGATIVE_LSQ After reading the data the right hand side vector is broadcast to the group before computing a so
410. tion y x evaluated at the grid of points Also see operator_ex10 Chapter 6 use lin_sol_lsq_int implicit none This is Example 2 for LIN_SOL_LSQ integer i integer parameter m 128 n 8 real kind 1d0 parameter one 1 0d0 zero 0 0d0 real kind 1d0 a m O n c O n 1 pi_over_2 x m y m 1 amp u m v m w m delta_x inv 0O n m Generate an array of equally spaced points on the interval 1 1 delta_x 2 real m 1 kind one do i l m x i one i 1 delta_x end do Compute the constant PI 2 pi_over_2 atan one 2 Compute data values on the grid y l m 1 exp x cos pi_over_2 x Fill in the least squares matrix for the Chebyshev polynomials 22 Chapter 1 Linear Solvers IMSL Fortran 90 MP Library 4 0 do i 2 n aliri 2 x a t 4 1 ae 4 2 end do Compute the generalized inverse of the least squares matrix call lin_sol_lsq a y c nrhs 0 ainv inv Compute the series coefficients using the generalized invers as smoothing formulas c 0 n 1 matmul inv 0O n 1 m y 1 m 1 Evaluate residuals using backward recurrence formulas u zero v zero do i n 0 1 w 2 x u v c i 1 v au u w end do y 1 m 1 exp x cos pi_over_2 x u x v Check that n 2 sign changes in the residual curve occur This test will fail when n is larger x one x sign x y 1 m 1 if count x l m 1 x 2 m n 2 t
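The backward recurrence used in this example to evaluate the fitted series is the Clenshaw scheme for a Chebyshev sum $\sum_{k=0}^{n} c_k T_k(x)$. The program below is a minimal sketch with made-up coefficients; it runs the same three-term update (w = 2xu - v + c(i), then shift) and compares the result u - xv with a direct evaluation based on $T_k(x) = \cos(k \arccos x)$. The program name, coefficients and evaluation point are assumptions for illustration.

```fortran
program clenshaw_sketch
  implicit none
  integer, parameter :: dp = kind(1d0)
  integer, parameter :: n = 5
  real(dp) :: c(0:n), x, u, v, w, direct
  integer :: i

  ! Hypothetical series coefficients and an evaluation point in [-1, 1].
  c = (/1.0_dp, -0.5_dp, 0.25_dp, 0.125_dp, -0.0625_dp, 0.03125_dp/)
  x = 0.3_dp

  ! Backward (Clenshaw) recurrence.
  u = 0
  v = 0
  do i = n, 0, -1
    w = 2*x*u - v + c(i)
    v = u
    u = w
  end do

  ! Direct evaluation with T_k(x) = cos(k*acos(x)) for comparison.
  direct = 0
  do i = 0, n
    direct = direct + c(i)*cos(i*acos(x))
  end do

  if (abs((u - x*v) - direct) <= sqrt(epsilon(1d0))) then
    print *, 'Clenshaw recurrence matches direct evaluation'
  else
    print *, 'mismatch:', u - x*v, direct
  end if
end program clenshaw_sketch
```

The recurrence needs no explicit polynomial values, which is why the example can evaluate residuals on the whole grid with array arithmetic.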
411. tional equation ov w dug Thus ow _ dw d ax Since the initial data for v x 0 uo the variational equation initial condition is w x 0 1 This model problem illustrates the method of lines and Galerkin principle implemented with the differential algebraic solver D2SPG IMSL 1994 pp 696 717 We use the integrator in reverse communication mode for evaluating the required functions derivatives and solving linear algebraic equations See Example 4 of routine DASPG IMSL 1994 pp 713 717 for a problem that uses reverse communication Next see Example 4 of routine IvPAG IMSL 1994 p 674 678 for the development of the piecewise linear Galerkin discretization method to solve the differential equation This present example extends parts of both previous examples and illustrates Fortran 90 constructs It further illustrates how a user can deal with a defect of an integrator that normally functions using only dense linear algebra factorization methods for solving the corrector equations See the comments in Brenan et al 1989 esp p 137 Also see operator _ex20 Chapter 6 42 Chapter 1 Linear Solvers IMSL Fortran 90 MP Library 4 0 use lin_sol_tri_int use rand_gen_int use Numerical_Libraries implicit none This is Example 4 for LIN_SOL_TRI integer parameter n 1000 ichap 5 iget 1 iput 2 amp inum 6 irnum 7 real kind le0 parameter zero 0e0 one 1le0 integer i ido in 50 inr 2
412. tional argument is provided indicating that A is positive definite so that the Cholesky decomposition can be used.
  Solves a rectangular least-squares system of linear equations $Ax \cong b$ using singular value decomposition, $A = USV^T$. Using optional arguments, any of several related computations can be performed. These extra tasks include computing the rank of A, the orthogonal m × m and n × n matrices U and V, and the m × n diagonal matrix of singular values, S. See Chapter 1. D9a1, D6
  Solves multiple systems of linear equations $A_j x_j = y_j$, j = 1, ..., k. Each matrix $A_j$ is tridiagonal with the same dimension, n. The default solution method is based on LU factorization computed using cyclic reduction. An option is used to select Gaussian elimination with partial pivoting. See Chapter 1. D2a2a, D2c2a
  Computes the singular value decomposition (SVD) of a rectangular matrix, A. This gives the decomposition $A = USV^T$, where V is an n × n orthogonal matrix, U is an m × m orthogonal matrix, and S is a real, rectangular diagonal matrix. See Chapter 2. D6
  Returns, as a scalar function, a value corresponding to the IEEE 754 Standard format of floating point (ANSI/IEEE 1985) for NaN. See Chapter 6. R1
  Parallel routines for non-negative constrained linear least-squares based on a descent algorithm. See Chapter 7. K1a2
  Parallel routines for simple bounded constrained linear least-squares based on a descent algorithm. See Chapter 7. K1a2
  Append
413. tions 2 show_starting_index_is options 3 1 The starting value options 4 show_end_of_line_sequence_is options 5 2 Use 2 EOL characters options 6 10 The ASCII code for CR options 7 13 The ASCII code for NL BUFFER Blank out the buffer Prepare the output in BUFFER call Shiau S se Rank 1 REAL with 7 digits natural indexing from rank amp trim adjust1l PROC_NUM IMAGE BUFFER IOPT options do i 1 mp_nprocs 1 A handle or baton is received by the non root nodes call mpi_bcast BATON 1 MPI_INTEGER 0 amp MP_LIBRARY WORLD ierror I we cais goce has rie beron slic icelinsinslics ales OCES if BATON mp_rank amp call mpi_send buffer BSIZE MPI_CHARACTER 0 mp_rank amp MP_LIBRARY WORLD ierror end do IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 225 else DO I 1 MP_NPROCS 1 The root sends out a handle to a node It is received as the value BATON Cenk injoslocesie Ab ib IMi2 ar TUINfImE eiRe 0 fe MP_LIBRARY_ WORLD ierror A buffer of data arrives from a node call mpi_recv buffer BSIZE MPI_CHARACTER MPI_ANY_SOURCI MPI_ANY_TAG MP_LIBRARY_ WORLD STATUS IERROR ESI Qa Display BUFFER as a CHARACTER array Discard blanks on the ends Look for non printable characters as limits p 0 k LEN TRIM BUFF DISPLAL
414. to switch to equivalent subroutine calls using IMSL Fortran 90 MP Library routines or mathematical routines in the IMSL FORTRAN 77 Libraries.
  Defined Array Operation              Matrix Operation
  A .x. B                              $AB$
  .i. A                                $A^{-1}$
  .t. A, .h. A                         $A^T$, $A^*$
  A .ix. B                             $A^{-1}B$
  B .xi. A                             $BA^{-1}$
  A .tx. B, or (.t. A) .x. B           $A^TB$
  A .hx. B, or (.h. A) .x. B           $A^*B$
  B .xt. A, or B .x. (.t. A)           $BA^T$
  B .xh. A, or B .x. (.h. A)           $BA^*$

  Defined Array Functions              Matrix Operation
  S = SVD(A [,U=U, V=V])               $A = USV^T$
  E = EIG(A [[,B=B, D=D], V=V, W=W])   $(AV = VE)$, $AVD = BVE$, $(AW = WE)$, $AWD = BWE$
  Q = ORTH(A [,R=R])                   $A = QR$, $Q^TQ = I$
  U = UNIT(A)                          $[u_1, \ldots] = [a_1/\|a_1\|, \ldots]$
  F = DET(A)                           det(A) = determinant
  K = RANK(A)                          rank(A) = rank
  P = NORM(A [,[type=]i])              $p = \|A\|_1 = \max_j \sum_{i=1}^{m} |a_{ij}|$; $p = \|A\|_2 = s_1$, the largest singular value; $p = \|A\|_\infty$ (type = huge(1)) $= \max_i \sum_{j=1}^{n} |a_{ij}|$
  C = COND(A)                          $s_1 / s_{rank(A)}$
  Z = EYE(N)                           $Z = I_N$
  A = DIAG(X)                          $A = diag(x_1, \ldots)$
  X = DIAGONALS(A)                     $x = (a_{11}, \ldots)$
  W = FFT(Z); Z = IFFT(W)              Discrete Fourier Transform, Inverse
  A = RAND(A)                          random numbers, 0 < A < 1
  L = isNaN(A)                         test for NaN, if (l) then ...
Getting Started
It is strongly suggested that users force all program variables to be explicitl
415. tput Array of the same type and kind as A(1:m, 1:n). It contains the n × n orthogonal matrix V.
iopt = iopt(:) (Input) Derived type array with the same precision as the input matrix. Used for passing optional data to the routine. The options are as follows:
  Packaged Options for lin_sol_svd
  s_, d_, c_, z_   lin_sol_svd_set_small          1
  s_, d_, c_, z_   lin_sol_svd_overwrite_input    2
  s_, d_, c_, z_   lin_sol_svd_safe_reciprocal    3
  s_, d_, c_, z_   lin_sol_svd_scan_for_NaN       4
iopt(IO) = ?_options(?_lin_sol_svd_set_small, Small)
  Replaces with zero a diagonal term of the matrix S if it is smaller in magnitude than the value Small. This determines the approximate rank of the matrix, which is returned as the rank= optional argument. A solution is approximated based on this replacement. Default: the smallest number that can be safely reciprocated.
iopt(IO) = ?_options(?_lin_sol_svd_overwrite_input, ?_dummy)
  Does not save the input arrays A(:,:) and b(:,:).
iopt(IO) = ?_options(?_lin_sol_svd_safe_reciprocal, safe)
  Replaces a denominator term with safe if it is smaller in magnitude than the value safe. Default: the smallest number that can be safely reciprocated.
iopt(IO) = ?_options(?_lin_sol_svd_scan_for_NaN, ?_dummy)
  Examines each input array entry to find the first value such that isNaN(a(i,j)) .or. isNaN(b(i,j)) == .true. See the isNaN() function, Chapter 6. Default: Does not scan for NaNs.
Description
The lin_sol_svd routine solves a rectangular system of
416. tput lines in array BUFFER End each line with ASCII sequence CR NL options 1 show_significant_digits_is_7 options 2 show_starting_index_is options 3 1 The starting value options 4 show_end_of_line_sequence_is options 5 2 Use 2 EOL characters options 6 10 The ASCII code for CR options 7 13 The ASCII code for NL BUFFER Blank out the buffer Prepare the output in BUFFER call show S_x amp Rank 1 REAL with 7 digits natural indexing amp internal BUFFER CR NL EOLs amp IMAGE BUFFER IOPT options Display BUFF on the ends WRITE 1x A TRIM BUFFER R as a CHARACTER array Discard blanks E end Fatal and Terminal Error Messages See the messages gls file for error messages for show These error messages are numbered 601 606 611 617 621 627 631 636 641 646 140 Chapter 5 Utilities IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option Introduction This chapter describes numerical linear algebra software packaged as operations MPI REQUIRED that are executed with a function notation similar to standard mathematics The resulting interface is a great simplification It alters the way libraries are presented to the user Many computations of numerical linear algebra are documented here as operators and generic functions A notation is deve
417. trix.
Modules: Use the appropriate one of the modules rank_int or linear_operators.
Optional Variables, Reserved Names: This function uses lin_sol_svd to compute the singular values of the argument. The singular values are then compared with the value of the tolerance to compute the rank. The option and derived type names are given in the following tables:
  Option Name for RANK       Option Value
  s_rank_set_small           1
  s_rank_for_lin_sol_svd     2
  d_rank_set_small           1
  d_rank_for_lin_sol_svd     2
  c_rank_set_small           1
  c_rank_for_lin_sol_svd     2
  z_rank_set_small           1
  z_rank_for_lin_sol_svd     2

  Derived Type    Name of Unallocated Array
  s_options       s_rank_options(:)
  s_options       s_rank_options_once(:)
  d_options       d_rank_options(:)
  d_options       d_rank_options_once(:)
Example: Compute the rank of an array of random numbers and then the rank of an array where each entry is the value one.
      A = rand(A); k = rank(A)
      A = 1; k = rank(A)
SVD
Compute the singular value decomposition of a rank-2 or rank-3 array, $A = USV^T$.
Required Arguments: The argument must be a rank-2 or rank-3 array of any intrinsic floating-point type. The keyword arguments u= and v= are optional. The output array names used on the right-hand side must have sizes that are large enough to contain the right and left singular vectors, U
418. ts
  Example 3: Solving Parametric Linear Systems with a Scalar Change
  Example 4: Accuracy Estimates of Eigenvalues Using Adjoint and Ordinary Eigenvectors
  lin_geig_gen
  Example 1: Computing Generalized Eigenvalues
  Example 2: Self-Adjoint, Positive Definite Generalized Eigenvalue Problem
  Example 3: A Test for a Regular Matrix Pencil
  Example 4: Larger Data Uncertainty than Working Precision
lin_svd
Computes the singular value decomposition (SVD) of a rectangular matrix, A. This gives the decomposition $A = USV^T$, where V is an n × n orthogonal matrix, U is an m × m orthogonal matrix, and S is a real, rectangular diagonal matrix.
Required Arguments
  A (Input [/Output]) Array of size m × n containing the matrix.
  S (Output) Array of size min(m, n) containing the real singular values. These nonnegative values are in non-increasing order.
  U (Output) Array of size m × m containing the singular vectors, U.
  V (Output) Array of size n × n containing the singular vectors, V.
Example 1: Computing the SVD
The SVD of a square, random matrix A is computed. The residuals $R = AV - US$ are small with respect to working precision. Also see operator_e
419. ts proper order by the subscripted array assignment y = x(iperm(1:n)).
icycle = icycle(:) (Output) Permutations applied to the input data are converted to cyclic interchanges. Thus, the output array y is given by the following elementary interchanges: for i = 1, ..., n, set j = icycle(i) and swap y(j) with y(i).
iopt = iopt(:) (Input) Derived type array with the same precision as the input matrix; used for passing optional data to the routine. The options are as follows:
  Packaged Options for sort_real
  s_, d_   sort_real_scan_for_NaN
iopt(IO) = ?_options(?_sort_real_scan_for_NaN, ?_dummy)
  Examines each input array entry to find the first value such that isNaN(x(i)) == .true. See the isNaN() function, Chapter 6. Default: Does not scan for NaNs.
Description
The sort_real routine is a Fortran 90 version of SVRGN from the IMSL MATH/LIBRARY User's Manual (IMSL 1994, p. 1141).
Additional Examples
Example 2: Sort and Final Move with a Permutation
A set of n random numbers is sorted so the results are nonincreasing. The columns of an n × n random matrix are moved to the order given by the permutation defined by the interchange of the entries. Since the routine sorts the results to be algebraically nondecreasing, the array of negative values is used as input. Thus, the negative value of the sorted output order is nonincreasing. The optional argument iperm
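The negation trick and the role of the permutation in Example 2 can be sketched in standard Fortran 90. This is an illustration under stated assumptions: a plain insertion sort stands in for sort_real, and the data values are made up.

```fortran
program sort_sketch
  implicit none
  integer, parameter :: n = 5
  real :: x(n), y(n), t
  integer :: iperm(n), i, j, k

  x = (/ 0.3, 0.9, 0.1, 0.7, 0.5 /)      ! made-up data
  y = -x                                  ! negate: ascending y is descending x
  iperm = (/ (i, i = 1, n) /)

  ! Insertion sort into nondecreasing order (a stand-in for the library
  ! sort), carrying the permutation along with the data.
  do i = 2, n
     t = y(i)
     k = iperm(i)
     j = i - 1
     do while (j >= 1)
        if (y(j) <= t) exit
        y(j+1) = y(j)
        iperm(j+1) = iperm(j)
        j = j - 1
     end do
     y(j+1) = t
     iperm(j+1) = k
  end do

  y = -y                                  ! now nonincreasing
  print *, y
  print *, iperm
  ! The sorted data equals the input indexed by the permutation.
  print *, all(y == x(iperm))
end program sort_sketch
```

The same index array can then reorder the columns of a matrix, for example B = B(:, iperm), which is the "final move" the example describes.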
420. tup_int, it is not necessary to explicitly use this module. If neither MP_SETUP() nor MPI_Init() is called, then the box data type will compute entirely on one node. No routine from MPI will be called.
      MODULE MPI_NODE_INT
        INTEGER, ALLOCATABLE :: MPI_NODE_PRIORITY(:)
        INTEGER, SAVE :: MP_LIBRARY_WORLD = huge(1)
        LOGICAL, SAVE :: MPI_ROOT_WORKS = .TRUE.
        INTEGER, SAVE :: MP_RANK = 0, MP_NPROCS = 1
      END MODULE
When the function MP_SETUP() is called with no arguments, the following events occur:
  • If MPI has not been initialized, it is first initialized. This step uses the routines MPI_Initialized() and possibly MPI_Init(). Users who choose not to call MP_SETUP() must make the required initialization call before using any Fortran 90 MP Library code that relies on MPI for its execution. If the user's code calls a Fortran 90 MP Library function utilizing the box data type and MPI has not been initialized, then the computations are performed on the root node. The only MPI routine always called in this context is MPI_Initialized(). The name MP_SETUP is pushed onto the subprogram or call stack.
  • If MP_LIBRARY_WORLD equals its initial value (= huge(1)), then MPI_COMM_WORLD, the default MPI communicator, is duplicated and becomes its handle. This uses the routine MPI_Comm_dup(). U
421. u j k 1 NPDE. If any of the functions cannot be evaluated, set IRES = 3. Otherwise do not change its value.
boundary_conditions (Input) The name of an external subroutine, written by the user, when using forward communication. It gives the boundary conditions as expressed in Equation 2:
  $u(j) = u^j$, $dudx(j) = \partial u^j / \partial x$, $beta(j) = \beta_j(x, t, u, u_x)$, $gamma(j) = \gamma_j(x, t, u, u_x)$, j = 1, ..., NPDE.
The value $x \in \{x_L, x_R\}$, and the logical flag LEFT = .TRUE. for $x = x_L$. The flag has the value LEFT = .FALSE. for $x = x_R$.
IOPT (Input) Derived type array s_options or d_options, used for passing optional data to PDE_1D_MG. See the section Optional Data in the Introduction for an explanation of the derived type and its use. It is necessary to invoke a module, with the statement USE ERROR_OPTION_PACKET, near the second line of the program unit. Examples 2-8 use this optional argument. The choices are as follows:
  Packaged Options for PDE_1D_MG
  Option Prefix   Option Name                      Option Value
  s_, d_          PDE_1D_MG_CART_COORDINATES       1
  s_, d_          PDE_1D_MG_CYL_COORDINATES        2
  s_, d_          PDE_1D_MG_SPH_COORDINATES        3
  s_, d_          PDE_1D_MG_TIME_SMOOTHING         4
  s_, d_          PDE_1D_MG_SPATIAL_SMOOTHING      5
  s_, d_          PDE_1D_MG_MONITOR_REGULARIZING
422. uccessfully reached the end point, TOUT.
  IDO = 3  This value is assigned by the user at the end of a problem. The routine is called by the user with this value. Internally it causes termination steps to be performed.
  IDO = 4  This value is assigned by the integrator when a type FATAL or TERMINAL error condition has occurred, and error processing is set NOT to STOP for these types of errors. It is not necessary to make a final call to the integrator with IDO = 3 in this case.
Values of IDO = 5, 6, 7, 8, 9 are reserved for applications that provide problem information or linear algebra computations using reverse communication. When problem information is provided using reverse communication, the differential equations, boundary conditions and initial data must all be given. The absence of optional subroutine names in the calling sequence directs the routine to use reverse communication. In the module PDE_1D_MG_INT, scalars and arrays for evaluating results are named below. The names are preceded by the prefix s_pde_1d_mg_ or d_pde_1d_mg_, depending on the precision. We use the prefix ?_pde_1d_mg_ for the appropriate choice.
  IDO = 5  This value is assigned by the integrator, requesting data for the initial conditions. Following this evaluation the integrator is re-entered. (Optional) Update the grid of values in array locations U(NPDE+1, j), j = 2, ..., N - 1
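The IDO handshake described above can be imitated with a toy routine. The following is a minimal sketch, assuming a hypothetical forward Euler stepper named toy_step that is not part of the library; only the pattern of flag values is taken from the text (1 to start, 2 when TOUT is reached, 3 for user-requested shutdown, and a value in the reserved range, here 6, to request an evaluation of the differential equation).

```fortran
module toy_rc
  implicit none
contains
  ! Hypothetical toy integrator (forward Euler) in reverse communication.
  ! It is NOT PDE_1D_MG; only the IDO handshake is imitated.
  subroutine toy_step(ido, t, tout, u, f, h)
    integer, intent(inout) :: ido
    real, intent(inout) :: t, u
    real, intent(in) :: tout, f, h
    select case (ido)
    case (1)                     ! first entry: ask the caller for u'
       ido = 6
    case (6)                     ! caller supplied f = u'(t): take one step
       if (h >= tout - t) then
          u = u + (tout - t)*f
          t = tout
          ido = 2                ! end point reached
       else
          u = u + h*f
          t = t + h
       end if
    case (3)                     ! shutdown requested: nothing to release here
    end select
  end subroutine toy_step
end module toy_rc

program rc_demo
  use toy_rc
  implicit none
  integer :: ido
  real :: t, u, f
  ido = 1
  do
     select case (ido)
     case (1)                    ! define the problem
        t = 0.0
        u = 1.0
        f = 0.0
     case (2)                    ! end point reached: report, request shutdown
        print *, 'u(1) =', u, '  exp(-1) =', exp(-1.0)
        ido = 3
     case (3)                    ! shutdown call has been made
        exit
     case (6)                    ! evaluate the equation u' = -u
        f = -u
     end select
     call toy_step(ido, t, 1.0, u, f, 0.001)
  end do
end program rc_demo
```

The caller owns the loop and the problem data, and the integrator only sets a flag saying what it needs next. That is the property that lets a user substitute their own function evaluations or linear algebra.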
423. ue IOPT 1 PDE_1D_MG_MAX_BDF_ORDER IOPT 2 5 Update to the next output point Write solution and check for final point CASE 2 TO TOUT IF TO lt TEND THEN WRITE 7 F10 5 TOUT DO I 1 NPDE 1 a 0 ZERO D 290 Chapter 8 Partial Differential Equations IMSL Fortran 90 MP Library 4 0 All completed Defin WRITE 7 4E1 END DO 54 9 MU Lee TOUT MIN IF END IF Solver is shu CASE 3 CLOSE UNIT 7 EXIT OUT D Define initial data values CASE 5 U 1 ONE WRITE 7 25 WRITE 7 4E15 DO differential Reverse communica quations CASE 6 D_PD D_PD D_MG_C ONE ELTA_T TEND END IDO 3 t down TO 5 U T m m ral 1D_MG_R D_P Ale D_PDE_1D_MG_Q Define boundary conditions CASE 7 DE_1D_MG_DUDX H D_PDE_1D_MG_U 1 IF PDE_1D_MG_LEFT THEN U Gl TA ZERO _PDE_1D_MG_BE PDE_1 P D D ELS D_PD TA ZERO SE D_MG_BE D_PD IF ECT ETER Er zL 1 E D N SEL END CALL END DO PDE_1D_MG TO CONTAINS FUNCTION H Z real kind ld0 H tion is used for t T D_MG_GAMMA D_PDE_1D_MG_U 1 TOUT IDO U
424. uffer line by line which contains an indication of where the output originated Note that the root directs the order of results by broadcasting an integer value BATON giving the index of the node to transmit The random numbers generated at the nodes and then listed are not checked There is a final printed line indicating that the example is completed 224 e Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 use show_int use rand_int use mpi_setup_int implicit none INS iO ay saaj ova se Lev This is Parallel Example 17 Each non root node transmits the contents of an array that is the output of SHOW The root receives the characters and prints the lines from alternate nodes integer parameter n 7 BSIZE 72 2 4 integer k p q ierror status MPI_STATUS_S1IZI integer I BATON real kind le0 s_x 1 n type s_options options 7 CHARACTER LEN BSIZE BUFFER character LEN 12 PROC_NU EI IES Sie Um 1a MPas mp_nprocs mp_setup if mp_rank gt 0 then The data types printed are real kind le0 random numbers s_x rand s_x Convert node rank to CHARACTER data write proc_num 13 mp_rank Show 7 digits per number and according to the natural or declared size of the array Prepare the output lines in array BUFFER End each line with ASCII sequence CR NL options 1 show_significant_digits_is_7 op
425. uments, the output values will be real and imaginary parts with random values of the same type, kind, and rank.
Modules: Use the appropriate modules: rand_int or linear_operators.
Optional Variables, Reserved Names: This function uses rand_gen to obtain the number of values required by the argument. The values are then copied using the RESHAPE intrinsic.
Note: If any of the arrays s_rand_options(:), s_rand_options_once(:), d_rand_options(:), or d_rand_options_once(:) are allocated, they are passed as arguments to rand_gen using the keyword iopt=.
The option and derived type names are given in the following table:
  Derived Type    Name of Unallocated Array
  s_options       s_rand_options(:)
  s_options       s_rand_options_once(:)
  d_options       d_rand_options(:)
  d_options       d_rand_options_once(:)
Examples
  Compute a random digit, 1 ≤ i ≤ n:   i = rand(1e0)*n + 1
  Compute a random vector:              x = rand(x)
RANK
Compute the mathematical rank of a rank-2 or rank-3 array.
Required Arguments: The argument must be a rank-2 or rank-3 array of any intrinsic floating-point type. The output function value is an integer with a value equal to the number of singular values that are greater than a tolerance. The default value for this tolerance is $\epsilon^{1/2} s_1$, where $\epsilon$ is machine precision and $s_1$ is the largest singular value of the ma
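The rank rule stated here, counting the singular values that exceed a tolerance, can be sketched in standard Fortran 90. This is an illustration only: the singular values are made up, and the tolerance is assumed to be the square root of machine precision times the largest singular value.

```fortran
program rank_sketch
  implicit none
  real(kind(1d0)) :: s(4), tol
  integer :: k

  ! Made-up singular values in non-increasing order (illustration only).
  s = (/ 3.0d0, 1.0d-2, 1.0d-12, 0.0d0 /)

  ! Assumed default tolerance: sqrt(machine precision) * largest value.
  tol = sqrt(epsilon(s(1)))*maxval(s)

  ! Numerical rank = number of singular values above the tolerance.
  k = count(s > tol)
  print *, 'numerical rank =', k      ! prints 2 for these values
end program rank_sketch
```

With these values the tolerance is about 4.5e-8, so the two trailing values are treated as zero.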
426. up if mp_rank 0 then Generate a random matrix and right hand side A rand A b rand b Heavily weight desired constraint All variables sum to one Nme a A eS one sqrt epsilon one Dm ee one sqrt epsilon one endif Compute the least squares solution with this heavy weight tS A aaldigg lo IECheckMGhesconstisadimine if ALL abs sum x 1 dim 1 one norm x amp lt sqrt epsilon one then if mp_rank 0 amp write Parallel Example 13 is correct endif See to any error messages and exit MPI mp_nprocs mp_setup Final end IMSL Fortran 90 MP Library 4 0 Chapter 6 Operators and Generic Functions The Parallel Option 221 Parallel Example 14 Systems of least squares problems are solved but now using the SVD function A box data type is used This is an example which uses optional arguments and a generic function overloaded for parallel execution of a box data type Any number of nodes can be used use linear_operators use mpi_setup_int implicit none This is Parallel Example 14 Letor SVO Sie sand NORM integer parameter m 128 n 32 nr 4 real kind 1d0 one 1d0 err nr realikind Lau im hee Sey res oh Ley ye CRI iy Te oe VAC ol sigue gt Sh Gal ioe op iat AL sae Setup te 7e WIP ICs mp_nprocs mp_setup if mp_rank 0 then Generate a random matrix and right hand side A rand
427. ussian Elimination if Cyclic Reduction did not get an accurate solution It is an exceptional event when Gaussian Elimination is required if sum abs x_sol x_save sum abs x_save amp lt sqrt epsilon d_one exit factorization_choice iopt s_options 0 s_zero iopt nopt 1 s_options s_lin_sol_tri_use_Gauss_elim s_zero end do factorization_choice Check on accuracy of solution res x l in 1l n x_save err sum abs res sum abs x_save if err lt sqrt epsilon d_one then write Example 2 for LIN_SOL_TRI is correct end if end Example 3 Selected Eigenvectors of Tridiagonal Matrices The eigenvalues Ay n of a tridiagonal real self adjoint matrix are computed Note that the computation is performed using the IMSL MATH LIBRARY EVASB routine from the FORTRAN 77 Library This information is made available to the Fortran 90 compiler by using the FORTRAN 77 interface for EVASB The user may write this interface based on documentation of the arguments IMSL 1994 p 356 or use the module Numerical_Libraries as we have done here The eigenvectors corresponding to k lt n of the eigenvalues are required These vectors are computed using inverse iteration for all the eigenvalues at one step See Golub and Van Loan 1989 Chapter 7 The eigenvectors are then orthogonalized Also see operator_ex19 Chapter 6 use lin_sol_tri_int use rand_gen_int use Numerical_Libraries
428. ute the coefficient matrix for the least squares system do nrack 1 nr AC Sy aee Seras atn ESS a S eee a E spread q nrack 2 m 2 dim 1 delta_sqr Compute the right hand side of function values b 1 nrack exp sum p nrack 2 dim 1 enddo endif Compute the least squares solution An error message du to rank deficiency is ignored with the flags allocate d_invx_options 1 d_invx_options 1 skip_error_processing c A ix b Check the results ete VAT Ie morm Al tox be CAN eax SN i inommi A EEMoremi e amp lt sqrt epsilon one then if mp_rank 0 amp write Parallel Example 12 is correct end if Unload option type for good housekeeping deallocate d_invx_options 220 Chapter 6 Operators and Generic Functions The Parallel Option IMSL Fortran 90 MP Library 4 0 See to any error messages and quit MPI mp_nprocs mp_setup Final end Parallel Example 13 Here least squares problems are solved each with an equality constraint that the variables sum to the value one A box data type is used and the solution obtained with the ix operator Any number of nodes can be used use linear_operators use mpi_setup_int implicit none This is Parallel Example 13 for ix and NORM integer parameter m 64 n 32 nr 4 real kind le0 one le0 A m 1 n nr b m 1 1 nr x n 1 nr P SSiSUley seene WIP IES mp_nprocs mp_set
429. utines. This routine contains a call to a barrier routine so that if one process is writing the file and an alternate process is to read it, the results will be synchronized. All processors in the BLACS context call the routine.
Required Arguments
  File_Name (Input) A character variable naming the file containing the matrix data. This file is opened with STATUS="OLD". If the name is misspelled or the file does not exist, or any access violation happens, a type = terminal error message will occur. After the contents are read, the file is closed. This file is read with a loop logically equivalent to groups of reads:
      READ() ((BUFFER(I,J), I=1,M), J=1,NB)
  or (optionally):
      READ() ((BUFFER(I,J), J=1,N), I=1,MB)
  DESC_A(*) (Input) The nine integer parameters associated with the ScaLAPACK matrix descriptor. Values for NB, MB, LDA are contained in this array.
  A(LDA,*) (Output) This is an assumed-size array, with leading dimension LDA, that will contain this processor's piece of the block-cyclic matrix. The data type for A(*,*) is any of five Fortran intrinsic types: integer, single precision real, double precision real, single precision complex, and double precision complex.
Optional Arguments
  Format (Input) A character variable containing a format to be used for reading the file containing matrix data. If th
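The "groups of reads" can be illustrated on a single process with standard Fortran 90 I/O. This is a minimal sketch under stated assumptions: the file name matrix_sketch.tmp, the sizes, and the list-directed format are all chosen for illustration, and nothing here involves the BLACS or the library routine itself.

```fortran
program read_blocks
  implicit none
  integer, parameter :: m = 3, n = 4, nb = 2, unit_no = 11
  real :: a(m,n), b(m,n), buffer(m,nb)
  integer :: i, j, j0

  ! Made-up matrix entries.
  do j = 1, n
     do i = 1, m
        a(i,j) = 10*i + j
     end do
  end do

  ! Write one value per record, column by column.
  open (unit=unit_no, file='matrix_sketch.tmp', status='replace')
  do j = 1, n
     do i = 1, m
        write (unit_no, *) a(i,j)
     end do
  end do
  close (unit_no)

  ! Read the file back in groups of NB columns, as in the first READ form.
  open (unit=unit_no, file='matrix_sketch.tmp', status='old')
  do j0 = 1, n, nb
     read (unit_no, *) ((buffer(i,j), i = 1, m), j = 1, nb)
     b(:, j0:j0+nb-1) = buffer
  end do
  close (unit_no, status='delete')

  ! The reassembled matrix should match what was written.
  print *, all(a == b)
end program read_blocks
```

Writing one value per record keeps each list-directed READ aligned with a whole group, because a READ statement always starts on a fresh record.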
430. ution s wset 1 for k 0 npde 1 do begin umin min u k umax max u k for i 0 nt 1 do begin title strcompress U_ string k 1 remove_all at time string times i plot g i u i k ystyle 1 title title xtitle x ytitle strcompress U_ string k 1 remove_all xr xl xr yr umin umax psym 4 wait twait end end end
Example 1 - Electrodynamics Model
This example is from Blom and Zegeling (1994). The system is
  $u_t = \epsilon p u_{xx} - g(u - v)$
  $v_t = p v_{xx} + g(u - v)$,
where $g(z) = \exp(\eta z/3) - \exp(-2\eta z/3)$, $0 \le x \le 1$, $0 \le t \le 4$;
  $u_x = 0$ and $v = 0$ at $x = 0$; $u = 1$ and $v_x = 0$ at $x = 1$;
  $\epsilon = 0.143$, $p = 0.1743$, $\eta = 17.19$.
We make the connection between the model problem statement and the example:
  $C = I_2$, $m = 0$, $R_1 = \epsilon p u_x$, $R_2 = p v_x$, $Q_1 = g(u - v)$, $Q_2 = -Q_1$;
  $u = 1$ and $v = 0$ at $t = 0$.
The boundary conditions are
  $\beta_1 = 1$, $\beta_2 = 0$, $\gamma_1 = 0$, $\gamma_2 = v$, at $x = x_L = 0$;
  $\beta_1 = 0$, $\beta_2 = 1$, $\gamma_1 = u - 1$, $\gamma_2 = 0$, at $x = x_R = 1$.
Rationale: This is a non-linear problem with sharply changing conditions near t = 0. The default settings of integration parameters allow the problem to be solved. The use of PDE_1D_MG with forward communication requires three subroutines provided by the user to describe the initial conditions, differential equations, and boundary conditions.
      program PDE_EX1
      ! Electrodynamics Model:
      USE PDE_1d_mg_int
      IM
431. values This rank 1 array function returns an array result given an array of input Use the optional argument for the covariance matrix when the square root of the variance function is required The result will be a scalar value when the input variable is scalar Required Arguments derivative derivative Input The index of the derivative evaluated Use non negative integer values For the function itself use the value 0 variables variables Input The independent variable values where the spline or its derivatives are evaluated Either a rank 1 array or a scalar can be used as this argument knots knots Input The derived type _spline_knots defined as the array COEFFS was obtained with the function SPLINE_FITTING This contains the polynomial spline degree and the number of knots and the knots themselves for this spline function coeffs c Input The coefficients in the representation for the spline function N f x eB x j l These result from the fitting process or array assignment C SPLINE_FITTING defined below The value N size C satisfies the identity N 1 spline_degree size _knots where the two right most quantities refer to components of the argument knots Optional Arguments covariance G Input This argument when present results in the evaluation of the square root of the variance function e x o x s Gb x where b x B x By x and G is the
432. waps places with iopt(IO+1)%idummy.
iopt(IO) = ?_options(?_rand_gen_use_Fushimi_start, ?_dummy)
  Starts the GFSR sequence as suggested by Fushimi (1990). The default starting sequence is with the LCM recurrence described below.
Description
This GFSR algorithm is based on the recurrence
  $x_t = x_{t-3p} \oplus x_{t-3q}$,
where $a \oplus b$ is the exclusive OR operation on two integers a and b. This operation is performed until size(x) numbers have been generated. The subscripts in the recurrence formula are computed modulo 3p. These numbers are converted to floating point by effectively multiplying the positive integer quantity $x_t \cup 1$ by a scale factor slightly smaller than 1./(huge(1)). The values p = 521 and q = 32 yield a sequence with a period approximately $2^p > 10^{156.8}$.
The default initial values for the sequence of integers $\{x_t\}$ are created by a congruential generator starting with an odd integer seed, m, formed from v and count, where count is obtained by the Fortran 90 real-time clock routine:
      CALL SYSTEM_CLOCK(COUNT=count, CLOCK_RATE=CLRATE)
An error condition is noted if the value of CLRATE = 0. This indicates that the processor does not have a functioning real-time clock. In this exceptional case a starting seed must be provided by the user with the optional argument iopt= and option number ?_rand_generator_seed. The value v is the current clock for this day, in milliseconds.
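The recurrence and the modulo-3p subscripts can be sketched with a circular table in standard Fortran 90. Everything beyond the recurrence itself is an assumption made for illustration: the fixed seed, the Park-Miller congruential generator used to fill the table, and the particular scale factor are not the library's choices.

```fortran
program gfsr_sketch
  implicit none
  integer, parameter :: p = 521, q = 32     ! values quoted in the text
  integer, parameter :: lag = 3*p           ! subscripts are taken modulo 3p
  integer, parameter :: nout = 5
  integer :: state(0:lag-1), seed, i, t, xt
  real :: r

  ! Assumed seeding: a Park-Miller congruential generator in Schrage's
  ! form (no integer overflow), started from an arbitrary fixed seed.
  seed = 12345
  do i = 0, lag-1
     seed = 16807*mod(seed, 127773) - 2836*(seed/127773)
     if (seed <= 0) seed = seed + 2147483647
     state(i) = seed
  end do

  t = 0                                     ! slot holding x(t-3p)
  do i = 1, nout
     ! x(t) = x(t-3p) XOR x(t-3q), with indices wrapped modulo 3p.
     xt = ieor(state(t), state(mod(t + lag - 3*q, lag)))
     state(t) = xt
     t = mod(t + 1, lag)
     ! OR with 1 keeps the integer positive; the scale factor is an
     ! assumed value slightly below 1/huge(1).
     r = real(ior(xt, 1))*(0.9999999/real(huge(1)))
     print *, r
  end do
end program gfsr_sketch
```

Each new value overwrites the oldest entry of the table, so only 3p integers of state are needed no matter how many numbers are drawn.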
433. x and complex double precision versions of the code. When dealing with a complex matrix, all references to the transpose of a matrix, $A^T$, are replaced by the adjoint matrix $\bar{A}^T \equiv A^* = A^H$, where the overstrike denotes complex conjugation. IMSL Fortran 90 MP Library linear algebra software uses this convention to conserve the utility of generic documentation for that code subject. References to orthogonal matrices are replaced by their complex counterparts, unitary matrices. Thus, an n × n orthogonal matrix Q satisfies the condition $Q^TQ = I_n$. An n × n unitary matrix V satisfies the analogous condition for complex matrices, $V^*V = I_n$.
Using Operators and Generic Functions
For users who are primarily interested in easy-to-use software for numerical linear algebra, see Chapter 6, "Operators and Generic Functions - The Parallel Option." This compact notation for writing Fortran 90 programs, when it applies, results in code that is easier to read and maintain than traditional subprogram usage. Note that all of the examples in Chapters 1 and 2 have been rewritten using operators and generic functions, whenever appropriate. These examples are renamed as shown in Chapter 6, Table A, "Examples and Corresponding Operators." Less code is typically needed to compute equivalent results. Users may begin their code development using operators and generic functions. If a shorter executable code is required, a user may need
434. x with random data. Compare values with magnitudes of singular values.
  Compute complete eigenexpansions of a self-adjoint matrix with random data.
  Compute eigenvalues of self-adjoint matrix. Compute some eigenvectors using inverse iteration and a symmetric solver. Uses random data.
  Compute solution of a self-adjoint generalized problem by reduction to an ordinary self-adjoint problem.
  Compute the eigenexpansion of a real matrix with random data.
  Compute the roots of a complex polynomial equation with random coefficients.
  Solve linear systems with a scalar diagonal parameter with random data.
  Compute condition numbers of eigenvalues to estimate their accuracy with random data.
  Compute the generalized eigenvalues of a matrix pencil with random data.
  Compute the eigenexpansion of a self-adjoint matrix pencil with random data. Uses options.
  Test for solvability of a DAE system with random data.
  Compute eigenexpansion of a matrix pencil where the second matrix may be singular. Uses random data.
fast_dft_ex1, fast_dft_ex2, fast_dft_ex3, fast_dft_ex4, fast_2dft_ex1, fast_2dft_ex2, fast_2dft_ex3, fast_3dft_ex1, rand_gen_ex1, rand_gen_ex2, rand_gen_ex3, rand_gen_ex4, sort_real_ex1, sort_real_ex2, nan_ex1, show_ex1, show_ex2, spline_fitting_ex1, spline_fitting_ex2
  Compute FFT of a complex vector. Transform forward, then backwards. U
435. x21, Chapter 6.
      use lin_svd_int
      use rand_gen_int
      implicit none
      ! This is Example 1 for LIN_SVD.
      integer, parameter :: n=32
      real(kind(1d0)), parameter :: one=1d0
      real(kind(1d0)) err
      real(kind(1d0)), dimension(n,n) :: A, U, V, S(n), y(n*n)
      ! Generate a random n by n matrix.
      call rand_gen(y)
      A = reshape(y, (/n,n/))
      ! Compute the singular value decomposition.
      call lin_svd(A, S, U, V)
      ! Check for small residuals of the expression A*V - U*S.
      err = sum(abs(matmul(A,V) - U*spread(S,dim=1,ncopies=n))) &
            / sum(abs(S))
      if (err <= sqrt(epsilon(one))) then
         write (*,*) 'Example 1 for LIN_SVD is correct.'
      end if
      end
Optional Arguments
  MROWS = m (Input) Uses array A(1:m, 1:n) for the input matrix. Default: m = size(A, 1)
  NCOLS = n (Input) Uses array A(1:m, 1:n) for the input matrix. Default: n = size(A, 2)
  RANK = k (Output) Number of singular values that exceed the value Small. RANK will satisfy k ≤ min(m, n).
  iopt = iopt(:) (Input) Derived type array with the same precision as the input matrix. Used for passing optional data to the routine. The options are as follows:
  Packaged Options for lin_svd
  s_, d_, c_, z_   lin_svd_set_small          1
  s_, d_, c_, z_   lin_svd_overwrite_input    2
  s_, d_, c_, z_   lin_svd_scan_for_NaN       3
  s_, d_, c_, z_   lin_svd_use_qr             4
  s_, d_, c_, z_   lin_svd_skip_orth          5
  s_, d_, c_, z_   li
436. y typed. This is done by including the line IMPLICIT NONE as close to the first line as possible. Study some of the examples accompanying an IMSL Fortran 90 MP Library routine early on. These examples are available online as part of the product. Each subject routine called or otherwise referenced requires the use statement for an interface block designed for that subject routine. The contents of this interface block are the interfaces to the separate routines for that subject and the packaged descriptive names for option numbers that modify documented optional data or internal parameters. Although this seems like an additional complication, many typographical errors are avoided at an early stage in development. The use statement is required for each routine called. As illustrated in Examples 3 and 4 in routine lin_geig_gen, the use statement is required for defining the secondary option flags. The function subprogram for s_NaN() or d_NaN() does not require an interface block because it has only a required dummy argument.
Error Processing and the Testing Suite
A design principle of the IMSL Fortran 90 MP Library subroutines is that error messages are, by default, printed in the routines. Information to print the error messages can be returned to the calling program unit. No printing in the routine itself needs to occur. This happens when the argument epack= is include
437. ze m × nb containing the right-hand side matrix. When using the option to solve adjoint systems $A^Tx \cong b$, the size of b is n × nb.
  x (Output) Array of size n × nb containing the solution matrix. When using the option to solve adjoint systems $A^Tx \cong b$, the size of x is m × nb.
Example 1: Solving a Linear Least-squares System
This example solves a linear least-squares system $Cx \cong d$, where $C_{m \times n}$ is a real matrix with m > n. The least-squares problem is derived from polynomial data fitting to the function $y(x) = e^x + \cos(\pi x/2)$ using a discrete set of values in the interval $-1 \le x \le 1$. The polynomial is represented as the series $u(x) = \sum_{i=0}^{n} c_i T_i(x)$, where the $T_i(x)$ are Chebyshev polynomials. It is natural for the problem matrix and solution to have a column or entry corresponding to the subscript zero, which is used in this code. Also see operator_ex09, Chapter 6.
      use lin_sol_lsq_int
      use rand_gen_int
      use error_option_packet
      implicit none
      ! This is Example 1 for LIN_SOL_LSQ.
      integer i
      integer, parameter :: m=128, n=8
      real(kind(1d0)), parameter :: one=1d0, zero=0d0
      real(kind(1d0)) A(m,0:n), c(0:n,1), pi_over_2, x(m), y(m,1), &
           u(m), v(m), w(m), delta_x
      ! Generate a random grid of points.
      call rand_gen(x)
      ! Transform points to the interval -1,1.
      x = x*2 - one
      ! Compute the constant 'PI/2'.
      pi_over_2 = atan(one)*2
      ! Generate known function data on the
