up.time 5 User Guide - Documentation Portal

1. [Image: task icons, including Group Service Monitors and Add Action Profiles] Click any of the icons in the image to perform a task. For example, click the Add Service Monitors to a system icon to configure a new service monitor.

Using Service Monitors

There are three main types of service monitors:
• Agent Monitors. For more information about Agent Monitors, see Using Agent Monitors.
• Agentless Monitors. For more information about Agentless Monitors, see Using Agentless Monitors.
• Custom Monitors. For more information about custom monitors, see Using Advanced Monitors.

Using Agent Monitors

To use agent monitors, up.time requires:
• an up.time agent to be installed and running on the system on which the service that you want to monitor is running
• the service about which you want to collect information to be installed and running on the system that you intend to monitor

Agents enable you to collect very detailed data about a system, such as information about processes and low-level system statistics. The level of granularity of the information collected by agents is greater than that of the information collected by agentless monitors.

The monitors that require an agent are:
• Exchange
• File System Capacity
• IIS
• Performance Check
• Process Count Check
• SQL Server Adv
2. (c3p0 library) Sets the amount of time a connection can be idle before it is closed. This parameter should only be modified with the assistance of uptime software Customer Support.
• connectionPoolNumHelperThreads (c3p0 library): Sets the number of helper threads that can improve the performance of slow JDBC operations. This parameter should only be modified with the assistance of uptime software Customer Support.

Changing the DataStore Database

The up.time DataStore is first linked to a database during the installation process, and contains important historical performance data that has since been collected. Linking the DataStore to a new database will result in lost data unless you properly migrate your data to the new database. As such, changing the DataStore's database should be done only after some consideration and planning.

In cases where you would like to migrate the database (e.g., from the default up.time MySQL implementation to Oracle) or move the DataStore to a different system from the Monitoring Station, you will modify the aforementioned database values in the uptime.conf file. Note that the modification of these values is one of a series of steps. Refer to the Knowledge Base for more information on migrating your DataStore.

Monitoring Station Web Server

Monitoring Stations include a Web server component that drives the us
3. Understanding Dates and Times
Understanding Retained Data

Understanding the up.time Interface

The up.time Web interface consists of seven main sections. The following image displays the up.time application screen. The panels change according to the task area that is selected from the tool bar.

[Image: up.time application screen with callouts for the Tree Panel, Tool Bar, Panel, Subpanel, Assistance, and Search areas. The example shows the Reports task area with a Resource Usage report form: Date and Time Range, Report Options, and List of Groups.]
4. Oracle Advanced Metrics
Oracle Basic Checks
Oracle Tablespace Check
SQL Server Basic Checks
SQL Server Advanced Metrics
SQL Server Tablespace Check

MySQL Advanced Metrics

The MySQL Advanced Metrics monitor checks the performance of MySQL databases and instances that are running on a system against the thresholds that you define. If MySQL is not responding as expected, the database may still process queries, but the results will show behavior that alerts you to a problem.

The MySQL Advanced Metrics monitor can:
• determine whether or not a MySQL instance is running on your system
• check whether or not MySQL is listening on a specific port
• check performance values to determine the efficiency of a MySQL instance

Configuring MySQL Advanced Metrics Monitors

To configure MySQL Advanced Metrics monitors, do the following:
   1. Complete the monitor information fields. To learn about monitor information fields, see Monitor Identification on page 141.
   2. Complete the following settings by entering the appropriate
5. 1. In the Published Reports window, click the Search button. The Search Options appear in the window.
   2. Select one of the following options from the Search Column dropdown list: Year, Month, Name, Date, User.
   3. Specify the criteria for the search, and then click the Search button to view the results on the Report Library page.

Scheduling Reports

If you need to run a report at a particular interval (for example, daily or weekly), you can schedule when the report should be generated. up.time generates the report and emails it to a user or group of users.

For example, you generate a File System Capacity Growth Report, which charts the amount of disk usage for a system. However, the system for which you are generating the report schedules backups from midnight to 4:00 a.m. Due to the gap caused by the backup, the CPU usage and disk activity statistics are not indicative of the overall system load. You can specify that the report does not cover the periods of time over which the backups occur.

To schedule reports, do the following:
   1. In the Reports subpanel, select the Email option in the Save Report section of the subpanel, and then select one of the following options:
      • User
      • Group
      • E-mail Address
   2. Type a name for the report in the Save to My Portal As field.
   3. Optionally, type a description for the report in the Report Description field.
   4. Click
6. 3. Select one of the monitors that is listed in the window, and then click Continue. See The Monitor Template on page 141 for information on completing the configuration of a custom monitor.

The Monitor Template

You use a general template to configure monitors. While the specific configuration information varies from monitor to monitor, every template contains areas for:
• Monitor Identification
• Monitor Settings Configuration
• Monitor Timing Settings
• Monitor Alert Settings
• Alert Profiles
• Action Profiles

Monitor Identification

Each service monitor template has a monitor identification information area that you use to:
• specify the name of the monitor
• include an optional description of the monitor
• select the system, node, or virtual node that you want up.time to monitor

The monitor identification information area is illustrated below.

[Image: monitor identification area, showing the Service Name and Description fields]

You must ensure that the system can be resolved by a naming service running on an operating system (for example, DNS or NIS/YP).

Adding Monitor Identification Information

To add monitor identification information, do the following:
   1. Enter a name for the monitor in the Service Name field. The name can for exampl
7. 5. Click Generate Graph. If there is no data to graph, the message "No Data found for the given time range" appears in the graph window.

Workload Graphs

The three workload graphs determine the demand that network and local services are putting on a system. The graphs chart an aggregate amount of performance information for a given user, group, or process. You can generate the following workload graphs:
• Workload User: The demand that network and local services are putting on the system, based on the IDs of the users who are logged into a system.
• Workload Group: The demand that network and local services are putting on the system, based on the IDs of the user groups that are logged into a system.
• Workload Process Name: The demand that network and local services are putting on a system, based on the processes that are running.

These graphs use the same input criteria, but they return different data. For information on how to generate these graphs, see Generating a Workload Graph on page 506.

Each workload graph captures the following metrics:
• CPU: The percentage of CPU time that is taken up by a user, group, or process.
• Memory Size: The amount of the page file and virtual memory that is taken up by a user, group, or process. On Windows systems, Memory Size is called Virtual Bytes.
• RSS: The Run Set Size, which is the amount of physical m
8. ESX Advanced Metrics

The ESX Advanced Metrics monitor offers greater visibility into your ESX environment by expanding on the high-level usage metrics for a virtual machine's CPU, memory, and disk activity.

Configuring ESX Advanced Metrics Monitors

To configure an ESX Advanced Metrics monitor, do the following:
   1. Complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141.
   2. Complete the following fields:
      • Percent Wait (Guest metric): The percentage of time that a virtual CPU is not runnable. A non-running CPU could be idle, halted, or waiting for an external event such as I/O.
      • Memory Balloon Avg (Guest metric): The average amount of memory, in KB, held by memory control for ballooning.
      • Memory Balloon Target (Guest metric): The total amount of memory, in KB, that can be used by memory control for ballooning.
      • Memory Overhead Avg (Guest metric): The average amount of additional host memory, in KB, allocated to the virtual machine.
      • Memory Swap In Avg (Guest metric): The average amount of memory, in KB, that was swapped in.
      • Memory Swap Out Avg (Guest metric): The average amount of memory, in KB, that was swapped out.
      • Memory Zero Avg Gues
9. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22.
   3. Select one or more of the following report options:
      • Service Status: The status of each service that has been assigned to the selected system or systems. The statuses are OK, WARN, CRIT, MAINT, and UNKNOWN.
      • Network I/O: The average amount of traffic, measured in megabytes per second, that is travelling through the network interfaces. The report also identifies bursts in network activity that may occur over short intervals. This information appears as a graph in the report.
      • Free Memory: The amount of free memory available to the system. This information appears as a graph in the report.
      • File System Capacity: The amount of free disk space on the system. This information appears as a graph in the report.
      • Workload Top 10 RSS: The top 10 processes that are consuming physical memory, in KB, as measured by the run set size (RSS) of the process. This information appears as a graph in the report. This graph does not appear when you generate a report for a VMware ESX system.
      • Resource Utilization: The average and maximum amount of CPU and memory use.
      • Network Errors: Any errors that have occurred with the physical network interface. The errors can be, for example, collisio
10. The Wait I/O report contains the following information:
• the names of the hosts for which the report has been generated
• the average, maximum, and minimum wait I/O times, expressed as percentages

Creating a Wait I/O Report

To create a Wait I/O report, do the following:
   1. In the Reports Tree panel, click Wait I/O.
   2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22.
   3. If you want the report to only include data from certain hours during the day, select those hours from the dropdown lists in the Daily Hours section, as shown below.

      [Image: Daily Hours section, "Include data samples between these hours only", with Start and End dropdown lists]

      For example, if you want the report to cover the hours from 1:00 a.m. to 1:00 p.m., select 1:00 from the Start dropdown list and 13:00 from the End dropdown list.
   4. Optionally, enter a value in the Highlight average WIO over threshold field. Any system with an average Wait I/O percentage that exceeds the value that you enter in this field will be highlighted in red in the report. As well, the following text appears in the header of the report:
      Systems with an Average Wait I/O over x.x are highlighted
      where x.x is the percentage that you entered in this field.
   5. If you want to generate reports for systems in specific groups, select the groups from the List of Groups area.
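The Daily Hours filter and the highlight threshold described in this excerpt can be illustrated with a small sketch. This is only an illustration under stated assumptions, not how up.time implements the report: the `Sample` type, the function names, and the choice to treat the End hour as exclusive are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Sample:
    hour: int           # hour of day (0-23) when the sample was collected
    wio_percent: float  # Wait I/O percentage recorded for that sample


def average_wio(samples, start_hour, end_hour):
    """Average Wait I/O for samples with start_hour <= hour < end_hour.

    Treating the End hour as exclusive is an assumption of this sketch.
    Returns None when no sample falls inside the window.
    """
    selected = [s.wio_percent for s in samples if start_hour <= s.hour < end_hour]
    return mean(selected) if selected else None


def highlighted_hosts(samples_by_host, start_hour, end_hour, threshold):
    """Return the hosts whose average Wait I/O exceeds the threshold, sorted by name."""
    flagged = []
    for host, samples in samples_by_host.items():
        avg = average_wio(samples, start_hour, end_hour)
        if avg is not None and avg > threshold:
            flagged.append(host)
    return sorted(flagged)
```

With a Start of 1 and an End of 13, a sample taken at 20:00 is ignored, so a host is highlighted only if its average within the selected hours exceeds the value entered in the threshold field.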
11. When viewing a Resource Scan for a system, you can navigate to other groups by selecting the name of the group from the Current Location dropdown list at the top of the Resource Scan panel, as shown below.

Viewing Scrutinizer Status

Scrutinizer is a NetFlow analyzer that takes advantage of communications standards for Cisco IOS networking devices, as well as other compatible switches and routers, to retrieve and store network traffic information for users, systems, and applications. It allows administrators to monitor, graph, and report on network usage patterns, and locate the heaviest traffic creators.

Scrutinizer can be integrated with up.time. Doing so allows you to add node-type Elements that are exporting NetFlow data to Scrutinizer, as well as call a Scrutinizer instance from a commonly monitored Element's status page (whether the Element is a NetFlow-exporting node or a non-node Element). You can also access all of Scrutinizer's features, such as the MyView status panel, from within Global Scan by clicking the NetFlow tab.

[Image: Global Scan with the NetFlow tab selected, showing Scrutinizer traffic graphs for a LAN interface]
12. all of which return a critical error, and then sends an alert after the third recheck. up.time then checks the host every two hours. While up.time encounters two critical errors, it does not send an alert. Then, the status of the host changes from critical to warning. When this change is detected, up.time sends an alert informing recipients of the change in status. When the status of the host changes to OK, up.time issues an alert informing recipients that the host has recovered.

This alert flow is illustrated in the following diagram.

[Diagram: alert flow timeline. Callouts: standard 120 min host check interval detecting outages; the rechecks detect an outage and an alert is sent at 00:33; a state change occurs and up.time issues an alert; up.time continues to check the service every check interval period; a recovery message is sent.]

Alert Profiles

Alert Profiles are templates that tell up.time how to react to various alerts that are generated by service checks. Alert Profiles enable up.time to execute a series of actions in response to the failure of a service check or when a threshold is exceeded. The following diagram illustrates how an Alert Profile works.

Alert Profile, Action Profile, Serv
13. • configuration and system information for the hosts that you are monitoring
• the performance data gathered by monitors, which is used for generating graphs and reports
• user information, including user names and passwords (encrypted if it is sensitive information)
• the settings for service monitors, Alert and Action Profiles, scheduled maintenance, and host checks
• reports that Monitoring Station users have saved and are scheduled to run at specific intervals

Like any other database, the DataStore consists of a number of tables. Data that you enter and save, or which up.time collects from hosts, is written to specific tables in the DataStore.

Access to the DataStore is determined by one of the three installed user accounts: root, uptime, and reports. Each account gives users varying levels of access to the contents of the DataStore. For more information about these accounts, see the uptime software Knowledge Base article Securing MySQL Database and Adding Users.

up.time can also use either an Oracle or MS SQL Server database as its DataStore. If you plan to use either of these databases, refer to our Knowledge Base for the additional steps required to enable up.time to work with these databases.

Connecting to the DataStore Using ODBC

You can extract data from the DataStore for use in custom reporting or data warehousing by connecting to the DataStore using an ODBC connection. Once the connection is established, you can import th
14. • dbDriver: The database driver that is used to connect the Monitoring Station to the DataStore. By default, up.time uses a JDBC (Java Database Connectivity) driver. The supported drivers are:
   • com.mysql.jdbc.Driver (for MySQL)
   • net.sourceforge.jtds.jdbc.Driver (for SQL Server)
   • oracle.jdbc.OracleDriver (for Oracle)
  You can also use an ODBC driver, which enables you to connect to the DataStore with tools like MySQL Query Browser, Microsoft Excel, and Crystal Reports. For detailed information on installing and configuring the MySQL ODBC driver, see the uptime software Knowledge Base article Connecting to the up.time DataStore via ODBC.
• dbType: The type of database that is being used to store data from up.time. The default is mysql. You can also specify mssql and oracle.
• dbHostname: The name of the system on which the database is running. The default is localhost.
• dbPort: The port on which the database is listening. The default is 3308.
• dbName: The name of the database. The default is uptime.
• dbUsername: The name of the default database user, which is uptime.
• dbPassword: The password for the default database user, which is uptime.
• connectionPoolMaximum: The maximum number of connections that are allowed to the DataStore. Setting this option to a lower number will help increase the performance of up.time.
• connectionPoolMaxIdleTime
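As a rough illustration of how these settings relate to one another, the sketch below reads database settings from text and fills in the defaults listed in this excerpt. It assumes, without confirmation from the guide, that uptime.conf stores one `key=value` pair per line with `#` comments; the `parse_settings` helper is hypothetical and is not part of up.time.

```python
# Defaults taken from the settings described in the excerpt above.
DEFAULTS = {
    "dbType": "mysql",
    "dbHostname": "localhost",
    "dbPort": "3308",
    "dbName": "uptime",
    "dbUsername": "uptime",
}

# JDBC driver class for each supported dbType value.
DRIVERS = {
    "mysql": "com.mysql.jdbc.Driver",
    "mssql": "net.sourceforge.jtds.jdbc.Driver",
    "oracle": "oracle.jdbc.OracleDriver",
}


def parse_settings(text):
    """Parse key=value lines (an assumed file format) and apply the defaults."""
    settings = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blank lines, comments, and malformed lines
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    if "dbDriver" not in settings:
        db_type = settings["dbType"]
        if db_type not in DRIVERS:
            raise ValueError(f"unsupported dbType: {db_type}")
        settings["dbDriver"] = DRIVERS[db_type]
    return settings
```

For instance, text containing only `dbType=oracle` would yield the Oracle driver class while leaving the remaining values at their documented defaults, which is a reminder that the port and credentials need to be changed together with the database type.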
15. • # of CPUs
• CPU Speed
• Maximum CPU
• Minimum CPU
• Average Memory
• Maximum Memory
• Minimum Memory
• Average Page Scan
• Maximum Page Scan
• Minimum Page Scan

Select Ascending or Descending from the Sort Direction dropdown list.

Optionally, in the Minimum sort value for inclusion field, enter a value for the sort threshold. The report displays items from the Sort By list whose value is equal to or greater than the value in this field. For example, if you chose # of CPUs from the Sort By list and set this field to 2, the report only displays systems with two or more CPUs.

Select one or more of the following CPU statistics at which the report will look:
• sys: The percentage of CPU time that is being used to carry out system processes.
• usr: The percentage of CPU time that is being used to carry out user processes.
• wio: The percentage of CPU time that could be handling processes, but which is waiting for I/O operations to complete.

   7. Select one or more of the following statistics on which to report:
      • CPU: The percentage of CPU resources that are being used.
      • Memory: The percentage of system memory that is being used.
      • Page Scans: The number of page scans per second.
      The statistic you select must match the sort criteria that you selected in step 4. For example, if your sort criteria is Average CPU, you must also s
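The combined effect of the Sort By field, the Sort Direction, and the Minimum sort value for inclusion can be sketched as a filter followed by a sort. The function and the dictionary keys below are hypothetical names chosen for illustration; this is not up.time code.

```python
def sort_and_filter(systems, sort_by, descending=False, minimum=None):
    """Keep systems whose sort_by value is >= minimum, then sort by that value.

    systems: list of dicts, one per system (illustrative data shape).
    minimum: the "minimum sort value for inclusion"; None disables the filter.
    """
    rows = [s for s in systems if minimum is None or s[sort_by] >= minimum]
    return sorted(rows, key=lambda s: s[sort_by], reverse=descending)
```

Sorting on a CPU-count key with a minimum of 2 drops single-CPU systems before ordering the rest, which matches the "two or more CPUs" example in the text.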
16. • Checks whether or not the system has already been added to up.time. If the system has been added, then the button to add the system is disabled.
• Performs an agent check by scanning systems to determine whether or not agents are installed on them.
• Performs a WMI check by checking whether systems are using WMI to gather metrics (optional).
• Performs an SNMP probe to find any systems that use Net-SNMP (optional).

Systems that are repeatedly discovered through additional checks (e.g., both an agent and WMI implementation are detected on the same system) will, by default, be assigned a type based on the first check that resulted in its discovery. The auto discovery order is as follows: agent check, WMI check, SNMP probe, node discovery.

Once a list of systems in the range of IP addresses that you specified is generated, you can selectively add them to up.time. See for more information.

You can also use the Auto Discovery feature to add VMware ESX systems that are being monitored by Virtual Infrastructure 3 or vSphere 4, or pSeries systems that are managed by a Hardware Management Console (HMC). For more information, see the following sections:
• Using Auto Discovery to Add ESX Systems on page 76
• Using Auto Discovery to Add pSeries Servers Managed by an HMC on page 77

Using Auto Discovery

To use Auto Discovery, do the following:
   1. In the My Infrastructur
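The rule that a system found by several checks takes the type of the first successful check in the discovery order can be expressed as a simple priority lookup. The short check names and the `assign_type` function are illustrative assumptions, not up.time identifiers.

```python
# Discovery order from the text: agent check, WMI check, SNMP probe, node discovery.
DISCOVERY_ORDER = ("agent", "wmi", "snmp", "node")


def assign_type(detected):
    """Return the first check in discovery order that detected the system.

    detected: a set of check names that succeeded for one system.
    Returns None if no check found the system.
    """
    for check in DISCOVERY_ORDER:
        if check in detected:
            return check
    return None
```

A system on which both an agent and WMI are detected is therefore typed as an agent system, because the agent check runs first.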
17. on page 357

Creating an SLA Summary Report

To create an SLA Summary Report:
   1. In the Reports Tree panel, click SLA Summary.
   • In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22.
   • Select a Compliance Period to report on.
   • Clear the Display Outage Tables checkbox if you want the report to display only outage graphs.
   • If you want to generate reports for one or more groups that include SLAs, select the groups from the List of Groups area.
   • To generate reports for one or more views that contain SLAs, select the views from the List of Views area. See Working with Views on page 108 for more information about views.
   • If you are generating reports for specific Service Level Agreements, select them from the List of SLAs.
   • Select a report generation option. See Report Generation Options on page 402 for details.
   • To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information.

SLA Detailed Report

In cases where an SLA compliance target is not being met, the SLA Detailed report breaks down both the outages of an SLA's component SLOs and the outages of each SLO's component
18. In the FTP monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141.

Complete the following fields:
• Port: The number of the port on which the FTP server is listening. The default is 21.
• Server Response: Enter the Warning and Critical time thresholds required to receive a ready response from the FTP server. A server ready response can look like the following:
  220 filter FTP server (Version wu-2.6.2(1) Mon Dec 3 15:29:55 EST 2005) ready.
  For more information, see Configuring Warning and Critical Thresholds on page 144.
• Response Time: Enter the Warning and Critical Response Time thresholds for the length of time that the service check takes to complete. For more information, see Configuring Warning and Critical Thresholds on page 144.

Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.

Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146
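To make the Server Response check more concrete, the sketch below tests whether a greeting line is a ready response and compares an elapsed time with Warning and Critical thresholds. Reply code 220 is the standard FTP "service ready" code; the function names and the "at or above the threshold" comparison are assumptions of this sketch, since the exact threshold semantics are defined elsewhere in the guide.

```python
def is_ready_response(line):
    """True if the server greeting starts with the FTP 220 (service ready) reply code."""
    return line.startswith("220")


def classify(elapsed_seconds, warning, critical):
    """Map an elapsed time to a status, assuming thresholds trigger at or above their value."""
    if elapsed_seconds >= critical:
        return "CRIT"
    if elapsed_seconds >= warning:
        return "WARN"
    return "OK"
```

In this simplified model, a monitor would time how long the server takes to send its greeting, confirm that the greeting is a 220 response, and then report the status returned by `classify`.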
19. Enabling the DataStore for User Authentication

To use the up.time DataStore to store passwords for user authentication, do the following:
   1. On the up.time tool bar, click Config.
   2. In the Tree panel, click User Authentication.
   3. Click Edit Configuration.
   4. Select Database as the authentication method.
   5. Click Save.

CHAPTER 16
Working with Service Level Agreements

This chapter explains how to configure up.time to monitor for compliance with Service Level Agreements (SLAs) in the following sections:
• Overview
• SLAs, Service Monitors, and SLOs
• Viewing Service Level Agreements
• SLA Compliance Calculation
• SLA Creation Strategies
• Working with SLA Reports
• Adding and Editing SLA Definitions

Overview

In up.time, a service level agreement (SLA) measures your
20. To learn how to configure monitor information fields, see Monitor Identification on page 141.
   2. Complete the following fields:
      • Port: The number of the port on which the LDAP server is listening. The default is 389.
      • Password: The password that is required to log in to the LDAP server.
      • Base: The location in the LDAP directory from which you want the monitor to begin searching for information. The following diagram shows a simple LDAP directory structure:

        dc=com
          dc=uptime
            ou=customers (Europe, America)
            ou=employees (Europe, America)
            ou=suppliers (Europe, America)

        Using this directory structure, you can check your LDAP structure for your European employees by selecting the following as your base:
        dc=ldap,dc=uptime,ou=employees,ou=Europe
      • Bind: The Bind string, which associates user account properties and LDAP account attributes. This string gives you access to the Base location of your LDAP directory structure. The format of the Bind string must match the Base location of your LDAP directory structure. For example, if you are checking for information found below the European employees directory, you can use the following Bind string:
        cn=ldapadmin,dc=ldap,dc=uptime,dc=com
        Depending on your network security model, you may need domain controller administration privileges to bind to the locations on which you want t
21. [Image: Graph Editor tool bar, with Delete, Title, Clone, and Change controls]

Use the Graph Editor to do the following:
• exclude graph lines
• change the style of the graph
• re-arrange the order of lines on your graph, or the actual data, to highlight specific entities in your data
• copy lines
• change the title of a line or of the graph
• change the style of graph lines, margins, titles, and the X and Y axis information

The Graph Editor contains the following subtabs:
• Series subtab: Enables you to select the data series that the graph will display. If, for example, you have a graph that displays the following data series:
   • total memory
   • percentage of memory used by system processes
   • percentage of memory used by user processes
  you can choose to display any or all of the data series.
• General subtab: Adjusts the graph's margins, and controls the focus and scrolling functions.
• Axis subtab: Manipulates the graph axis, inverts the graph, scales the data points on the axis, and sets the position of the graph.
• Titles subtab: Enables you to add, delete, or modify all labels and titles in the graph. You can, for example, change the generic title LRX 234 to Main Email Server.
• Legend subtab: Enables you to manipulate the legend, which describes the graphed information for a graph. You can add, adjust, and delete legend information.
22. • Global Warning Threshold (Mandatory): Enter the percentage of the file system that must be used for up.time to generate a warning.
• Global Critical Threshold (Mandatory): Enter the percentage of the file system that must be used for up.time to generate a critical alert.

   3. Optionally, to exclude specific mount points on the disk from the capacity calculations, enter the names of the mount points in any or all of the five Exclude Pattern fields. For example, you can enter D (for Windows) or usr (for Solaris, Linux, or AIX) to ignore that drive or directory. For example, to ignore all mount points that start with u, enter u.
   4. Optionally, you can set thresholds for specific mount points by entering the following information in any or all of the five Mount Point fields:
      • The name of the mount point (for example, opt). Case sensitivity is not taken into account when monitor-defined mount points are matched with those on the file system.
      • The Warning threshold, which is the percentage of space used on the mount point that, when exceeded, generates a warning.
      • The Critical threshold, which is the percentage of space used on the mount point that, when exceeded, generates a critical alert.
      The thresholds that you set for each mount point will be calculated separately from the thresholds that you specified in step 2.
   5. Specify values for the Warning and Critical Response
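The interaction between the global thresholds, the Exclude Pattern fields, and the per-mount-point thresholds can be sketched as follows. The pattern syntax that up.time accepts is not recoverable from this excerpt, so shell-style wildcards via Python's `fnmatch` module are used purely as a stand-in; the function name, data shapes, and the strict "greater than" comparison are likewise assumptions.

```python
from fnmatch import fnmatchcase


def evaluate_mounts(used_percent, global_warn, global_crit, exclude=(), overrides=None):
    """Return a status per mount point.

    used_percent: dict of mount point -> percent of space used.
    exclude: wildcard patterns for mount points to skip (assumed syntax).
    overrides: dict of mount point -> (warning, critical) thresholds.
    """
    # Mount-point names are matched case-insensitively, as the text describes.
    per_mount = {name.lower(): limits for name, limits in (overrides or {}).items()}
    statuses = {}
    for mount, used in used_percent.items():
        if any(fnmatchcase(mount, pattern) for pattern in exclude):
            continue  # excluded from the capacity calculations
        warn, crit = per_mount.get(mount.lower(), (global_warn, global_crit))
        if used > crit:
            statuses[mount] = "CRIT"
        elif used > warn:
            statuses[mount] = "WARN"
        else:
            statuses[mount] = "OK"
    return statuses
```

Keeping the per-mount thresholds in a separate lookup mirrors the statement that they are calculated separately from the global thresholds: a mount point with its own limits never falls back to the global values.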
23. • Network I/O
• Disk I/O

Using the information in the report, you can gain insight into the overall workload on an IBM pSeries server. This enables you to accurately adjust the CPU entitlements of the LPARs and keep track of the overall workload over time.

Creating an LPAR Workload Report

To create an LPAR Workload report, do the following:
   1. In the Reports Tree panel, click LPAR Workload.
   2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22.
   3. Select one or more of the following report options:
      • CPU Workload: The CPU entitlements of the LPARs, and their use of the entitlements.
      • Memory Workload: The amount of memory, in kilobytes, that is being used by the LPARs on the system.
      • Disk IO Workload: The amount of data, measured in kilobytes per second, that is being read from and written to the disk by the LPARs on the system.
      • Network IO Workload: The amount of data, measured in kilobytes per second, that is being sent and received over the network interface by the LPARs on the system.
      Optionally, click Select All to generate a report on all of the options that are listed above.
   4. If you selected more than one report option and plan to report on more than one system, you can optionally click
24. • Disk Performance Statistics
• File System Capacity Statistics
• Network Statistics
• User Information Statistics
• Volume Manager Statistics
• Retained Data

   4. Ensure the Enable Archiving checkbox is selected.
   5. Click Set Archive Policy.
   6. Optionally, you can click the Archive Now button to immediately create archives of the data in your DataStore. up.time will check the DataStore entries and archive anything that is older than the limits you have configured.

Restoring Archived Data

If you need to generate graphs or reports on older data that has already been archived and is no longer in the DataStore, you can import specific archives using the restorearchive command line utility. The command's parameters allow you to import archives in the following manner:
• a single archive that represents a specific archive category and date (the collected data for each archive category and 24-hour period is exported to individual XML files)
• all archives for a specific date (i.e., 24-hour period)

Importing Archived Data into the DataStore

To import archived data into the DataStore, do the following:
   1. At the command line, navigate to the following directory:
      • Linux: /usr/local/uptime/scripts
      • Solaris: /opt/uptime/scripts
      • Windows: C:\Program Files\uptime software\uptime\archives
   Run the restorearc
25. 55 changing 156 ping 156 up time agent 156 HTTP Web services monitor 285 icons 10 Clone 10 151 critical 124 Delete 11 343 Edit 10 156 384 395 View 10 IIS monitor 200 IMAP Email Retrieval monitor 289 installation agents 40 Linux 42 pSeries 43 with HMC 43 without HMC 45 Solaris 41 UNIX 42 Windows 40 guidelines 26 Monitoring Station 29 UNIX Linux 32 VMware 35 Windows 30 post installation tasks 37 requirements 27 browsers 28 hardware 28 587 Index Monitoring Station 27 up time agents 28 upgrading 39 Instance Motion graphs 523 interface overview 6 up time tool bar 6 Config 9 Global Scan 7 My Infrastructure 7 My Portal 7 Reports 9 Services 8 Users 8 L LDAP authentication 349 monitor 291 license information 563 Live Splunk Listener monitor 238 LPAR adding 81 entitlement graphs 510 workload graphs 509 LPAR workload graphs 509 M mail servers 534 monitor template 141 alert settings 148 configuring response time 145 configuring settings 142 identification 141 Monitoring Period 150 response time 145 thresholds 144 timing setting options 147 timing settings 146 Monitoring Periods 397 monitors 135 adding alert settings 149 adding information 142 adding timing settings 148 advanced 138 321 external check 328 588 with retained data 326 agent File System Capacity 167 overview 166 Performance Check 170 Process Count Check 174 agentless NIS YP 297 alert settings 148 applica
26. Device dropdown list: Agent; Net-SNMP v2; Net-SNMP v3; Node; Novell NRM; pSeries LPAR Server (VIO); pSeries LPAR Server (HMC); Virtual Node; VMware ESX; WMI Agentless (only present on Monitoring Stations running on Windows). 5. Enter the host name of the system in the Host Name field. The host name can be the actual name of the machine that up.time will be monitoring. You can also enter an IP address in this field. 6. Optionally, enter the port number at which you will be connecting to the system in the Port field. In most cases, you can use the default port. 7. If you selected Agent in step 4 and want to securely access the system, click the Use SSL option. 8. If you selected Net-SNMP v2 in step 4, enter information in the following fields: SNMP Port (the port on which the Net-SNMP instance is listening); Read Community (a string that acts like a user ID or password, giving you access to the Net-SNMP instance). Common read communities are public (enables you to retrieve read-only information from the device) and private (enables you to access all information on the device). 9. If you selected Net-SNMP v3 in step 4, enter information in the following fields: SNMP Port (the port on which the Net-SNMP instance is listening); Username (the name that is required to connect to the Net-SNMP instance); Authentication Password
27. Groups; Applications; Service Level Agreements; Views. For more information about using the My Infrastructure panel, see Defining and Managing Your Infrastructure on page 65. Services: The Services panel enables you to manage and configure services, which are provided by an application to perform a specific task. up.time monitors both services and applications to ensure that performance and availability are maintained. In the Services panel, you can manage and configure the following: service instances and service groups; Alert Profiles and Action Profiles; host checks; topological dependencies; scheduled maintenance. For more information about using the Services panel, see Using Service Monitors on page 135. Users: The Users panel enables you to manage all users, user groups, Notification Groups, and their associated permissions. You can view, create, edit, and delete the following: users; user groups; Notification Groups; user roles. For more information about using the Users panel, see Configuring Users on page 333. Reports: The Reports panel enables you to manage and create detailed custom reports on the performance and availability of the resources in your enterprise. Using the Reports panel, you can: generate a report and schedule when you want it to be generated; selec
28. If you use SNMP traps, the trap message will be sent in the format specified by the up.time MIB. This MIB is found in the scripts directory. The uptime software enterprise OID is 1.3.6.1.4.1.24216. Creating Action Profiles: To create Action Profiles, do the following: 1. On the up.time tool bar, click Services. 2. In the Tree panel, click Add Action Profile. The Add Action Profile window appears. 3. Enter a name for this profile in the Name of Action Profile field. 4. Specify the number of times an error must occur before up.time sends a notification in the Start action on notification number field. 5. Specify the number of times the action will be carried out in the End action on notification number field. Optionally, select the Never Stop Notifying option to continually carry out the action in this profile until the problem is resolved. 6. If VMware vCenter Orchestrator integration has been enabled and you would like the Action Profile to drive an Orchestrator workflow, do the following: i. In the Select Workflow field, input a workflow to configure. You can either scroll through and select the workflow from the drop-down list, or begin typing the workflow's name. ii. Click Get Parameters. up.time will retrieve information from the Orchestrator server and dynamically display configuration fields for the chosen workflow's input parameters. iii. Configure the input parameter fields for the workflow. For information on the specific c
29. Days of the week. Required: three-letter abbreviation (correct: Sun, Mon, Tue, Wed, Thu, Fri, Sat). Not accepted: full spellings, other abbreviation styles (incorrect: Sr, MT, We, Th, Friday, Saturday). Dates. Required: single or two-digit number (correct: 8, 09, 10). Not accepted: ordinal suffixes, full spellings. Months. Required: three-letter abbreviation (correct: Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep, Oct, Nov, Dec). Not accepted: other abbreviation styles (incorrect: J, F, M, June, July, August, Se, Oc, No, De). Years. Required: full year (correct: 2008). Not accepted: any abbreviation of the year (incorrect: 08, 08, Y2K 8). Lists and Ranges: Days can be inputted as a list: each day is separated by a comma (e.g., mon,tue,wed); spaces are optional (e.g., mon, tue, wed). Times and days can be inputted as ranges: elements in the range must be separated by hyphens; spaces are optional; the following examples are correct: 8AM-8PM; 8:00 AM - 8:00 PM; Fri-Mon; Fri - Mon. Ranges wrap around day and week boundaries: 10PM-2AM is interpreted as 10:00 p.m. to 11:59 p.m. on one day and 12:00 a.m. to 2:00 a.m. the following calendar day; Fri-Mon is interpreted as Friday through Saturday on one week, then Sunday through Monday the following week. up.time converts day ranges to lists, e.g., Fri-Mon becomes Fr
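The wrap-around rule for day ranges can be illustrated with a minimal Python sketch. This is not up.time's own parser; the function name expand_day_range and the Sunday-first ordering are assumptions made for the illustration.

```python
# Hypothetical helper: expands a day range such as "Fri-Mon" into a list of
# days, wrapping past Saturday into the following week.
DAYS = ["sun", "mon", "tue", "wed", "thu", "fri", "sat"]


def expand_day_range(text):
    """Return the days covered by a hyphen-separated range like 'Fri - Mon'."""
    start, end = (part.strip().lower() for part in text.split("-"))
    if start not in DAYS or end not in DAYS:
        raise ValueError("days must be three-letter abbreviations: " + text)
    index = DAYS.index(start)
    covered = [DAYS[index]]
    # Step forward one day at a time, wrapping from Saturday back to Sunday.
    while DAYS[index] != end:
        index = (index + 1) % len(DAYS)
        covered.append(DAYS[index])
    return [day.capitalize() for day in covered]


print(expand_day_range("Fri - Mon"))
```

Spaces around the hyphen are stripped before matching, which mirrors the rule that spaces are optional.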
30. Script File: Click the Script File check box and then enter the full path on the Monitoring Station to the script that this monitor will run against the database. Script: Select this option and then type or copy the script that you want up.time to run against the database into this text box. Use this option if you do not have access to the file system on the Monitoring Station, or if your script is short or will not regularly change. Match: Enter a string that you want to match against the return value from the script. Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. 4. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information); Alert Settings (see Monitor Alert Settings on page 148 for more information); Monitoring Period settings (see Monitor Timing Settings on page 146 for more information); Alert Profile settings (see Alert Profiles on page 381 for more information); Action Profile settings (see Action Profiles on page 389 for more information). 5. Click Fin
31. These states and the conditions under which they happen are shown in the Global Scan status display. Figure: Service Level Agreement Status, showing three SLAs. Email SLA: 63% of compliance period; 100% of allowable downtime used; 89.47% of target 99.0%; CRIT: The allowable downtime has been exceeded by 15 minutes. Second SLA (name not legible): 62% of compliance period; 66% of allowable downtime used; 98.97% of target 99.0%; WARN: At the current rate, this SLA will breach after 10 more minutes of downtime. Customer Service SLA: 63% of compliance period; 38% of allowable downtime used; 99.39% of target 99.0%; OK: The SLA is performing within its target. Handling Simultaneous Service Downtime: The simultaneous downtime of multiple services does not cumulatively impact an SLA's remaining allowable downtime; the term allowable downtime can be expanded to mean the amount of time during which there can be any service downtimes until the compliance period has ended, after which the counters are reset. In the following outage graph for an SLO, note that any time an outage is experienced, whether by one or four services, the SLO is deemed to have experienced an outage, which is reflected in the top red line. Figure: SLO Overall Outages. Rows: PING 10.1.1.140 Outages; Fogz Response Time Outages; MYSQL Port Check Outages; File System Capacity Outages; CPU Performance Check Outages; PING ginger upti
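The non-cumulative rule for simultaneous downtime amounts to counting the union of the outage intervals rather than their sum. The following Python sketch shows that idea under stated assumptions: the function name overall_outage_minutes and the (start, end) minute tuples are hypothetical, and this is not how up.time itself stores outages.

```python
# Hypothetical illustration: overlapping outages from several services are
# merged, so a minute with four services down counts once, not four times.
def overall_outage_minutes(outages):
    """Sum the union of (start_minute, end_minute) outage intervals."""
    total = 0
    current_start = None
    current_end = None
    for start, end in sorted(outages):
        if current_end is None or start > current_end:
            # A gap: close the previous merged interval and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlap: extend the merged interval if this outage runs longer.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Two services overlap between minutes 10 and 30; a third outage is separate.
print(overall_outage_minutes([(0, 30), (10, 40), (100, 110)]))
```

Summing the three outages individually would give 70 minutes, while the merged total charged against the allowable downtime is 50.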
32. drive. 6. Click Generate Graph. VXVM Stats Graph: The VXVM Stats graph charts the amount of data written to or read from a Solaris volume that is managed by the Veritas Volume Manager. Veritas Volume Manager is a storage management system that operates between a host's operating system and its filesystems or database management systems. Veritas Volume Manager enables you to manage disk drives on a system as if they were volumes: logical devices that appear to be physical partitions on a disk. Depending on the options that you specify, this graph contains the following information: the number of read and write operations to and from the volume; the number of blocks that were read and written to and from the volume; the amount of time that is required to read data from and write data to the volume. If Veritas Volume Manager is not running on a host, or if up.time cannot connect to the volume, an error message informing you that up.time cannot detect the Veritas Volume Manager appears in the Graphing subpanel. In the Info & Rescan panel, verify that the entry Has a Logical Volume Manager is set to Yes. If it is, then ensure that you can connect to the host from the Monitoring Station. See Viewing System and Service Information on page 50 for more information. Generating a VXVM Stats Graph: To generate a VXVM Stats graph, do the following: 1. In the Global Scan or
33. Email Delivery Monitor: Although specific up.time monitors are available for your POP, IMAP, and SMTP servers, their monitoring duties focus on availability and response time. To test your IT infrastructure's ability to send or receive emails within a reasonable amount of time, use the Email Delivery monitor. Typically, email delivery tests include a server that is part of your IT infrastructure and monitored by up.time. In these cases, you will test either incoming mail delivery times, by supplying information about a monitored POP3 or IMAP server, or outgoing mail delivery times, by supplying information about a monitored SMTP server. The Email Delivery monitor executes several steps in order to calculate mail delivery and retrieval time: the monitor requests an internal or external SMTP server to send a generated test mail; when the monitor asks the SMTP server to send the mail, the monitor records the delivery time; the monitor waits for five seconds, then logs in to and checks an internal or external POP3 or IMAP mail server to verify the mail was received; if the test mail is not found, the monitor waits another five seconds and checks again, and continues to check until the process has either timed out or the mail is found; the monitor confirms the mail was received and reports both the delivery and retrieval times. Configuring Em
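The wait-and-recheck portion of those steps can be sketched as a simple polling loop. This is only an illustration of the described behavior, not the monitor's actual code: the function name wait_for_mail, the mail_received callback, and the injectable sleep and clock parameters are all assumptions, and no real POP3 or IMAP connection is made.

```python
import time


# Hypothetical sketch of the retrieval check: wait, look for the test mail,
# and repeat until it is found or the timeout is reached.
def wait_for_mail(mail_received, timeout_seconds, poll_interval=5.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Return seconds elapsed until mail_received() is true, or None on timeout."""
    start = clock()
    while True:
        sleep(poll_interval)          # the monitor waits before each check
        if mail_received():           # stand-in for a POP3 or IMAP mailbox check
            return clock() - start
        if clock() - start >= timeout_seconds:
            return None               # the process has timed out
```

Passing sleep and clock as parameters keeps the loop testable without real delays; in practice mail_received would wrap a mailbox lookup for the generated test message.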
34. Interfacing with up.time. VMware vCenter Orchestrator Integration: Administrators can configure Action Profiles to automatically carry out tasks in the event of an up.time alert. One such task is the initiation of contact with VMware vCenter Orchestrator and the execution of a workflow. To have access to this functionality, up.time needs to know how to communicate with Orchestrator. For information about Action Profiles and VMware vCenter Orchestrator, see Action Profiles on page 389. Integrating up.time with VMware vCenter Orchestrator: To configure up.time integration with Orchestrator to execute workflows, do the following: 1. On the up.time tool bar, click Config. 2. In the Tree panel, click VMware vCenter Orchestrator. 3. In the sub panel, click Edit Configuration. 4. Ensure the VMware Orchestrator Enabled check box is selected. 5. In the VMware Orchestrator Server field, enter the host name of, or IP address assigned to, the Orchestrator server when it was configured. 6. In the VMware Orchestrator Port field, enter the port the Orchestrator server was configured to use in order to communicate with other systems. 7. Optionally, select the Use SSL check box if Orchestrator was configured to use an SSL certificate. 8. Enter the Username and Password of an appropriate user account on the Orchestrator server. For proper integration, an Orchestrator account with View and Execute permissions is required. 9
35. operation. A complete transaction must occur before the next transaction can start. Longer disk operations per transaction increase the average length of the queue. Read/Writes: The number of read/write requests per second from or to a disk. Throughput (blks/s): The amount of disk traffic, in blocks of 512 bytes, that is flowing to and from a disk each second. Average Wait Time: The average time, in milliseconds, that a transaction is waiting in a queue. The wait time is directly proportional to the length of the queue. Average Serve Time: The average time, in milliseconds, required to perform a task. All of the above for one disk: up.time graphs all of the metrics listed above for a single disk. 6. Select the disks for which you want to collect information from the list. If you select multiple disks and selected All of the above for one disk in step 5, then up.time only graphs information for the first disk that you selected. 7. Click Generate Graph. Top 10 Disks Graph: The Top 10 Disks graph displays the ten busiest disks in your environment as of the last sample that up.time has taken. If there are fewer than ten disks on the system, then all of the disks on a system will be charted in the graph. Generating a Top 10 Disks Graph: To generate a Top 10 Disks graph, do the following: 1. In
36. select them from the List of SLAs. Select a report generation option. See Report Generation Options on page 402 for details. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. Reports for Availability: The following reports enable you to visualize the availability metrics for all your mission-critical Applications and your critical system services: Application Availability Report; Incident Priority Report; Service Monitor Availability Report; Service Monitor Outages Report. Application Availability Report: The Application Availability report tracks the availability of the Applications in your environment, as well as the monitors that are associated with the Applications. This report contains the following information: the name of the Application; the service monitors that are associated with the Application; the percentage of time that the Application and monitors are in OK, Unknown, Warning, and Critical states. For more information on Applications, see Working with Applications on page 101. Creating an Application Availability Report: To create an Application Availability report, do the following: 1. In the Reports Tree panel, click Application Availa
37. windomain; WMI Username: administrator; WMI Password: password. Editing a System Profile: After you have added a system to up.time, you might need to change some of the basic information about that system. You can do this by editing the system profile. To edit a system profile, do the following: 1. In the My Infrastructure panel, right-click the name of the Element whose profile you want to edit, then click Edit. The Edit System window appears. 2. In the Edit System window, change any or all of the following options: Display name in up.time: The descriptive name for the system that appears in the up.time Web interface. Description: A brief functional description of the system. Parent Group: Select the group of systems in up.time with which this system will be associated. Custom Field 1 to Custom Field 4: These fields enable you to include additional information about the system. For example, you can record the types of reports that should be run on this system or when maintenance is scheduled. The information in the Custom Fields is displayed when you view system information by clicking the Info & ReScan link in the Tree panel. Number of processes to retrieve: The default number of processes running on the system that up.time will retrieve. If you select 10 proce
38. 17 09:02:54: 196. From 2008-04-17 01:46:40 to 2008-04-17 03:03:24: 76. From 2008-04-17 00:00:00 to 2008-04-17 00:23:38: 23. Working with Service Level Agreements. SLA Creation Strategies: However, there may be cases where analyzing the SLA Detailed report will show intermittent outages that have not caused your trial SLA to fail, but represent underperforming services that should be optimized. Figure: SLO Overall Outages. Rows: Plants Response Outages; WebSphere Outages; PING lab websphere51 Outages; File System Capacity Outages. Time axis: Mar 05 to Mar 31. Developing Baselines: After outages and underperforming systems have been addressed, use the SLA Summary report to compare test service levels to historical data. Find a service level that is attainable. For example, in the SLA graph below, a 95% service level would be more realistic than the default 99% level, given the historical data. In the bottom SLA graph, although the 90% service level is compliant based on historical data, the perfo
39. Overseeing Your Infrastructure. Changing Reporting Thresholds: The thresholds that determine when an Element's reported status changes between normal, Warning, and Critical (i.e., green, yellow, and red) can be modified for both Global Scan and the Resource Scan. Global Scan and the Resource Scan thresholds are configured by separate sets of attributes that can be changed in the up.time Configuration panel. By changing these attributes, you can set how large the color ranges are on resource gauges, and at what point table cells change color. See Status Thresholds on page 554 for more information. Note that when you change Global Scan threshold values, the changes are not retroactively applied to all existing Elements monitored by up.time; changes only apply to Elements added to up.time after the threshold changes are made. Conversely, the Resource Scan gauge ranges are updated immediately. CHAPTER 8: Using Service Monitors. This chapter introduces the common features and concepts of up.time service monitors in the following sections: Overview (page 136); Using Service Monitors
40. In this example, the system is consistently over the run queue threshold that was specified when the report was defined. Based on this information, you can generate a CPU performance graph (see page 491 for more information) to get a better idea of why the system is exceeding the CPU run queue threshold. File System Service Time Summary Report: The File System Service Time Summary report indicates which system disks and file systems are using an excessive amount of time to complete disk operations. This report helps you identify which systems may benefit from configuration changes (e.g., adding RAM, moving a file system to another hard disk, implementing a RAID). The report contains the following information: the name of the systems for which the report has been generated; the names of the disks and file systems on the system; the high, low, and average service times for each disk or file system, measured in milliseconds; the nth percentile for each disk or file system (e.g., although a file system may have had a high service time of 100ms, its 95th percentile of 40ms means 95% of the service times were 40ms or lower). On a system with heavy disk usage, disks and file systems will be in the higher end of the percentile. You can also sort the results in the report by one of six criteria that you can specify when d
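The percentile example (a high of 100ms alongside a 95th percentile of 40ms) can be reproduced with a short Python sketch. It uses the nearest-rank method, which is an assumption: the guide does not say which percentile method up.time applies, and the sample service times below are invented for the illustration.

```python
import math


# Hypothetical illustration using the nearest-rank percentile method.
def percentile(values, pct):
    """Return the smallest value such that pct percent of values are <= it."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)   # 1-based nearest rank
    return ordered[rank - 1]


# Invented service times in milliseconds: one slow outlier among 20 samples.
service_times_ms = [20] * 10 + [40] * 9 + [100]
print(max(service_times_ms), percentile(service_times_ms, 95))
```

A single slow operation sets the high value, while the 95th percentile reflects what the disk or file system delivers most of the time.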
41. 24 hours; Memory Usage (last 24 hours); Disk Busy (last 24 hours); Disk Usage (last 24 hours). Elements Chart: The Resource Scan chart displays the following information for all of the Elements in your environment: CPU Usage: The percentage of CPU resources that are being used. Memory Usage: The amount of memory, expressed as a percentage of total available memory, that is being consumed by a process. Disk Capacity: The percentage of storage space on the system disk that is being used. Network In: The average amount of traffic coming in over the network interface. Network Out: The average amount of traffic going out over the network interface. The following image illustrates the Resource Scan chart. Figure: Elements table with columns Name, CPU Usage, Memory Usage, Disk Busy, Disk Capacity, Network In, Network Out. You can view the Resource Scan gauges for a particular server by clicking the name of the server in the chart. If you have grouped your servers, the names of individual servers do not appear in the Resource Scan chart. Instead, the names of the groups are displayed. To view a list of Elements in a group, click the name of the group
42. 9. To generate reports for groups of systems, select the groups from the List of Groups area. 10. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. 11. If you are generating reports for specific Applications in your environment, select them from the List of Entities. 12. Select a report generation option. See Report Generation Options on page 402 for details. 13. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. Using the CPU Run Queue Threshold Report: The following is an example of a CPU Run Queue Threshold report. CPU Run Queue Threshold. Date Range: 2008-04-17 00:00:00 to 2008-04-17 12:02:02, between the time range 00:00 to 23:59. Max CPU Percentage: 90.0. Threshold: 2.0 x number of CPUs. CPU Options: Usr, Sys, Wio. CPU Run Queue Depth table (columns: Hostname; CPUs; Threshold; High; Low; Average; Sustained Threshold): AIX DEV LPAR (10.1.1.57); 1; 2.00; 3; 1; 2.17; 5. AIXS (aix5l); 1; 2.00; 4; 0; 2.20; 15. Figure: Minutes over Threshold of 2.0 for AIX DEV LPAR (10.1.1.57), charted by date from 2007-05-04.
43. 5.66; 7.05; 6.51; 3.14; 8.66; 3.44. 95th Percentile: 0.00; 181.80; 113.00; 173.35; 50.00; 44.20; 10.00; 9.00; 12.00; 12.00. Reports for Service Level Agreements: The following reports enable you to assess your organization's ability to meet, and diagnose failures in meeting, service level agreements by summarizing compliance and reporting on compliance and non-compliance of an SLA's component objectives and services: SLA Summary Report; SLA Detailed Report. SLA Summary Report: The SLA Summary report shows whether an SLA's performance target is being met; whether performance, even though currently compliant with the defined target, may eventually fall short in the future; and how component SLOs contributed to performance. The report contains charts and a table that provide the following information: your defined service level target and how closely the SLA was met over daily, weekly, or monthly intervals; a trend line that indicates whether compliance is at risk of not being met on a future date; an optional breakdown of how component SLOs contributed to the SLA not achieving 100% compliance. The report answers the following questions: Are we meeting our service targets? If we aren't, which areas of our infrastructure are failing? Are things getting better or worse? For more information on SLA definitions, see Working with Service Level Agreements
44. 5. Click Finish. IIS: The IIS (Internet Information Server) service monitor checks the performance of an IIS Web server based on thresholds that you set against common IIS performance counters. You can use this monitor to determine whether or not IIS is running on a defined port, and according to the thresholds you have set on common performance counters. Configuring IIS Monitors: To configure IIS monitors, do the following: 1. In the IIS monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following settings by clicking the checkbox beside each option and then specifying a warning and critical threshold. If the thresholds that you set are exceeded, then up.time generates an alert. For more information, see Configuring Warning and Critical Thresholds on page 144. Bytes Sent/Sec: The number of bytes that are sent by the server each second. Bytes Received/Sec: The number of bytes that are received by the server each second. Anonymous Users/Sec: The rate, in seconds, at which users have made anonymous requests to the IIS server. Non-anonymous Users/Sec: The rate, in seconds, at which registered users have made non-anonymous
45. ADDSYSTEM cfgcheck true port 9998 number 1 use ssl false systemType 1 hostname 10.1.1.241 displayName MailMain systemSystemGroup 1 serviceGroup description systemSubtype 1. the Audit Log: By default, the audit log is disabled. To enable it, edit the uptime.conf file, which is located at the root of the up.time installation directory: /opt/uptime on Solaris; /usr/local/uptime on Red Hat and SLES; C:\Program Files\uptime software\uptime on Windows. In the uptime.conf file, locate the auditEnabled entry and modify it to be auditEnabled=yes. If the entry does not exist, add the entry to the file. CHAPTER 5: Using My Portal. This chapter explains the My Portal panel. Overview: When you log into up.time, the first screen you see is the My Portal panel. The My Portal panel gives quick access to basic up.time functions and to saved reports. The My Portal panel is divided into several sections: Assistance; My Preferences; Latest up.time Articles; up.time Information; My Alerts; Saved Reports; Custom Dashboards. Assistance: The top portion of the My Portal panel gives you quick access to: tutorials that demonstrate how to perform basic tasks in up.time; up.time's online help; the uptime software community support for
46. Achieving statistic. SLA Compliance Calculation: SLA downtime occurs when any of the SLA's services are in a critical state. An SLA is compliant if its downtime has not exceeded a maximum number of minutes over a one-week or one-month Monitoring Period. For example, consider an SLA whose compliance period type is weekly and whose Monitoring Period is Monday through Friday, 9 a.m. to 5 p.m. The Monitoring Period consists of five eight-hour days; in other words, 40 hours or 2400 minutes. If the SLA's target is 95%, it has 120 minutes of allowable downtime for any of its services. Reporting SLA Status: An SLA's reported status in the Global Scan panel includes the following, in the form of progress bars: the percentage of the Monitoring Period that has expired, and the percentage of allowable downtime consumed during the Monitoring Period. See Viewing All SLAs on page 119 for information about SLA information in the Global Scan panel. An SLA will reach a critical state when its allowable downtime has been depleted. An SLA will reach a warning-level state when its allowable downtime, at the current rate of use, will be depleted before the compliance period has ended
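The worked example (five eight-hour days, a 95% target, 120 minutes of allowable downtime) and the critical and warning rules can be sketched in Python. This is a hedged illustration, not up.time's implementation: the function names are hypothetical, and the warning rule assumes a simple linear projection of the downtime used so far across the rest of the compliance period.

```python
# Hypothetical sketch of the SLA arithmetic described above.
def allowable_downtime_minutes(monitored_minutes, target_percent):
    """Minutes of downtime permitted while still meeting the target."""
    return monitored_minutes * (100 - target_percent) / 100


def sla_status(period_elapsed_fraction, downtime_used_minutes, allowable_minutes):
    """Classify an SLA as CRIT, WARN, or OK.

    Assumption: 'at the current rate of use' is modeled as a linear
    projection of downtime used so far over the whole compliance period.
    """
    if downtime_used_minutes >= allowable_minutes:
        return "CRIT"                      # allowable downtime is depleted
    if period_elapsed_fraction > 0:
        projected = downtime_used_minutes / period_elapsed_fraction
        if projected >= allowable_minutes:
            return "WARN"                  # would be depleted before period end
    return "OK"


weekly_minutes = 5 * 8 * 60                # Monday to Friday, eight hours a day
budget = allowable_downtime_minutes(weekly_minutes, 95)
print(budget, sla_status(0.5, 70, budget))
```

With half of the period elapsed and 70 of the 120 minutes already used, the projection reaches 140 minutes, so the sketch reports a warning even though the SLA is still compliant.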
47. Analysis. 7. Select a report generation option. See Report Generation Options on page 402 for details. 8. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. CPU Utilization Summary Report: The CPU Utilization Summary report generates a tabular summary of the CPU and memory consumption over a specific time period. Specifically, this report returns the following information: the number of CPUs on the server; the total processor speed of all the CPUs, in MHz; the maximum, minimum, and average CPU use, expressed as a percentage; the maximum, minimum, and average memory use, expressed as a percentage; the maximum, minimum, and average page scan per second, expressed as a percentage. Creating a CPU Utilization Summary Report: To create a CPU Utilization Summary report, do the following: 1. In the Reports Tree panel, click CPU Utilization Summary. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. 3. Select one of the following options from the Sort by dropdown list to sort the results that up.time returns: Average CPU (the default); Hostname
48. Check Monitors. To configure Performance Check monitors, do the following: 1. Complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. In the CPU Check area, do the following: Select one of the following options from the CPU Value dropdown list: User: Time that the CPU spends processing application threads, or threads that support tasks which are specific to applications. System: Time that the kernel spends processing system calls. If all the CPU time is spent in system time, there could be a problem with the system kernel, or the system is spending too much time processing I/O interrupts. Waiting on I/O: Time that a runnable process requires to perform an I/O operation. Total: The total of all CPU time that is being used. Enter values, expressed as percentages, in the CPU Warning Threshold and CPU Critical Threshold fields. Enter the time period, in minutes, over which up.time should check CPU processes in the CPU Time Interval field. 3. In the Swap Check area, enter values, expressed as percentages, in the Used Swap Warning Threshold and Used Swap Critical Threshold fields. When the percentage of available swap space exceeds these thresholds, up.time issues an alert. 4. In the Process Check area, complete the following fields: Process Name: The name of
49. Discovery feature. See Using Auto Discovery to Add pSeries Servers Managed by an HMC for more information.
You can add multiple systems to up.time in a batch operation using a text file and a command line utility. See Adding Multiple Systems on page 92 for more information.
• Agentless WMI: A Windows-based system whose metrics collection is managed by WMI (Windows Management Instrumentation), and which does not have an up.time agent installed on it. WMI-based monitoring only works if the Monitoring Station is running on Windows.
Adding Systems or Network Devices
To add systems or network devices, do the following:
1. In the My Infrastructure panel, click Add System/Network Device. The Add System/Network Device window appears.
2. Enter a descriptive name for the server in the Display name in up.time field. This name will appear in the up.time interface. A system can have a different display name than the host name. For example, you can assign the display name Toronto Mail Server to a system with the host name 10.1.1.6. This way, IP addresses are stored in up.time, but a more descriptive or meaningful name is displayed in the up.time Web interface.
3. Optionally, enter a description of the system in the Description field.
4. Select one of the following options from the Type of System
50. [Figure: SLA details, including Compliance Period Type (Weekly, 4200 minutes) and Parent Group (My Enterprise)]
For information about creating and using SLAs, see Adding and Editing SLA Definitions on page 371.
Working with Groups
At sites with multiple systems to monitor, searching through a large list of systems is time consuming. To avoid this problem, you can define groups of systems. Groups are sets of systems that have been combined in a meaningful way. You can group systems by their geographical location or by their function. The name of the group should describe the servers or the way in which they have been grouped. For example, you can create a group called Database Servers that contains all of the database servers in your environment.
You can assign the following to groups:
• Elements, which can be systems, nodes, SLAs, or Applications
• the user groups that are allowed to view the systems or Elements in a group (see Working with User Groups on page 341 for more information on user groups)
If you plan to group your systems, you should first map out what groups you need and which systems will be part of those groups.
Adding Groups
To add a group, do the following:
1. On the My Infrastructure panel, click Add Group.
2. Enter a descriptive name for
51. [Figure: VMware Workload report chart]
The server appears to have an ample amount of memory available. The report indicates that you can add more instances to the VMware server.
Creating a VMware Workload Report
To create a VMware Workload report, do the following:
1. In the Reports Tree panel, click VMware Workload.
2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22.
3. In the Report Options section, select one of the following:
• Workload Profile CPU: The percentage of CPU time that is being used by a VMware instance. This is a percentage of the available maximum amount of CPU time. This ensures that all of the CPU usage figures add up to the overall CPU usage of the server.
• Workload Profile Memory: The amount of physical memory, in kilobytes, that is being used by a VMware instance.
• Workload Profile Disk IO: The amount of the disk I/O capacity, in kilobytes per second, that is being used by a VMware instance.
• Workload Profile Network IO: The amount of the network I/O capacity, in kilobits per second, that is being used by a VMware instance.
• Workload Profile Ready: The amount of time that one or more instances runn
52. • Virtual Node: In a clustered environment, a device with which up.time can communicate using a floating IP address. In the Global Scan and My Infrastructure panels, virtual nodes are denoted by this icon: [icon]
• VMware ESX: A system that is running version 3 or 4 of the VMware ESX server software, which enables a single host to run multiple virtual servers and their applications. ESX includes features like the ability to balance the computing loads of a group of virtual servers, as well as back up data and better manage clusters. You do not need to install an agent on an ESX server.
• pSeries LPAR Server (VIO): A pSeries server that is hosting multiple logical partitions (LPARs). The VIO (virtual input/output) handles the physical I/O requests from the LPARs that are on the server. In this configuration, up.time directly polls the agents installed on the VIO and the LPARs on a pSeries server for workload and other data, as illustrated below.
[Figure: up.time Monitoring Station polling the agents on the VIO and LPARs]
You will need to install an agent on each LPAR that you want to monitor. See Installing Agents on IBM pSeries Servers on page 43 for more information. You can also add pSeries servers that are managed by a Hardware Management Console (HMC) to up.time, either manually or using the Auto
53. Management Console for more information.
To use Auto Discovery to add pSeries servers that are managed by an HMC, do the following:
1. In the My Infrastructure panel, click Auto Discovery. The Auto Discovery window appears.
2. Click the pSeries HMC Discovery option.
3. Complete the following fields:
• HMC Host Name: The name of the system on which the HMC is running.
• Username: The user name required to log into the HMC.
• Password: The password required to log into the HMC.
4. Click Continue. up.time returns a list of the pSeries servers that are being managed by the HMC.
5. Click the Add button beside the server that you want to add. The Add System/Network Device window appears.
6. If necessary, edit the details of the system as described in the section Adding Systems or Network Devices on page 69. Otherwise, click Save in the Add System/Network Device window.
7. Repeat steps 5 and 6 for any other systems that you want to add.
Adding VMware Instances to up.time
VMware ESX server software enables a single host to run multiple virtual servers and their applications. up.time can monitor both the server that is running VMware ESX and VMware instances, which are the virtual servers that are running on the VMware server.
To add VMware instances to up.time, do the following:
1. In the My Infrastructure panel, click the name of the VMware server that contains ins
54. Monitors 281
FTP 283
Configuring FTP Monitors 283
HTTP Web Services 285
Configuring HTTP Web Services Monitors 285
IMAP Email Retrieval 289
Configuring IMAP Email Retrieval Monitors 289
LDAP 291
Before You Begin 291
Configuring LDAP Monitors 292
NFS 295
Configuring NFS Monitors 295
NIS/YP 297
Configuring NIS/YP Monitors 297
NNTP Network News 299
Command Implementation 299
Response Category 300
Response Codes 300
Configuring NNTP Network News Monitors 301
Ping 303
Configuring Ping Monitors 303
POP Email Retri
55. My Infrastructure panel, click the name of the system whose information you want to graph.
2. In the Tree panel, click the Graphing tab.
3. Click VXVM Stats.
4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22.
5. In the Available Disk Groups and Volumes area, select one or more volumes on which to report. The disk groups or volumes that appear in this area will vary from system to system. You must select at least one disk group or volume.
6. Select one of the following options:
• I/O Operations: The number of times per second that data is written to and read from the volume.
• Block Throughput: The amount of disk traffic, in blocks of 512 bytes, that is flowing to and from the volume.
• Average Service Times: The average amount of time, in milliseconds, that is required for a request to be carried out.
7. If necessary, uncheck either of the Read or Write checkboxes. Depending on the option you chose in step 6, the Read and Write options chart the following information in the graph:
• If you selected I/O Operations in step 6, the number of read and write operations to and from the volume.
• If you selected Block Throughput in step 6, the number of blocks that were read and written to and from the volume.
• If you selected Average Service Times in step 6, the amoun
56. 3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
4. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
• Alert Profile settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
5. Click Finish.
MySQL Basic Checks
The MySQL Basic Checks monitor does the following:
• determines whether or not a host that is running a MySQL database is available
• determines whether or not you can log into a MySQL database
• evaluates a response based on a script that is executed against a database or database instance
Configuring MySQL Basic Checks Monitors
To configure MySQL Basic Checks monitors, do the following:
1. In the MySQL Basic Checks monitor template, complete the monitor information fields. To learn about monitor information fields, see Monitor Identification on page 141.
2. Complete the following fields. If you enter a value in the SID field, up.time can capture th
57. 4. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
• Alert Profile settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
5. Click Finish.
Ping
The Ping monitor determines whether or not you can communicate with other IP addresses or domain names. The Ping monitor can check the following:
• whether or not you can reach a specified system
• the amount of time required to bounce a packet off of another site
You will receive a response if the connections are good and the target system is running. If you have successfully pinged a system in the past but you cannot get a response, there is a problem either with the network or with the system. If it takes a long time for a ping to return, the network or system may be extremely busy.
The ping program sends a small packet of information containing 64 bytes: 56 bytes of data and eight bytes of protocol header information. The computer that sent the packet listens for a reply from the specified IP address. The ping program then eval
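The 64-byte packet layout described above (eight bytes of protocol header plus 56 bytes of data) can be sketched as follows. This is an illustration of a standard ICMP echo request, not the code that the Ping monitor uses; the function names, the identifier and sequence values, and the payload contents are assumptions. The sketch only builds the packet and does not send it, because raw sockets usually require administrator privileges.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """Compute the 16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload_size: int = 56) -> bytes:
    """Build an ICMP echo request: 8-byte header followed by the payload."""
    payload = bytes(i & 0xFF for i in range(payload_size))  # arbitrary filler data
    # Header fields: type 8 (echo request), code 0, checksum, identifier, sequence
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence)
    return header + payload


packet = build_echo_request(identifier=0x1234, sequence=1)
print(len(packet))  # 8 header bytes + 56 data bytes
```

A receiver can validate the packet by running the same checksum over all 64 bytes; a correct packet yields zero.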
58. Oct 28 2008 12:00AM to Oct 30 2008 12:00AM
Note: The time range in a fixed date range merely acts as a more precise start point and end point; a fixed date range is a contiguous block of time that has no gaps.
Weekly Recurrence
Syntax: every <day> <day range list> <time range>
Basic example: Sun; Sun - Tue; Every Sun, Mon, Tue
Spaces are optional: Sun-Tue; EverySun,Mon,Tue
Time ranges are optional: Sun 9 AM - 5 PM; Sun-Tue 9AM-5PM; EverySun,Mon,Tue 9AM-5PM
Note: Recurring days that do not include a time range are interpreted to include the entire day (i.e., 12:00 a.m. through 11:59 p.m.), although this will not automatically appear in the defined time period.
Yearly Recurrence
Syntax: every <month> <date> <time range>
Basic example: Every Oct 28
Ordinal suffixes are optional: Every Oct 28th
Time ranges are optional: Every Oct 28 7PM-11PM
Note: You cannot define a date range within a yearly recurrence; instead, combine separate yearly recurrences for each date in the date range.
Monthly Recurrence
Syntax: every month on the <date> <time range>
Basic example; Ordinal suffixes are optio; Time ranges are optional
Monthly Ordinal Re
Every month on t
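To make the weekly day-range idea concrete, the following sketch expands a range such as Sun-Tue into its individual days. It is not the up.time parser: the hyphen separator, the three-letter day names, and the wrap-around behaviour (for example, Fri-Mon) are assumptions, and `expand_day_range` is a hypothetical name.

```python
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


def expand_day_range(text: str) -> list:
    """Expand 'Sun-Tue' into ['Sun', 'Mon', 'Tue'].

    Illustration only. Spaces around the hyphen are ignored, and a range
    whose end precedes its start is assumed to wrap past Saturday.
    """
    parts = [part.strip().title() for part in text.split("-")]
    if len(parts) == 1:
        start = end = parts[0]
    elif len(parts) == 2:
        start, end = parts
    else:
        raise ValueError("expected a single day or a start-end range")
    first = DAYS.index(start)  # raises ValueError for an unknown day
    last = DAYS.index(end)
    count = (last - first) % 7 + 1
    return [DAYS[(first + offset) % 7] for offset in range(count)]


print(expand_day_range("Sun-Tue"))
print(expand_day_range("Sun - Tue"))
```

Both calls produce the same list, which mirrors the statement that spaces are optional in a weekly recurrence.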
59. Report 425
Reports for Capacity Planning 428
Enterprise CPU Utilization Report 428
File System Capacity Growth Report 431
Server Virtualization Report 432
Solaris Mutex Exception Report 436
Network Bandwidth Report 438
Disk I/O Bandwidth Report 441
CPU Run Queue Threshold Report 445
File System Service Time Summary Report 449
Reports for Service Level Agreements 453
SLA Summary Report 453
SLA Detailed Report 454
Reports for Availability 456
Application Availability Report 456
Incident Priority Report 457
Service Monitor Availability Report 460
Service Monitor Outages Report
60. SSH_2.0_SUN_SSH1.0
• SSH Server Version: The version of the SSH server that you want to monitor. This is the string immediately following the major and minor version numbers of SSH. In the following example, the SSH server version is SUN_SSH1.0:
SSH_2.0_SUN_SSH1.0
• Response Time: Enter the Warning and Critical Response Time thresholds for the overall time required to perform a service check. For more information, see Configuring Warning and Critical Thresholds on page 144.
3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
4. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
• Alert Profile settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
5. Click Finish.
SMTP Email Delivery
The SMTP monitor tests a mail server for the standard mail response header. If the mail server does n
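To show where the protocol and server version strings come from, the sketch below splits an SSH identification string into its parts. It assumes the standard SSH-protoversion-softwareversion layout with hyphen separators, which may differ from the separators printed in the example above; `parse_ssh_banner` is a hypothetical helper and not part of up.time.

```python
def parse_ssh_banner(banner: str) -> tuple:
    """Split an SSH identification string into (protocol, software, comments).

    Illustration only. Assumes the layout
    'SSH-<protoversion>-<softwareversion> <optional comments>'.
    """
    line = banner.strip()
    if not line.startswith("SSH-"):
        raise ValueError("not an SSH identification string")
    ident, _, comments = line.partition(" ")
    # Split on the first two hyphens only; the software version may contain more
    _, protocol, software = ident.split("-", 2)
    return protocol, software, comments


protocol, software, comments = parse_ssh_banner("SSH-2.0-Sun_SSH_1.0\r\n")
print(protocol, software)
```

In this layout, the software version is the part that follows the major and minor protocol version numbers, which corresponds to the SSH Server Version field.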
61. See The Platform Performance Gatherer on page 157 for more information.
By default, the Platform Performance Gatherer checks the host Elements' performance levels every 300 seconds. You can change the interval by manually inputting settings in the up.time Configuration panel, as outlined in Modifying up.time Config Panel Settings on page 529.
Changing the Performance Monitor Check Interval
You can modify the Platform Performance Gatherer check interval through the following parameter (the default value is shown):
performanceCheckInterval=300
A change to the Platform Performance Gatherer check interval is not retroactively applied to all Elements; only Elements added after an interval change will reflect that change.
Report Storage Options
When an up.time user generates a report, that report is stored in the GUI/reportcache directory; when a scheduled report is automatically generated and published, it is stored in the GUI/published directory. Both of these directory paths are found in the up.time installation directory:
• Linux: /usr/local/uptime/
• Solaris: /opt/uptime/
• Windows: C:\Program Files\uptime software\uptime
Windows Vista users can find the audit log in the Virtual Store instead of the default location (i.e., C:\Users\uptime\AppData\Local\VirtualStore\Program Files\<uptime install directory>).
By d
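As an illustration of how a setting such as performanceCheckInterval falls back to its default of 300 seconds, the following sketch reads the value from a block of settings text. The key=value syntax, the handling of # comment lines, and the function name `read_check_interval` are assumptions made for this example; they are not a description of how up.time parses its settings.

```python
DEFAULT_INTERVAL_SECONDS = 300  # default stated in the text


def read_check_interval(settings_text: str, default: int = DEFAULT_INTERVAL_SECONDS) -> int:
    """Return performanceCheckInterval from key=value text, or the default.

    Illustration only: the settings format is assumed, not documented.
    """
    for raw_line in settings_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, separator, value = line.partition("=")
        if separator and key.strip() == "performanceCheckInterval":
            interval = int(value.strip())
            if interval <= 0:
                raise ValueError("interval must be a positive number of seconds")
            return interval
    return default


print(read_check_interval("performanceCheckInterval=600"))
print(read_check_interval("# no override present"))
```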
62. Standard or Enterprise R2 with 32-bit execution
• Microsoft Windows Server 2008
• Microsoft Windows Server 2003 Standard or Enterprise R2
• Microsoft Windows 7
• Microsoft Windows Vista
• Microsoft Windows XP Professional
• Red Hat Enterprise Linux 4.7, 5.4, 6
• Solaris SPARC 10
• SUSE Linux Enterprise Server 11, 11.1
Note: SUSE Linux systems may require additional SSL libraries.
Supported Web Browsers
You can use the following Web browsers with up.time:
• Internet Explorer 7 or higher
• Firefox 3.6 or higher
• Chrome 10 or higher
Minimum Hardware Configuration
The hardware configurations for a Monitoring Station can change depending on the number of agents that you want to monitor, the reports that you want to generate, and the amount of data that is in the up.time DataStore. Contact uptime software Client Care if you are monitoring more than 50 nodes.
The following is the recommended minimum hardware:
• 2.4 GHz dual-core processor
• 2 GB of memory
• 80 GB of disk storage
• 100 Mbps network interface
up.time Agents
You can install and use up.time agents to collect data from a number of operating systems. Check the uptime software Client Care Web site for the most up-to-date list of supported platforms and architectures.
up.time can monitor Novell NetWare NRM version 6.5. Earlier versions of NR
63. Status of Regular Services columns. For example, if there are three services associated with the Application and their status is OK, then three green bars appear in this column.
Detailed View
Click the Show Detailed View button to change to the Detailed view of the View Applications subpanel, as illustrated below.
[Figure: Application Status subpanel in Detailed view, showing Application names, descriptions, and monitor information]
The name of the master Application group is in the far left column (for example, Databases in the image above). The names of the individual Applications are in the columns on the right (for example, PING mckay and UPTIME mckay in the image above). Master service monitors in an Application are marked with an asterisk.
The status of a service is denoted by a colored bar beside the name of the service: green for services that are functioning normally, yellow for services that are in a warning state, and red for services
64. The host name of the proxy server that the Web Application Transaction monitor uses to access the Internet.
• webmonitor.proxyPort: The port through which the Web Application Transaction monitor communicates with the proxy server.
• webmonitor.proxyUsername: The user name required to use the proxy server.
• webmonitor.proxyPassword: The password required to use the proxy server.
Remote Reporting Settings
If you are using a reporting instance (an up.time instance that only generates and serves reports), the remote reporting settings enable you to specify the location of the reporting instance and the port on which it is listening.
Modifying the Remote Reporting Server Settings
To configure the remote reporting instance used by up.time, do the following:
1. On the up.time tool bar, click Config.
2. In the Tree panel, click Remote Reporting.
3. In the sub panel, click Edit Configuration.
4. Ensure the Reporting Instance Enabled check box has been selected.
5. In the Remote Reporting Server field, enter the host name or IP address of the server on which the remote reporting instance is found.
6. Enter the port used to communicate with the server.
7. Click Save. The edit window closes, and you are returned to the Remote Reporting Instance Configuration panel.
8. To test the remote reporting server configuration, click Test Configuration. A pop-up window appears i
65. The password that is required to connect to the Net-SNMP instance.
• Authentication Method (optional): From the list, select one of the following options, which will determine how encrypted information travelling between the Net-SNMP instance and up.time will be authenticated:
  • MD5: A widely used method for creating digital signatures used to authenticate and verify the integrity of data.
  • SHA: A secure method of creating digital signatures. SHA is considered the successor of MD5 and is widely used with network and Internet data transfer protocols.
• Privacy Password: The password that will be used to encrypt information travelling between the Net-SNMP instance and up.time.
• Privacy Type (optional): From the list, select one of the following options that determine how information travelling between the Net-SNMP instance and up.time will be encrypted:
  • DES: An older method used to encrypt information.
  • AES: The successor to DES, which is used with a variety of software that requires encryption, including SSL servers.
You can set both the authentication and privacy types, only one of them, or neither.
10. If you selected Node in step 4, optionally select the following check boxes:
• Is Node Pingable: This option specifies whether up.time can contact the node using the ping utility. There are s
66. Time thresholds. For more information, see Response Time on page 145. To save the data from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside each of the Response Time metrics.
6. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
• Alert Profile settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
7. Click Finish.
Performance Check
The Performance Check monitor gathers the following metrics:
• the percentage of CPU time (user, system, waiting for I/O, or total) that is being used, averaged over the number of seconds that you specify
• the percentage of swap space that is available
• CPU usage reported by the ps utility, averaged over the number of minutes that you specify
• the number of network collisions per second, inbound errors per second, and outbound errors per second
• the number of network retransmits, averaged over the number of seconds that you specify
Configuring Performance
67. UID is called the Owner.
• GID: The ID of the group that has been consuming CPU time. On Windows systems, the GID is called the Group Name.
• Memory Used: The amount of memory, expressed as a percentage of total available memory, being consumed by a process. On Windows systems, Memory Used is called Virtual Bytes. The Memory Used value can be misleading because shared memory between processes is counted multiple times. For example, if five Oracle processes are each using 10% of available memory, this does not indicate that Oracle is consuming 50% of system memory.
• RSS: Resident Set Size, the amount of physical memory that is being used. On Windows systems, RSS is called the Working Set.
• CPU: The percentage of the CPU time used by the process, calculated by dividing total used CPU time by the process running time; if applicable, the result is further divided by the number of CPUs for the Element on which the process is running. On Windows systems, the CPU is called Processor Time.
• User Time: The amount of time, in seconds, that a particular user, group, or account has been using the CPU. This value is not displayed for Windows systems.
• User System Time: The amount of time, in seconds, that a process has been consuming system time on the CPU. This value is not displayed for Windows systems.
You can get a better indication of the amount of
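The CPU calculation described above (used CPU time divided by the process running time, then divided by the number of CPUs) can be written out as a small worked example. The function name `cpu_percent` and the sample figures are hypothetical; only the arithmetic comes from the description.

```python
def cpu_percent(cpu_time_seconds: float, running_time_seconds: float, cpu_count: int = 1) -> float:
    """Return the share of CPU time used by a process, as a percentage.

    Illustration of the arithmetic only: used CPU time divided by the
    process running time, further divided by the number of CPUs.
    """
    if running_time_seconds <= 0:
        raise ValueError("running time must be positive")
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return 100.0 * cpu_time_seconds / running_time_seconds / cpu_count


# A process that used 30 s of CPU time while running for 120 s on a 2-CPU Element
print(cpu_percent(30, 120, cpu_count=2))  # 12.5
```

Dividing by the CPU count keeps the figure on a 0 to 100 scale even when a multi-threaded process uses more than one CPU at once.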
68. YOU ARE AN INDIVIDUAL OVER THE AGE OF 18, AND THAT YOU HAVE READ THIS AGREEMENT, THAT YOU UNDERSTAND IT, AND THAT YOU ACCEPT AND AGREE TO BE BOUND BY ITS TERMS. IF YOU ARE UNWILLING TO BE BOUND BY THE TERMS OF THIS AGREEMENT, YOU SHOULD CLICK THE "I DO NOT ACCEPT" BUTTON BELOW, TERMINATE THE DOWNLOAD PROCESS, AND REFRAIN FROM ACCESSING OR USING THE SOFTWARE. THIS AGREEMENT REPRESENTS THE ENTIRE AGREEMENT BETWEEN YOU AND UPTIME CONCERNING THE SOFTWARE, AND THIS AGREEMENT SUPERSEDES AND REPLACES ANY PRIOR PROPOSAL, REPRESENTATION, COMMUNICATION, ADVERTISEMENT, OR UNDERSTANDING YOU MAY HAVE HAD WITH UPTIME RELATING TO THE SOFTWARE.
License
1.1 Grant of License. Uptime hereby grants to you, and you accept, a limited, non-exclusive license to use the Software in machine-readable, object code form only, and the user manuals accompanying the Software (the "Documentation"), only as authorized in this Agreement. For purposes of this Agreement, the "Software" includes any updates, enhancements, modifications, revisions, or additions to the Software made by Uptime and made available to end users through Uptime's web site. Notwithstanding the foregoing, Uptime shall be under no obligation to provide any updates, enhancements, modifications, revisions, or additions to the Software.
1.2 Scope of Use. You may use the Software, activated by a license key, on a single server designated by you as t
69. <agent name> is the name of the archive that contains the agent that you are installing. For example:
uptmagnt-AIX-<version>.tar
• Run the following command to install the agent as a Virtual I/O Server:
INSTALL.sh vio
3. Do the following to install the agent on the VIO:
• Log into the VIO as root.
• Copy the archive containing the agent to the LPAR.
• Extract the contents of the archive using the following command:
tar xvf <agent name>
Where <agent name> is the name of the archive that contains the agent that you are installing. For example:
uptmagnt-AIX-<version>.tar
• Run the following command to install the agent:
INSTALL.sh vio
CHAPTER 4: Getting Started
This chapter introduces you to the basic features of up.time in the following sections:
Accessing and Exiting up.time 48
Viewing System and Service Information 50
Searching and Filtering 57
Audit Logging 60
Accessing and Exiting up.time
Before logging into up.time, you will need a user name and password from your system administrator. Your system administrator will provide assistance if this is your first time logging into the application.
Setting Up the Administrator Account
T
70. Multi-System CPU Report
The Multi-System CPU report charts and compares the CPU performance statistics from multiple systems in your environment. These statistics indicate whether or not the systems are exhibiting balanced behavior, or if processes are being forced off CPUs in certain circumstances.
Creating a Multi-System CPU Report
To create a Multi-System CPU report, do the following:
1. In the Reports Tree panel, click Multi-System CPU.
2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22.
If you want the report to only include data from certain hours during the day, select those hours from the dropdown lists in the Daily Hours section, as shown below.
[Figure: Daily Hours section with Start and End dropdown lists]
For example, if you want the report to cover the hours from 1:00 a.m. to 1:00 p.m., select 1:00 from the Start dropdown list and 13:00 from the End dropdown list.
If you want to generate reports for systems in specific groups, select the groups from the List of Groups area.
To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views.
If you are generating reports for specific systems in your environment, select them from the List of Systems
71. and output from the service monitor. This corresponds to the template used for email alerts.
• Long Template: Contains the information in the medium template, as well as the status of the host.
4. Click Fill. The variables associated with the template appear in the subject and body fields.
[Figure: Custom Formatting Options dialog with the Medium Template selected; the subject and body fields contain variables for the service name, service state, notification type, date and time, host state, and output]
5. Add or remove variables (see Custom Alert Format Variables) as needed. You can also add other information to the body of the alert, such as paths to custom scripts or the names of alternative contacts.
6. Click Save.
Custom Alert Format Variables
The variables are the building blocks of a custom alert format. You can add or remove variables to suit your needs. These alert variables are also available as input parameter values when configuring an Action Profile to initiate a VMware vCenter Orchestrator workflow. The table below explains the variables available in custom alerts, as well as Orchestrator input parameters.
Variable: $DISPLAYNAME$
Definition: The name of the Element as it appears in the up.time Web interface. A system can have a different display name than the hostn
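The way variables act as building blocks in a custom alert format can be illustrated with a simple substitution sketch. It assumes that variables are written between dollar signs (for example, $SERVICENAME$), which is inferred from the variable table; the `render_alert` function, the sample template, and the sample values are hypothetical and are not the up.time implementation.

```python
import re

# Assumed variable syntax: an upper-case name between dollar signs
VARIABLE_PATTERN = re.compile(r"\$([A-Z]+)\$")


def render_alert(template: str, values: dict) -> str:
    """Replace each $NAME$ in the template with its value.

    Illustration only. Unknown variables are left untouched so that a
    typo in a template stays visible in the rendered alert.
    """
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))

    return VARIABLE_PATTERN.sub(substitute, template)


subject = "up.time Alert: $SERVICENAME$ is $SERVICESTATE$"
print(render_alert(subject, {"SERVICENAME": "PING-mailhost", "SERVICESTATE": "CRIT"}))
```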
72. • SMTP Bytes Received Per Second: The total number of bytes received per second by the Exchange SMTP server.
• SMTP Messages Sent Per Second: The maximum number of messages sent per second allowed by the SMTP server.
• SMTP Messages Received Per Second: The maximum number of messages received per second allowed by the SMTP server.
• SMTP Average Bytes Per Message: The average number of message bytes per inbound message received, indicating the size of messages received through an SMTP receive connector.
• SMTP Inbound Connections: The number of incoming connections that the SMTP server allows.
• SMTP Outbound Connections: The number of outbound connections that the server allows to all remote domains.
• Average Delivery Time: The average time, in milliseconds, between an Exchange server receiving a message from the client and an Exchange server delivering the message to an Inbox.
• Active Connections: The number of connections to the Exchange store that have shown activity in the last 10 minutes.
• Active Client Logons: The number of clients that performed any action within the last 10-minute time interval.
• Active User Count: The number of unique user connections that have logged on to the server and shown activity in the last 10-minute time interval.
• Current Webmail Users: The num
73. sys: The percentage of CPU time that is being used to carry out system processes. usr: The percentage of CPU time that is being used to carry out user processes. wio: The percentage of CPU time that could be handling processes but which is waiting for I/O operations to complete. If you want to generate reports for groups of systems, select the groups from the List of Groups area. To generate reports for one or more views, select the groups from the List of Views area. See Working with Views on page 108 for more information about views. If you are generating reports for specific systems in your environment, select them from the List of Systems. You should select more than one system. Select a report generation option. See Report Generation Options on page 402 for details. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. File System Capacity Growth Report: The File System Capacity Growth report illustrates the following: The used, available, percentage used, and total size of the file system at the beginning and end of the reporting period. The used, available, and total size metrics are measured in megabytes. The percentage by which the file
74. field. Generating a Disk I/O Bandwidth Report: To generate a Disk I/O Bandwidth report, do the following: 1. In the Reports Tree panel, click Disk I/O Bandwidth. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. If no data is available for the date range, the report displays a message indicating that there is no data for the time period. 3. To only include data from certain hours during the day, select those hours from the dropdown lists in the Daily Hours section. [Figure: Daily Hours section, "Include data samples between these hours only", with Start and End dropdown lists.] For example, if you want the report to cover the hours from 8:00 a.m. to 6:00 p.m., select 8:00 from the Start dropdown list and 18:00 from the End dropdown list. 4. In the Bytes per Block field, specify the size of input and output blocks, in bytes. The default is 512 bytes. Optionally, click Output in MB to display the I/O values in megabytes rather than blocks. 5. If you want to include or exclude certain disks, enter the following in the Exclude Disks and Exceptions fields: The name of the disk. A regular expression. See Using Regular Expressions on page 442 for more information. 6. If you want to include or exclude certain file systems
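The relationship between the Bytes per Block setting and the Output in MB option comes down to simple arithmetic, sketched below. This is an illustration under stated assumptions: `blocks_to_megabytes` is a hypothetical helper, and treating one megabyte as 1024 × 1024 bytes is an assumption, since the guide does not say which definition the report uses.

```python
BYTES_PER_MEGABYTE = 1024 * 1024  # assumption: binary megabytes


def blocks_to_megabytes(blocks, bytes_per_block=512):
    """Convert a count of I/O blocks to megabytes.

    The 512-byte default mirrors the default of the Bytes per Block field.
    """
    return blocks * bytes_per_block / BYTES_PER_MEGABYTE


# 2048 blocks of 512 bytes each is exactly one binary megabyte.
print(blocks_to_megabytes(2048))
```

If the disks being reported on use a different block size, pass it explicitly, for example `blocks_to_megabytes(256, bytes_per_block=4096)`.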
75. flushing operation occurs. This happens when someone issues a FLUSH TABLES statement or executes either the mysqladmin flush-tables or mysqladmin refresh commands. When the table cache fills up, the server locates a cache entry to release: tables that are not currently in use are released in least-recently-used order. If a new table needs to be opened, but the cache is full and no tables can be released, the cache is temporarily extended as necessary. When the cache is in a temporarily extended state and a table goes from a used to an unused state, the table is closed and released from the cache. QPSA: The average number of queries per second that must be exceeded before up.time generates an alert. Bytes Received: The number of bytes received by the server. Bytes Sent: The number of bytes sent by the server to all clients. Delayed Insert Threads: Select a comparison method for the Warning and Critical Thresholds. Then enter the number of delayed insert threads that must be exceeded before up.time sends an alert. The DELAYED option for the INSERT statement is a MySQL extension to standard SQL that you can use with clients that cannot wait for the INSERT statement to complete. When a client uses the INSERT DELAYED statement, the row is immediately queued to be inserted when the table is not in use by any other th
76. for in the response from the server. This monitor parses the text from the server and, using the threshold values you enter, determines if the entire Web page returned by the server is within acceptable parameters. For example, if a Web page is returned, then the monitor parses the entire page for the text that you input to match against. If you want to ensure that a particular page is returned, you could enter <TITLE>Expected Page</TITLE>, where Expected Page is the title of the Web page. The monitor generates an alert if this page is not matched. Authentication: The user ID and password, in the form userid:password. For example, jlamport:bluefrog5. Virtual Host: The unique domain name that resolves to the IP address of the domain that you want to monitor. A virtual host has its own domain name, but has the same IP address as other domain names hosted by the Web server. Server Response: Enter a string to match against the response from the server. For example, HTTP/1.1 200 OK or HTTP 404 File not found. Then set the Warning and Critical comparison methods. For more information, see Configuring Warning and Critical Thresholds on page 144. Follow Re-Direct Actions: Select an action that enables you to specify whether or not you want to be redirected to another Web address. OK: Return an OK status for any re dire
77. for more information. Alert Profile settings: see Alert Profiles on page 381 for more information. Action Profile settings: see Action Profiles on page 389 for more information. Click Finish. HTTP Web Services: The HTTP Web Services monitor simulates the steps that you take to access a Web site. Using this monitor, you can verify several things: you can access a Web site using HTTP; you can log on to a Web site; a Web site is running according to your expectations. You can determine this by examining the values that are returned from the Web server. The HTTP Web Services monitor relies on a Uniform Resource Identifier (URI), which defines a specific file location on a Web server. This monitor can test for application calls, database responses, or any other information that a URI can return. Configuring HTTP Web Services Monitors: To configure HTTP Web Services monitors, do the following: 1. In the HTTP Web Services monitor template, complete the monitor information fields. To learn about monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: URI: The URI of the Web page that you want to monitor. For example, login.php. Text to Look For (Optional): Enter the text that you want the monitor to search
78. for the agent, such as nobody, bin, or adm. However, using these accounts may pose security risks, depending on other system processes that run under these accounts. On HP-UX, you cannot start processes such as agents using the nobody user ID. Also, on Windows 2000 the agent must be running with Administrator privileges. If it is not, the agent will not be able to access the system performance counters. Installation Requirements: This section describes the system requirements for the up.time Monitoring Station and up.time Agents. Before installation, it is recommended that you check the uptime software Web site (http://www.uptimesoftware.com) for the most up-to-date list of hardware and software requirements. up.time Monitoring Station: The up.time Monitoring Station is a computer running the core up.time software that retrieves information from client systems, either through agents installed on the system or by monitoring services running on the system. The Monitoring Station has a self-contained Web server and database that enables easy access to the application and data. The Monitoring Station can run on the operating systems listed below. You should refer to the uptime software Client Care Web site for the most up-to-date list of supported platforms. Operating System / Version(s): Microsoft Windows Server 2008
79. host that you want to monitor. For internal hosts, you can use the ipconfig command from the command line. The ipconfig command returns information similar to the following: Connection-specific DNS Suffix: uptimesoftware.com; IP Address: [unreadable]; Subnet Mask: 255.255.255.0; Default Gateway: 10.1.1.1. For external hosts, you can use the nslookup command from the command line, as follows: nslookup <host name>. The nslookup command returns information about the host similar to the following: Server: filter.uptimesoftware.com; Address: 10.1.1.100; Name: uptimesoftware.com; Addresses: 217.160.226.70, 10.1.1.95, 192.168.23.17, 192.168.190.1. Configuring DNS Monitors: To configure DNS monitors, do the following: 1. In the DNS monitor template, complete the monitor information fields. To learn about monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: Hostname to Lookup: The host name that the monitor will check. The host name can be a Web site address, a server name, or a cluster name. For example, for a Web site, enter www.uptimesoftware.com in this field. Port: The number of the port on which the DNS server is listening. The default is 53. Network Service Moni
80. in the monitor parameters. The default OID that you specify should be the Enterprise identification string. Net-SNMP: The up.time SNMP monitor also supports Net-SNMP, which is a suite of command line and graphical applications that do the following: request information from SNMP agents; set information on SNMP agents; generate and handle SNMP traps. To take advantage of the Net-SNMP features, you must: Install and configure the Net-SNMP application suite on your server. Visit http://net-snmp.sourceforge.net for more information. Have a Net-SNMP agent already installed on the host or hosts that you want to monitor. The Net-SNMP HOST-RESOURCES-MIB, used to gather performance statistics from a host, must also be enabled. See the Net-SNMP documentation for details. Add a Net-SNMP entity to up.time. For more information, see Adding Systems or Network Devices on page 69. SNMP MIB Browser: The SNMP monitor uses the SNMP MIB Browser to load OIDs from MIBs on your system or on a server. The first step in setting up an SNMP monitor is to use the up.time SNMP MIB Browser applet to load MIBs into up.time and select managed objects. Supported Versions of SNMP: The up.time SNMP monitor works with the following versions of SNMP: v2: The second implementation of the S
81. is being accomplished and of which service monitors it consists. Add a service monitor that will be associated with the SLO by first selecting its host from the dropdown list, then adding the service monitor. Continue to add service monitors to the SLO as required. Click Save. Associating Alert and Action Profiles to an SLA: To associate Alert Profiles and Action Profiles with an SLA, do the following: 1. In the My Infrastructure panel, click the name of the Service Level Agreement that you want to edit. The Service Level Agreement General Information subpanel appears. Associate Alert Profiles with the SLA by clicking Edit Alert Profiles. In the Alert Profile Selector pop-up window, select one or more of the Available Alert Profiles from the list, then click Save. If required, associate Action Profiles with the SLA by clicking Edit Action Profiles. In the Action Profile Selector pop-up window, select one or more of the Available Action Profiles from the list, then click Save. Editing SLA and SLO Definitions: To edit a service level agreement, do the following: 1. In the My Infrastructure panel, right-click the name of the Service Level Agreement that you want to modify, then click Edit. The Edit Service Level Agreement window appears. Edit the SLA as described in the previous section. See Adding a Service Level Agreement
82. maximum and that there are a large number of connections that are waiting in the pool. From there, you can then adjust the size of the connection pool to allow more connections. Or, if a WebLogic application is using a large amount of memory, you could check the JVM charts section of the report. [Charts: WebLogic Server JVM Heap Size (MedRecServer); WebLogic Server JVM Free Memory (MedRecServer).] If there are increases or sudden spikes in the heap size or memory usage of the JVM, then you can tune the JVM to ensure that it is working at optimal levels. Reports for Virtual Environments: The following reports enable you to visualize the performance of systems that are consolidated on virtual machines, whether using VMware or IBM pSeries Logi
83. more services. 3 (Unknown): This status is returned when: the host on which the service sits is offline; the host on which the service sits is in a scheduled maintenance or downtime period; the Monitoring Station could not execute the service monitor. Each status reflects the state of the service that has been assigned to the system that you are currently viewing. up.time picks up these error codes and triggers an alert or an action. If a service is in a warning or critical state, you can acknowledge an alert so that up.time does not generate subsequent notifications. The status of the services associated with a system is displayed in the Global Scan panel. [Figure: Global Scan panel, with columns Name and Service Status (WARN, CRIT, MAINT, UNKN, ACK).] The figures in each column in the Global Scan panel indicate the number of services for that particular machine that are in each state. Click a number to view the System Status screen for a particular system. See Viewing the Status of a System on page 489 for more information. Understanding Dates and Times: When you are configuring graphs or reports, you must specify a range of dates and times over which the graph or report will chart information. up.time will only collect information for the periods that you specify. You specify date and time ranges in the Dat
84. of resources, applications, and data for their user community. up.time Architecture: up.time consists of a Monitoring Station that retrieves information from client systems, either through software (i.e., an agent) that is installed on a system or by monitoring services running on a system. The following diagram illustrates the general architecture of up.time. [Diagram: general architecture of up.time, showing system administrators, the Monitoring Station, and monitored resources such as Web servers (WebLogic, WebSphere, Apache), virtualization (VMware, AIX LPARs, Solaris Zones), network services, databases (Oracle, SQL Server, MySQL, Sybase), business applications, and storage.] up.time Service Monitoring Concepts: Before you start using up.time, you should first understand the underlying service monitoring concepts. Monitors: The service monitor templates that are bundled with up.time. You use these templates to configure a service check. Alert Profiles: Templates that tell up.time exactly how to react to various alerts (issuing alert notifications and performing recovery options) generated by your
85. on page 371 for information. Since SLA reporting and monitoring is based on weekly or monthly compliance periods, changing any of the following on an existing SLA affects the reported SLA status and generated reports: Monitoring Period; target percentage; compliance period type. Any changes made are immediately reflected in any SLA reporting. To edit a service level objective, do the following: 1. In the My Infrastructure panel, click the name of the Service Level Agreement that you want to modify, then click Edit. The Service Level Agreement General Information subpanel appears. Click the SLO's corresponding Edit icon. Edit the SLO as described in the previous sections. See Adding Service Level Objectives to an SLA on page 373 for information. Since SLA reporting and monitoring is based on weekly or monthly compliance periods, changing the service monitors that make up an SLO definition will affect the reported SLA status and generated reports. Any changes made are immediately reflected in any SLA reporting. Chapter 17: Alerts and Actions. This chapter covers up.time's al
86. on page 381 for more information. Action Profile settings: see Action Profiles on page 389 for more information. 5. Click Finish. Live Splunk Listener: Live Splunks are scheduled searches of Splunk queries that are saved on the Splunk server. A Live Splunk automatically runs a search, can initiate an alert, and can perform actions based on that alert. You can, for example, set up a Live Splunk to search for all critical error conditions. The Live Splunk Listener monitor enables you to capture the information generated by a Live Splunk. This monitor is very similar to the External Check monitor (see page 328) and uses scripts that are bundled with up.time (found in the scripts subdirectory) to return Live Splunk information to the Monitoring Station. The version of Splunk you are using with up.time determines which script or scripts you will need to modify: for Splunk v2, you need to edit and use liveSplunkHandler_v2.py; for Splunk v3 and v4, you need to edit both alertUptimeStatusHandler.sh and alertUptime.py. The script or pair of scripts take the following options: message: the message that will be returned to the up.time Monitoring Station. For example, if the Live Splunk is configured to search for warning conditions, you can enter the message Changed to WARN status
87. 3xx: the command is OK so far; send the rest of it. 4xx: the command was correct, but could not be carried out. 5xx: the command is not implemented, or it is incorrect, or a serious program error has occurred. Response Category: The next digit in the status response code indicates the function response category: x0x: connection, setup, and miscellaneous messages; x1x: newsgroup selection; x2x: article selection; x3x: distribution functions; x4x: posting; x8x: nonstandard extensions; x9x: debugging output. Response Codes: The following is a list of general response codes that may be sent by an NNTP server. These are not specific to any one command, but may be returned as the result of a connection, a failure, or an unusual condition: 100: help text; 190 through 199: debugging output; 200: the server is ready and posting is allowed; 201: the server is ready, but no posting is allowed; 400: service has been discontinued; 500: the command is not recognized; 501: a command syntax error occurred; 502: an access restriction or permission is denied; 503: a program fault occurred and the command was not executed. You can ignore 1xx codes. Code 200 or 201 is sent upon initial connection to the NNTP server, depending upon th
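The two-digit scheme described above can be sketched as a small lookup. This is a minimal illustration, not part of up.time: `describe_nntp_code` is a hypothetical helper, and the 1xx and 2xx meanings are taken from RFC 977 rather than from this excerpt, which begins at 3xx.

```python
# First digit: overall status of the command.
# The 1xx and 2xx entries come from RFC 977, not from the excerpt above.
STATUS = {
    "1": "informative message",
    "2": "command OK",
    "3": "command OK so far; send the rest of it",
    "4": "command was correct but could not be carried out",
    "5": "command not implemented, incorrect, or a serious program error",
}

# Second digit: function response category.
CATEGORY = {
    "0": "connection, setup, and miscellaneous messages",
    "1": "newsgroup selection",
    "2": "article selection",
    "3": "distribution functions",
    "4": "posting",
    "8": "nonstandard extensions",
    "9": "debugging output",
}


def describe_nntp_code(code):
    """Return (status, category) descriptions for a three-digit NNTP code."""
    text = str(code)
    if len(text) != 3 or not text.isdigit():
        raise ValueError("NNTP response codes have exactly three digits")
    return STATUS.get(text[0], "unknown"), CATEGORY.get(text[1], "unknown")


print(describe_nntp_code(201))
print(describe_nntp_code(503))
```

A monitor built on this idea would typically treat a leading 2 as healthy and a leading 4 or 5 as a failure, which matches the way the general response codes above are grouped.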
88. process: enter the basic parameters for the report; select the retained values on which you want to report. Creating Service Monitor Metrics Reports: To create a Service Monitor Metrics report, do the following: 1. In the Reports Tree panel, click Service Monitor Metrics. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. 3. If you want to generate reports for systems in specific groups, select the groups from the List of Groups area. 4. To generate reports for one or more views, select the groups from the List of Views area. See Working with Views on page 108 for more information about views. 5. If you are generating reports for specific systems in your environment, select them from the List of Entities. 6. Click Go to page 2. A table containing the current retained service metrics appears in the Service Metrics subpanel. 7. Click the checkboxes in the Select column to select the variables on which you want to report. [Figure: Current Retained Service Metrics table, with columns Host, Instance Name, Instance Description, Select, Variable, and Units; the example rows list WebSphere Connection Pool variables such as Average Time (ms), Closed, Created, Percent Used, and Size.] 8. Optionally s
89. process that you want this monitor to check. This monitor uses the ps utility on UNIX to collect information about active processes. For example, to check the status of the email process, enter sendmail in this field. Enter values, expressed as percentages, in the Process Warning Threshold and Process Critical Threshold fields. Enter the time period, in minutes, at which up.time will check the process in the Process Check Time Interval field. 5. In the Network Check area, do the following: Select one of the following options from the Network Value dropdown list: Collisions: The simultaneous presence of signals from two nodes on the network, which can occur when two nodes start transmitting over a network at the same time. During a collision, both packets involved in a collision are broken into fragments and must be retransmitted. In Errors: Data packets that were received but could not be decoded because either their headers or trailers were not available. Out Errors: Data packets that could not be sent due to problems formatting the packets for transmission or transmitting the packets. Enter values, expressed as percentages, in the Network Warning Threshold and Network Critical Threshold fields. 6. In the Network Retransmit Check section, complete the following fields: Network Retransmits Warning Threshold: The number of retransmits per second that
90. report is scheduled; the date and time on which the report will be run. [Figure: Pending Reports section, with columns Report Name, Report Description, and Next Run Time; example entries include ReportEnterpriseCPUUtilization, ReportCPUUtilizationSummary, and ReportFileSystemServiceTime.] Running Reports: Reports that are being run. This section contains the same information as the Pending Reports section. [Figure: Running Reports section, with columns Report Name, Report Description, Scheduled, and Next Run Time; the example entry is ReportResourceUsage.] If the running report is not a scheduled report, "Emailing report in PDF format" appears in the Report Name column. Completed Reports: Reports that have finished running, whether they were successfully generated or not. This section contains the following information: the name of the report; the date and time on which the report run was started; the date and time on which the report run ended; the status of the report (for example, finished); a status message (for example, Email sent or Address list is empty). The following image illustrates the Completed Reports section. [Figure: Completed Reports section, with a Remove Completed Reports option and columns Report Name, Started, Ended, Status, and Status Message.] ReportSla
91. requests to the IIS service. IIS 6.0 treats both an anonymous and a non-anonymous user request as a new user. Current Connections: The number of active connections to the IIS server. Connection Attempts/Sec: The number of connection attempts that have been made per second since the IIS server was started. Logon Attempts/Sec: The number of attempts per second that are being made to log on to the server. Get Requests/Sec: The rate, per second, at which HTTP requests using the GET method have been made to the server. Post Requests/Sec: The rate, per second, at which HTTP requests using the POST method have been made to the server. CGI Requests/Sec: The rate, per second, at which the server is processing simultaneous CGI (Common Gateway Interface) requests. ISAPI Requests/Sec: The rate, per second, at which the server is processing ISAPI extension requests. ISAPI enables programmers to develop Web applications that are tightly integrated with IIS. ISAPI can also provide security functions to Windows servers and database connections through IIS. Not Found Errors/Sec: The maximum number of 404 (file not found) errors, indicating that the requested document cannot be found on the server, that can occur each second. Response Time: Enter the Warning and Critica
92. rpm. Where <agent name> is the name of the rpm file for the agent that you are installing (e.g., UptimeAgent Linux <version>.rpm). If you are using SuSE Linux Enterprise Server 9, you must update the kernel to the latest version using the YaST package manager. If you do not upgrade the kernel, the agent will not be able to gather workload data. 3. If AIX is running on the LPAR, do the following: Log into the LPAR as root. Copy the archive containing the agent to the LPAR. Extract the contents of the archive using the following command: tar xvf <agent name>. Where <agent name> is the name of the archive that contains the agent that you are installing (e.g., uptmagnt AIX <version>.tar). Run the following command to install the agent: INSTALL.sh. If you are using an HMC, do not install the agent as a Virtual I/O Server by using the vio attribute with the install command. Doing so may lead to conflicts with HMC-managed systems and can result in incorrect performance statistics. 4. Do the following to install the agent on the VIO: Log into the VIO as root. Run the following command: oem_setup_env. Copy the archive containing the agent to the LPAR. Extract the contents of the archive using the following command: tar xvf <agent name>. Where <agent n
93. service checks. Host Checks: Service checks that you select and assign to each host that is being monitored to test if it is functioning properly. Service checks are temporarily disabled if up.time determines that a host is undergoing scheduled maintenance. Monitoring Periods: Specific windows during which you want to have up.time generate and send alert notifications. For example, you can specify that alerts only be sent between 9 a.m. and 5 p.m. on weekdays. Monitor Escalations: The exact definitions of when and how up.time should escalate service alerts if they have not been acknowledged by specific users within pre-defined time limits. Service Groups: Service monitor templates that enable you to apply a common service check to one or multiple hosts (servers, network devices) that you are monitoring. Chapter 2: Understanding up.time. This chapter explains underlying concepts in the following sections: Understanding the up.time Interface; Understanding Reports and Graphs; Understanding Agents; Understanding the up.time DataStore; Understanding Service Monitors; Understanding Services; Understanding the Status of Services
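The Monitoring Periods concept described in the snippet above (for example, alerts only between 9 a.m. and 5 p.m. on weekdays) can be sketched as a simple time-window test. This is an illustration only and says nothing about how up.time evaluates periods internally; `in_monitoring_period` and its parameters are hypothetical, and treating the end time as exclusive is an assumption.

```python
from datetime import datetime, time


def in_monitoring_period(moment, start=time(9, 0), end=time(17, 0),
                         weekdays=(0, 1, 2, 3, 4)):
    """Return True if `moment` falls inside the alerting window.

    weekdays uses datetime.weekday() numbering: Monday is 0, Sunday is 6.
    The window includes `start` and excludes `end` (an assumption).
    """
    return moment.weekday() in weekdays and start <= moment.time() < end


# Thursday, April 17, 2008 at 10:00 is inside the window.
print(in_monitoring_period(datetime(2008, 4, 17, 10, 0)))
# Saturday, April 19, 2008 at 10:00 is outside it.
print(in_monitoring_period(datetime(2008, 4, 19, 10, 0)))
```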
94. services. This report allows you to pinpoint when specific services experienced outages, assisting with further investigation. The report answers the following questions: Were there any outages yesterday? If so, how long were they, and on which systems did they happen? Which business users were affected by service outages? What kinds of transaction volumes are we processing? What are the most important things we can fix in order to meet our SLA targets? For more information on SLA definitions, see Working with Service Level Agreements on page 357. Creating an SLA Detailed Report: To create an SLA Detailed report: 1. In the Reports Tree panel, click SLA Detailed. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. Select a Compliance Period to report on. Clear the Display Outage Tables checkbox if you want the report to display only outage graphs. If you want to generate reports for one or more groups that include SLAs, select the groups from the List of Groups area. To generate reports for one or more views, select the groups from the List of Views area. See Working with Views on page 108 for more information about views. If you are generating reports for specific Service Level Agreements
95. system has changed over the reporting period, charting the following: used space, available space, percentage used, and total size of the file system. On Windows servers with a single disk, up.time looks at the capacity of the main partition (usually the C: drive). If the Windows server has multiple disks, this report collects information for all of the disks. On UNIX and Linux servers, up.time looks at individual file systems (for example, /var, /export, or /usr) on all the disks in the system. This report ignores floppy drives, tape drives, and CD-ROM drives. Creating a File System Capacity Growth Report: To create a File System Capacity Growth report, do the following: 1. In the Reports Tree panel, click File System Capacity Growth. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. If no data is available for the date range, the report displays a message indicating that there is no data for the time period. 3. Optionally, in the Exclude file system names like field, enter either the name of a file system or a regular expression that up.time will use to ignore certain file systems when generating the report. For example, if you want to exclude the boot file system from the report, enter boot in the field. Optionally, enter a value in the Exclude fil
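The effect of the Exclude file system names like field can be sketched with a regular-expression filter. This is a minimal illustration under stated assumptions: `filter_file_systems` is a hypothetical helper, the sample file system names are made up, and matching the pattern anywhere in the name (rather than against the whole name) is an assumption about how up.time applies it.

```python
import re


def filter_file_systems(names, exclude_pattern):
    """Return the names that do NOT match the exclusion pattern.

    Uses re.search, so the pattern may match anywhere in the name
    (an assumption; the product may anchor the match differently).
    """
    pattern = re.compile(exclude_pattern)
    return [name for name in names if not pattern.search(name)]


# Hypothetical file systems on a Linux host.
file_systems = ["/", "/boot", "/var", "/usr", "/export"]
print(filter_file_systems(file_systems, "boot"))
print(filter_file_systems(file_systems, r"^/(var|usr)$"))
```

A plain name such as `boot` and an anchored expression such as `^/(var|usr)$` both work with the same function, which mirrors the field accepting either a file system name or a regular expression.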
96. tablespaces, files, users, rollback segments, constraints, synonyms. A hit ratio approaching 100% is ideal. Library Cache Hits Ratio: Enter the Warning and Critical thresholds for the rate at which library cache pin misses occur. A pin miss occurs when a session executes a statement that has already been parsed, but which is no longer in the shared pool. Redo Log Space Request Ratio: Enter the Warning and Critical thresholds for the number of redo log space requests per minute that have been made since the server was started. Disk Sort Rate: Enter the Warning and Critical thresholds for the rate of Oracle sorts that are too large to be completed in memory and which are sorted using a temporary segment. Active Sessions: Enter the Warning and Critical thresholds for the number of active sessions, based on the value of V$PARAMETER.PROCESSES in the file init.ora. Oracle Blocking Sessions: Enter the Warning and Critical thresholds for the number of sessions that are preventing other sessions from committing changes to the Oracle database. Oracle Idle Sessions: Enter the Warning and Critical thresholds for the number of Oracle sessions that are idle, as determined by the Time Idle value that you specify. Only the sessions that have been idle for the duration measured by the Time Idle value, in seconds, are considered idle. Respon
97. that are in a critical state. The name of each Application is a hyperlink. Click a link to view detailed information about an Application. For details about the Application information that is displayed, see Viewing System and Service Information on page 50. Viewing All Elements: Elements are the systems, network devices, Applications, and SLAs that up.time is currently monitoring. In the Global Scan panel, you can view the status of all monitored Elements in the All Elements subpanel. This can be accessed by clicking the View All Elements tab. The All Elements subpanel is the default view in the Global Scan panel. The following image illustrates the View All Elements subpanel: [Image: View All Elements subpanel, listing each monitored Element with Service Status, Outages, CPU, Disk, and Memory columns] The View All Elements
98. the group in the Group Name field. Optionally, enter a description of the group in the Group Description field. To make this group a subgroup, select the name of the existing group to which it will be subordinate in the Parent Groups list, then click Add. If this is the first group that you have defined, only My Infrastructure will appear in the dropdown list. 5. To give this group its own subgroups, select one or more entries from the Available Groups list, then click Add. 6. Select the Elements that you want to add to this group from the Available Elements list, then click Add. 7. Select one or more sets of users who can view this group from the Available User Groups list, then click Add. 8. Click Save. Adding Nested Groups: You can also create nested groups. Nested groups enable you to further group your systems. For example, you can create a parent group called Datacenters and then add two nested groups called Production and Disaster Recovery. You can assign the following to nested groups: groups of Elements; individual Elements; the up.time user groups that are allowed to view the systems or Elements in a group. Note that you cannot assign a parent group to a subgroup or to any other ancestor. Before you begin, ensure that you have at least one parent group defined. For
99. the system that will host the Monitoring Station. For the Windows installer, extract the contents of the archive using a utility like WinZip. 2. From the distribution CD: If you are installing up.time from the distribution CD, do the following: insert the CD in the CD-ROM drive; if you are installing up.time on Solaris or Linux, mount the CD-ROM drive if you are not using automount; change to the following directory on the CD: up time MonitoringStation. 3. Imported as a VMware Virtual Appliance: If you are installing up.time as an appliance on an ESX server, you can download the package from the uptime software web site, either directly or through the VMware Virtual Appliance Marketplace. Unarchive the Virtual Appliance package and note its location; you will need to locate the .ovf file during the import procedure. Once preparations have been made, refer to the procedures in Installing the Monitoring Station on Windows on page 30, Installing the Monitoring Station on Solaris or Linux on page 32, or Installing the Monitoring Station as a Virtual Appliance on page 35 for details on completing the installation for your platform. Installing the Monitoring Station on Windows: To install the up.time Monitoring Station on Windows, do the following: 1. If you are upgrading, ensure you have logged out of the up.time Web ap
100. Working with Systems: Systems are the network devices that you will monitor using up.time. You can add the following types of systems: Agent: A system that has an up.time agent installed on it. In the Global Scan and My Infrastructure panels, agent systems are denoted by this icon: [icon]. Node: A device without an agent, but with which up.time can communicate using an IP address. In the Global Scan and My Infrastructure panels, nodes are denoted by this icon: [icon]. Novell NRM: A system that is running version 6.5 of Novell Remote Manager (NRM), a Web-based interface to newer Novell NetWare servers. Novell NRM saves server statistics in an XML file; up.time can retrieve the XML file, parse it, and then store the information in the DataStore. Net-SNMP v2 or Net-SNMP v3: Systems that use version 2 of the Net-SNMP protocol, or systems that use version 3 of the Net-SNMP protocol, to monitor and manage systems in a network that uses TCP/IP. Net-SNMP version 3 adds security features that are lacking in Net-SNMP version 2. All of the data gathered from Net-SNMP is based on the following MIB implementations: RFC 1213 (Management Information Base for Network Management of TCP/IP-based internets): Presents network interface information. UCD-SNMP-MIB: Presents general system state information. Host Resources MIB (RFC 2790): Presents system performance data.
101. Topological Dependencies: In large deployments, a single system or node can act as the gateway to other entities or entity groups. For example, up.time might need to go through a router (configured as a node in up.time) to monitor one or more systems that are behind the node. This situation is illustrated below: [Image: the Monitoring Station reaching the systems being monitored through a router] If the router fails, then up.time generates alerts for the systems behind the router, because the service monitors cannot communicate with those systems. Topological dependencies create parent-child relationships between systems. Both entities and entity groups can be dependent on a parent system or node. A service monitor can determine that systems which are dependent on a specific system or node that is experiencing a problem will be unavailable until the problem is resolved. Alerts will not be generated. However, the checks for the dependent systems will continue to be scheduled. If a topological parent is down, a descriptive message appears in the Global Scan panel for entities and services that are children of that parent. Adding Topological Dependencies: To add topological dependencies, do the following: 1. On the up.time tool bar, click Services. 2. In the Tree panel, click Add Topolo
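The parent-child behaviour described above (no alerts for children while a topological parent is down) can be illustrated with a short sketch. This is not up.time's implementation: the function names, the dictionary-based parent map, and the element names are all hypothetical.

```python
def has_down_ancestor(element, parents, down):
    """Return True if any topological ancestor of `element` is in `down`.

    `parents` maps each element to its parent (or None for a root).
    The `seen` set guards against accidental cycles in the map.
    """
    seen = set()
    current = parents.get(element)
    while current is not None and current not in seen:
        if current in down:
            return True
        seen.add(current)
        current = parents.get(current)
    return False


def should_alert(element, parents, down):
    """Alert only for a failed element whose ancestors are all reachable."""
    return element in down and not has_down_ancestor(element, parents, down)


# Hypothetical topology: two systems sit behind one router.
parents = {"web01": "router", "db01": "router", "router": None}
down = {"router", "web01", "db01"}
for name in sorted(down):
    print(name, should_alert(name, parents, down))
```

In this sketch only the router produces an alert; the two systems behind it are treated as unavailable because of their parent, which mirrors the behaviour the section describes.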
102. Reports for Availability: A service is considered repaired or being repaired when its status changes from critical to one of MAINT, UNKNOWN, WARNING, or OK. For all report elements, a service monitor is considered to have reached a critical state (and thus has caused an incident, is contributing to downtime, or is an ongoing failure) when it actually generates an alert. The period preceding the alert, during which rechecks are intermittently being performed to avoid a false positive, does not count. See Understanding the Alert Flow on page 379 for information on rechecks leading to a generated alert. Creating an Efficiency Report: To create an Efficiency report, do the following: 1. In the Reports Tree panel, click Efficiency. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. Service monitors that, based on the selected time range, are already in a critical state will be included in calculations for downtime, incident counts, and other report elements. In the Report Options area, select the charts you want included in the report. In the Report Options section, select the level of granularity at which the information will be presented (i.e., daily, weekly, or monthly). If you want to generate reports for groups of systems, select the groups from the List of Gro
103. Disk Performance Statistics Graph: The Disk Performance Statistics graph charts a set of disk performance metrics returned by utilities (such as perfmon on Windows, and iostat or sar on Solaris) that are running on a system. Requests can experience delays proportional to the length of the request queue minus the number of spindles on the disks. For optimal performance, this difference should be less than two on average. Generating a Disk Performance Statistics Graph: To generate a Disk Performance Statistics graph, do the following: 1. In the Global Scan or My Infrastructure panel, click the name of the system whose information you want to graph. 2. In the Tree panel, click the Graphing tab. 3. Click Disk Performance Statistics. 4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22. 5. Select one of the following options: Percent Busy: The percentage of the disk capacity that is being used. For NFS systems, 100% busy does not indicate that the server itself is saturated, but that the client always has outstanding requests to that server. Average Queue: The average number of processes that are waiting to access the disk. The length of the queue is affected by how busy the system is and the amount of time that each transaction requires to perform a disk
104. up.time user, all of these will need to be created: define an SLA and its objectives; use the SLA Detailed report to identify and resolve outages or underperforming Elements; use the SLA Summary report to develop a baseline. Setting Up and Gathering Data for Monitors: Determine which service monitors will best reflect the end-user experience, based on the aspect of your infrastructure that your SLA will cover. See SLAs, Service Monitors, and SLOs on page 359 for some sample SLAs and objectives. up.time users who do not have existing service monitors should create them and allow them to accumulate data for at least one week. Having historical data is essential to determining what level of service you should target. Identifying Outages and Improvable Performance: When added to an SLA, service monitors that have been collecting data will immediately contribute to the SLA's reported status. For example, if all of an SLA's service monitors have a year's worth of historical data, creating a trial SLA will allow you to see how it would have performed over that last year. Having this historical data in SLA reports helps you analyze each component service monitor in the context of the SLA. Consider a sample SLA called System Performance that is meant to ensure your application servers are not experiencing excessive loads; this can be indicated by CPU usage and disk space. The first service level objective is
105. were logged within the last two hours. Splunk Query: You can enter any Splunk query string in this field. For more information on the syntax of Splunk queries, see the Splunk user manual. Result count of splunk query: Enables up.time to alert you when the number of results that match your Splunk query reaches or exceeds the defined warning and critical thresholds. For example, you can configure the monitor to issue a Warning alert when five or more results matching your query are returned, and a Critical alert when 10 or more results for your query are returned. [Image: Result count of splunk query thresholds, with Warning set to "is greater than or equal to" 5 results and Critical set to "is greater than or equal to" 10 results] Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. To save the data from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside any of the options listed in step 2. 4. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information); Alert Settings (see Monitor Alert Settings on page 148 for more information); Monitoring Period settings (see Monitor Timing Settings on page 146 for more information); Alert Profile settings (see Alert Profiles
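The Warning and Critical example above ("greater than or equal to" 5 and 10 results) amounts to a two-level threshold check. The sketch below illustrates that logic only; the function name and the returned status strings are assumptions, not part of up.time.

```python
def classify_result_count(count, warning=5, critical=10):
    """Map a result count to a status using 'greater than or equal to'
    comparisons, checking the Critical threshold first so that a count
    meeting both thresholds is reported as Critical."""
    if count >= critical:
        return "CRIT"
    if count >= warning:
        return "WARN"
    return "OK"


for count in (4, 5, 9, 10):
    print(count, classify_result_count(count))
```

Checking the Critical threshold before the Warning threshold matters: with the order reversed, a count of 10 would be reported as a Warning.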
106. work a process has done by dividing this amount by a sample of time (for example, five minutes). Start Time: The time at which the process started. This can be used to determine the lifetime of a process. The process information for the current date and time is displayed in the Graphing subpanel. Generating Detailed Process Information: To display detailed process information, do the following: 1. In the Global Scan or My Infrastructure panel, click the name of the system whose information you want to graph. 2. In the Tree panel, click the Graphing tab. 3. Click Detailed Process Information. Displaying Detailed Process Information: Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22. Click Display Process Information. A window containing a chart that lists the process information for the time period that you specified appears. The following image illustrates process information for a Solaris system: [Image: Process Information window listing the processes in the latest sample for the selected date and time]
107. you want your script or program to return; you use forward slashes when specifying directory paths in your scripts, regardless of the operating system (e.g., C:/ on Windows, or /opt on Solaris or Linux). Many of the fields that you use to define an advanced monitor are the same as those used with agent and agentless monitors. You can find more information about those fields in the following sections: To learn how to access the custom monitor definition window, see Using Agentless Monitors on page 138. For a description of monitor identification information fields, see Monitor Identification on page 141. For a description of monitor timing settings, see Monitor Timing Settings on page 146. For a description of alert settings, see Monitor Alert Settings on page 148. For a description of Alert Profiles, see Alert Profiles on page 381. For a description of Action Profiles, see Action Profiles on page 389. Custom Monitors: A Custom monitor runs a script that captures information which is related to a situation that may be unique to your environment. When the script is run, the system being monitored returns a single line of information to standard output (stdout). The script reads stdout, which may contain an error or return value. This error or return value is then dis
108. 0 0ms average round trip time. Action Profiles: Action Profiles are templates that direct how up.time responds when it encounters a problem on a monitored system. You can associate an Action Profile with any Service Monitor, Application, or SLA so that it is applied if their state changes from OK to Warning or Critical. Action Profiles are normally associated with any of these monitored Elements at the time of their configuration. Action Profile associations can also be changed when you are modifying existing service monitor definitions. See Chapter 8: Using Service Monitors, Working with Applications on page 101, and Adding and Editing SLA Definitions on page 371 for more information about configuring Service Monitors, Applications, and SLAs, respectively. Actions include the following tasks: write an entry to a log file; run a recovery script that can reboot a non-responsive server or restart an application process or service; stop, start, or restart a Windows server; initiate a VMware vCenter Orchestrator workflow; send an SNMP trap to a specific traphost and trap community. As templates, Action Profiles can be reused for any number of Service Monitor configurations. This means you can create a series of them as standard actions used to respond to typical types of problems you may encounter, depending on what role a Service Monitor is playing, e.g., availability or performan
109. [Image: graph with Total and Sys series] In any Java-enabled Web browser on any operating system (for example, in Firefox on Linux), the graph is generated using a Java graphing applet, as shown below: [Image: Disk Performance graph plotting percent time busy] You can click any line in the graph, or any item in either axis, to zoom in on a particular time period or value. Press the R key on your keyboard to return to the original view. You can modify ActiveX graphs after they have been generated. You cannot modify Java graphs. Graphing Tool: After you generate an ActiveX graph, you can customize it using up.time's graphing tool. With the graphing tool, you can do the following: apply graphing line styles; apply graphing and charting formats; apply titles, text, and dimensioning; manipulate a graphing axis; apply dynamic motion to a graph. Using the Graph Editor: The Graph Editor enables you to manipulate the presentation of your graphs, as well as apply a variety of effects to a graph to change its overall look. The following image illustrates the Graph Editor: [Image: Graph Editor window, with Chart, Series, Data, Tools, Export, and Print tabs]
110. 1 (Warning): There is a potential problem with one or more of the services being monitored. 2 (Critical): There is a critical problem with one or more of the services being monitored. 3 (Unknown): There is an error in the configuration of the monitor itself, or up.time cannot execute the service check. up.time captures the output from the script or program (usually from standard output, stdout). The output appears in the service status section of the Global Scan panel (see Understanding the Status of Services on page 21). The up.time monitoring framework picks up any error codes and triggers the appropriate monitoring action. If you have already written scripts or programs for other monitoring tools, you can re-use those scripts or programs with up.time. You simply point your advanced monitor to where your scripts or programs are located, and up.time will run them. The uptime user account on the up.time Monitoring Station must be able to execute the script or program that you use. Contact uptime software Client Care for help with creating advanced monitor scripts. Before You Begin: When creating a script or an executable for an advanced monitor, you should ensure that: the necessary interpreter for the scripting language that you are using is installed on the Monitoring Station; you have determined the arguments that the script or program requires and the parameters that
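As an illustration of the return codes listed above, the following is a minimal sketch of a script that checks disk usage, prints a single status line to stdout, and exits with 0, 1, 2, or 3. The 80% and 90% thresholds, the message wording, and the function names are assumptions made for this example; the guide does not prescribe them.

```python
import shutil
import sys

# Return codes as described in the text: OK, Warning, Critical, Unknown.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(percent_used, warn=80.0, crit=90.0):
    """Map a usage percentage to (return code, one-line message).

    The 80/90 thresholds are illustrative assumptions.
    """
    if percent_used >= crit:
        return CRITICAL, f"CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"WARNING - {percent_used:.1f}% used"
    return OK, f"OK - {percent_used:.1f}% used"


def main(path):
    """Check one path, print a single line to stdout, return the code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # The check itself could not run, so report Unknown.
        print(f"UNKNOWN - cannot read {path}: {exc}")
        return UNKNOWN
    code, message = evaluate(usage.used / usage.total * 100)
    print(message)
    return code


# Runs only when invoked as a script with a path argument.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1]))
```

Saved under a hypothetical name such as check_disk.py, the script would be invoked with the path to check as its only argument (note the forward slashes); the exit status carries the code and the printed line carries the human-readable result.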
111. 1 deleting 412 pending reports 410 running reports 410 viewing 411 reports 413 Application Availability 456 background 401 CPU Run Queue Threshold 445 CPU Utilization Ratio 422 CPU Utilization Summary 419 Disk I O Bandwidth 441 dynamic 401 Enterprise CPU Utilization 428 File System Capacity Growth 431 File System Service Time Summary 449 generating 401 generation options 402 email 402 to screen 402 XML 402 590 incidents 457 mean time between failure 457 mean time to repair 457 Multi System CPU 418 Network Bandwidth 438 overview 12 Report Log 410 Resource Usage 414 saving 404 saving to file system 404 scheduling 407 searching saved 406 Server Virtualization 432 Service Monitor Availability 460 Service Monitor Metrics 425 Service Monitor Outages 461 setting date and time ranges 22 Solaris Mutex Exception 436 viewing saved 405 VMware Workload 470 Wait I O 423 WebSphere 463 Reports panel 9 Resource Scan 130 Resource Usage report 414 Response Time 145 S scheduled maintenance 161 assigning to a host 162 assigning to a service 163 profiles 161 viewing profiles 162 Scrutinizer 72 133 512 542 search box 57 searching 57 Server Virtualization report 432 using 435 service groups 153 creating 153 editing 154 overview 20 service level agreements 357 adding and editing 371 creating 366 end user performance 228 objectives 359 365 373 up time 5 User Guide l up time reports for 453 status 363 v
112. 144. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information); Alert Settings (see Monitor Alert Settings on page 148 for more information); Monitoring Period settings (see Monitor Timing Settings on page 146 for more information); Alert Profile settings (see Alert Profiles on page 381 for more information); Action Profile settings (see Action Profiles on page 389 for more information). Click Finish. POP Email Retrieval: The POP Email Retrieval service monitor checks the status of POP2 servers (which require SMTP to send messages) and POP3 servers. Use the POP Email Retrieval monitor to verify whether a POP server is doing the following: listening on a defined port; running on a defined system; running on a group of systems; running a particular version of POP. Configuring POP Email Retrieval Monitors: To configure POP Email Retrieval monitors, do the following: 1. In the POP Email Retrieval monitor template, complete the monitor information fields. To learn about monitor information fields, see Monitor Identification on page 141. Complete the following fields: Expected Server
113. CHAPTER 14: Advanced Monitors. You can configure advanced monitors to collect performance information. Advanced monitors are described in the following sections: Overview 322; Custom Monitors 324; Custom with Retained Data 326; External Check 328; Plug-In Monitors 330. Overview: In some cases, the standard up.time service monitors may not fully enable you to monitor all of the systems, applications, and proprietary devices in your environment; in some cases, you may need to capture unique metrics. To do this, you can configure advanced service monitors, or download and install customized plug-in monitors. These advanced monitors can be simple scripts that run service checks on a host. You can write a shell script or use a higher-level scripting language like Perl, Python, or Ruby. Or the advanced monitors can be binary programs that interact with more sophisticated applications. On top of that, advanced monitors do not require an agent to be installed on the system that you are monitoring. Regardless of how you develop your advanced monitor scripts or programs, those scripts or programs should return the following codes: 0 (OK): The services are functioning properly.
114. 5 minutes. [Image: Detailed View of the View SLAs subpanel, listing the Service Level Objectives of each SLA with Name, Description, Achieving, and Target columns; the example SLOs include Mail Server Availability, Mail Server Performance, Application Server, Database, and Web Server, each with a target of 99.0] An SLA's compliance is based on the downtime of its component SLOs: when one or more of the SLOs experience downtime, it counts towards overall SLA non-compliance. Clicking an SLO name displays the status of the SLO and all of the services that make up the SLO. [Image: Service Level Objective details for WebSphere, showing the Name, Description, Monitoring Period, Compliance Period Type, Target Percentage, and Member Service Monitors] Using the Detailed View allows yo
115. 505. Number of Processes: This graph charts the number of processes that are currently running on a system. The process count is taken from the system kernel and can be used to determine process usage trends. Process Running, Blocked, Waiting: This graph indicates whether or not there is enough CPU capacity for the processes that are being run on a system. If the size of the blocked or waiting queue is disproportionate to the running queue, then either the system does not have enough CPUs or is too I/O-bound. A blocked process signals a disk bottleneck. If the number of blocked processes approaches or exceeds the number of processes in the run queue, you should tune the disk subsystem. Whenever there are any blocked processes, all CPU idle time is treated as wait-for-I/O time. If database batch jobs are running on the system that is being monitored, there will always be some blocked processes. However, you can increase the throughput of batch jobs by removing disk bottlenecks. Process Creation Rate: This graph determines whether or not there are runaway processes on a system, or if a forking-based process like a Web server is spawning too many processes over a specified period of time. Generating a Process Graph: To generate a process graph, do the following: 1. In the Global Scan or My Infrastructure panel, click the name of the system whose information you
116. [Report excerpt: per-system disk I/O figures, including rows for Exchange and WebSphere] In this example, the systems Brightmail and Weblogic Server have high levels of disk I/O. Based on this information, you can generate a Disk Performance Statistics graph (see page 514 for more information) to get a better idea of why disk I/O is so high on the system. CPU Run Queue Threshold Report: The CPU Run Queue Threshold report lists when a system's CPU reaches a high level of usage, the number of jobs that were ready to run but waiting in a queue, and the amount of time they were waiting. If the size of the run queue is appreciably larger than the number of available processors on a system, or the run queue is backlogged for long periods of time, you can conclude that the server is overloaded. You can use this report to pinpoint servers that are overloaded using the following factors: the CPU is busier than a value that you specify; the length of the CPU run queue is greater than the threshold that you specify. This report contains the following information: the display name of the system in up.time; the number of CPUs on the system; the run queue threshold; the minimum, maximum, and average length of the run queue (i.e., the number of jobs waiting to be processed) over the period of time that you specify; graphs that ill
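The report contents described above (minimum, maximum, and average run queue length, plus the two overload factors) can be approximated with a few lines of arithmetic. This sketch assumes that both factors must hold at the same sample for that sample to count as a breach; the guide does not say how up.time combines them, and the function name, sample format, and sample values are hypothetical.

```python
from statistics import mean


def summarize_run_queue(samples, cpu_threshold, queue_threshold):
    """Summarize (cpu_busy_percent, run_queue_length) samples.

    A sample counts as a breach only when the CPU is busier than
    cpu_threshold AND the run queue is longer than queue_threshold
    (an assumption made for this sketch).
    """
    lengths = [queue for _, queue in samples]
    breaches = sum(
        1 for cpu, queue in samples
        if cpu > cpu_threshold and queue > queue_threshold
    )
    return {
        "min": min(lengths),
        "max": max(lengths),
        "avg": mean(lengths),
        "breaches": breaches,
    }


# Illustrative samples: (CPU busy %, run queue length).
samples = [(35.0, 1), (92.0, 6), (97.0, 9), (60.0, 2)]
print(summarize_run_queue(samples, cpu_threshold=90.0, queue_threshold=4))
```

Comparing the average and maximum against the number of CPUs on the system gives a quick reading of whether the queue is only briefly long or persistently backlogged.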
117. Installation Plan 26; Installation Requirements 27; Installing the up.time Monitoring Station 29; Installing Agents 40. Installation Plan: Before installing up.time, you must: identify the system that will act as a central Monitoring Station; ensure that all client systems that you want to monitor are accessible over the network. All communication with client systems is over TCP using port 9998. However, you can specify a different port during the installation process. All communication originates from the Monitoring Station. When a host that is being monitored is outside a firewall, you only need to configure outbound port access. If you purchased the boxed version of up.time, the Monitoring Station system must have a CD-ROM drive from which to load the server software. A CD-ROM drive is not required if you have downloaded the up.time software from the Internet. The installation procedure creates the user ID uptime on the Monitoring Station. The uptime user ID should also exist on all of the clients, as using this ID will minimize any security risks by not running the agents as a privileged process. Wherever possible, do not use the root account to run the Monitoring Station or any up.time agents. You can use other existing user accounts
118. Alert Profile settings (see Alert Profiles on page 381 for more information); Action Profile settings (see Action Profiles on page 389 for more information). Click Finish. TCP: The TCP monitor can determine whether or not a service or application is listening on a specific port. This monitor can also execute commands against an application or a service listening on a port and evaluate the result. By extending the TCP monitor to evaluate the string returned in response to a command sent over a network using TCP, you can test and monitor for a wide variety of responses. For example, to have up.time generate an alert if the file Weekly_Report was changed in your source code control system, you can send the string get Weekly_Reportl and set the critical threshold value to 1.2, where 1.1 represents no changes and 1.2 or greater represents one or more changes to the document. Configuring TCP Monitors: To configure TCP monitors, do the following: 1. In the TCP monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: Port: The number of the port on which the service or application that you want to monitor is listening. To check whether or not an application is lis
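The two behaviours described for the TCP monitor (checking that something is listening on a port, and optionally sending a string and reading the reply) can be sketched with Python's standard socket module. This is an illustration, not the monitor's implementation: the function name, the 5-second timeout, the 4096-byte read, and the UTF-8 encoding are assumptions.

```python
import socket


def tcp_check(host, port, send=None, timeout=5.0):
    """Return (listening, response).

    `listening` is True if a TCP connection could be opened.
    `response` holds whatever the service returned after `send`
    was transmitted, or "" if nothing was sent or read.
    """
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Refused, unreachable, or timed out: nothing is listening.
        return False, ""
    with conn:
        if send is None:
            return True, ""
        try:
            conn.sendall(send.encode("utf-8"))
            data = conn.recv(4096)
        except OSError:
            # Connected, but the exchange failed or timed out.
            return True, ""
        return True, data.decode("utf-8", errors="replace")
```

A caller would then compare the returned string against its own thresholds, for example treating a reply of 1.2 or greater as Critical in the Weekly_Report scenario described above. The host and port passed in are whatever service you want to probe.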
119. Application's details. If required, associate Alert Profiles with the Application by clicking Edit Alert Profiles when viewing the Application's details. In the Alert Profile Selector pop-up window, select one or more of the Available Alert Profiles from the list, then click Save. If required, associate Action Profiles with the Application by clicking Edit Action Profiles when viewing the Application's details. In the Action Profile Selector pop-up window, select one or more of the Available Action Profiles from the list, then click Save. Viewing Details About Applications: After you have added an Application to up.time, the name of the Application appears in the My Infrastructure panel. The name of the Application is a hyperlink. You can view detailed information about that Application by clicking the name of the Application, which opens the Application General Information subpanel. The Application Profile section of the subpanel displays the following information about the Application: the name of the Application; the description, if available; the group of systems to which the Application belongs; whether or not the Application is being monitored. The Application Member Services section of the subpanel contains the following information about the service monitors that are part of the Application: the name of the service that is being monitored; w
120. CPU Usage 495; Generating a Multi-CPU Usage Graph 495; Graphing Memory Usage 498; [illegible entry] 498; Cache Hit Rate 498; Paging Statistics 499; Free [illegible] 499; Generating a Memory Usage Graph 500; Graphing Processes 501; Number of Processes 501; Process Running, Blocked, Waiting 501; Process Creation Rate 502; Generating a Process Graph 502; Graphing TCP Retransmits 503; Generating a TCP Retransmits Graph 503; Graphing User Activity 504; Generating a User Activity Graph 504; Workload Graphs 505; Generating a Workload Graph 506; Workload Top 10 Graphs 508; Generating a Workload Top 10 G
121. Click Save. The configuration window closes, and you are returned to the VMware vCenter Orchestrator Configuration panel. 10. To ensure the settings you provided are correct, click the Test Configuration button. The Monitoring Station will try to communicate with the VMware vCenter Orchestrator server. If an error message appears in the subpanel, edit and then re-test the configuration. Web Application Monitor Proxy Settings: When the Web Application Transaction monitor is recording a user session on an external site, it intercepts URLs by acting as your browser's proxy. To do this, you must replace your organization's proxy server information with the Web Application Transaction monitor in your browser settings. In order for the monitor to access the Internet, you must provide your proxy settings in up.time. For more information about the Web Application Transaction monitor, see Web Application Transactions on page 223. You can change up.time's proxy server configuration by manually inputting settings in the up.time Configuration panel, as outlined in Modifying up.time Config Panel Settings on page 529. Changing Proxy Server Information for up.time: You can configure the proxy server settings used by up.time when running the Web Application Transaction monitor through the following parameters: • webmonitor.proxyHost
122. Condensed View and Detailed View. The latter view is suitable if you have one or two defined SLAs. Condensed View: The following image illustrates the Condensed View of the View SLAs subpanel. [Figure: Service Level Agreement Status, Condensed View. One SLA is in a critical state (CRIT: The allowable downtime has been exceeded by 15 minutes); the Enterprise Application SLA is in a warning state (WARN: At the current rate, this SLA will breach after 10 more minutes of downtime).] The Condensed View is the default view of this subpanel, and displays the following information: • the name of the SLA • a status breakdown of the SLA for the current time period • time period elapsed and available downtime used for the current time period • how close the SLA is to its performance target • status message. Detailed View: Click the Show Detailed View button to expand each SLA to include SLOs. [Figure: Service Level Agreement Status, Detailed View]
123. [Figure: graph legend showing the Receive rate, Send rate, and Series2 data series] Creating a Trend Line: To create a trend line, do the following: 1. Create a graph. See Using Graphs on page 487 for more information. 2. In the graph window, click Show Editor Dialog. 3. Click Add. The Chart Gallery dialog box appears. 4. Click the Functions tab, and then click the Extended subtab. 5. Click Trend, and then click OK. The Editing dialog box appears. 6. In the Source Series subtab, select one or more of the available data series, and then click the Add button. The data series that you select are the ones for which a trend line will be generated. 7. Click Apply. up.time creates a trend line for each data series that you selected in step 6. Formatting Individual Graph Elements: You can format individual graph Elements using the options available on the Series tab, and apply a different graph chart style to each Element. Using your graphed line data, perform any of the following activities: • Apply styles: Changes the style of lines (for example, solid, variety of dashes, variety of dots), line thickness, visible or not visible, shape, and width. • Apply colors and color styles: Applies any color, image, or logo to your graphed data. • Apply data point effects: Makes data po
124. The Syslist is also a tool for quick navigation within the up.time Web interface. Each display name is a hyperlink. Click a display name to view the information about the system in the System Information subpanel. Icons: Entries in various panels have icons beside them. These icons enable you to perform the following tasks: • Clone: Makes a copy of an entry in a panel. You can then modify the entry. • Edit: Opens a window in which you can modify any entry in a panel. • View: Displays the properties of any entry in a panel. • Delete: Deletes any entry in a panel. You will need administrator privileges to delete certain entries. These icons do not appear in the up.time Web interface if users do not have permissions to access the functions represented by the icons. System Icons: The following icons appear in the Global Scan and My Infrastructure panels, and identify the type of system that up.time is monitoring: Linux, AIX, Solaris, Novell NRM, Windows, HP-UX, VMware ESX, Net-SNMP, IBM HMC, VIO. Understanding Reports and Graphs: up.time includes a powerful set of re
125. Dec 31 11PM-11:59PM; Every Jan 1st 12AM-12PM. Combining monthly recurrences: Every month on the 2nd; Every month on the 16th. Combining monthly ordinal recurrences: Every month on the first Fri; Every month on the third Fri; Every month on the last Fri. Note that when a time period consists of more than one component time period expression, a condition met within any of those component time periods applies to the entire time period. For example, if a Monitoring Period named Open Hours is defined as Mon-Fri 9AM-5PM, Sat 10AM-5PM, Sun 12PM-5PM, an alert-worthy event that occurs on Sunday at 1:00 p.m. means the entire time period definition has been fulfilled. Exclusions: Time periods can be excluded from greater time period definitions by using the term exclude as a prefix to the exclusionary expression. The following examples demonstrate the use of exclusions in time periods. Excluding a monthly recurrence from a weekly recurrence: Sun 3PM-5PM; Exclude every month on the last Sunday. Defining two yearly recurrences to exclude from a weekly recurrence on Fri 2AM-3AM: Exclude every Jan 1; Exclude every Jan 2. APPENDIX B: End User License Agreement. Before downloading up.time, obtaining a license key, or using up.time, please read the following End User License Agreement for up.time. The up ti
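The "Open Hours" rule above (a moment belongs to the time period if it falls inside any one component expression, unless an exclusion applies) can be sketched in Python. This is only an illustration under stated assumptions: the dictionary encoding, the function name `in_time_period`, the `excluded_dates` parameter, and the choice to treat the closing hour as exclusive are all hypothetical and are not up.time's own parser or syntax.

```python
from datetime import datetime

# Hypothetical encoding of the "Open Hours" example: weekday number
# (Monday = 0) mapped to an (opening hour, closing hour) pair.
OPEN_HOURS = {
    0: (9, 17), 1: (9, 17), 2: (9, 17), 3: (9, 17), 4: (9, 17),  # Mon-Fri 9AM-5PM
    5: (10, 17),                                                 # Sat 10AM-5PM
    6: (12, 17),                                                 # Sun 12PM-5PM
}


def in_time_period(moment, period=OPEN_HOURS, excluded_dates=()):
    """Return True if `moment` falls inside any component of the period.

    `excluded_dates` holds (month, day) pairs, modelling yearly exclusions
    such as "Exclude every Jan 1". Treating the closing hour as exclusive
    is an assumption made for this sketch.
    """
    if (moment.month, moment.day) in excluded_dates:
        return False
    window = period.get(moment.weekday())
    if window is None:
        return False
    start_hour, end_hour = window
    return start_hour <= moment.hour < end_hour


# Sunday 2008-04-20 at 1:00 p.m. lies inside the Sunday component.
print(in_time_period(datetime(2008, 4, 20, 13, 0)))
```

Passing `excluded_dates={(1, 1), (1, 2)}` mimics the two yearly exclusions in the last example: a Friday that falls on January 1 is then reported as outside the period.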
126. External Check Monitors 329; Plug In Monitors 330; Installing Plug In Monitors 330. Configuring Users: Working with User Roles 334; Adding User Roles 334; Viewing User Roles 335; Editing User Roles 336; Working with Users 337; Adding Users 337; Viewing Users 340; Editing User Information 340; Working with User Groups 341; Adding User Groups 342; Viewing User Groups 342; Editing User Groups 342; Deleting User Groups 343; Managing Distribution Lists 344; Adding Distribution Lists 344; Viewing Distribution Lists 345; Editing Distribution Lists 345; Working with Notification Groups 347; Adding Notification Groups
127. File System Capacity monitor 167; File System Service Time Summary report 449; filtering 57, 58; frequency definitions 566; FTP monitor 283. G: generating reports 401; Global Element Settings 536; Global Scan 115 (groups 118; overview 116; Resource Scan 130: chart 131, gauges 130, graph 131; view all Applications 124; view all Elements 127; view all services 129); Global Scan panel 7; Graph Editor 482; Graphing Tool 481; graphs (ActiveX 339, 480; appearance 486; CPU performance 491: generating 494, Run Queue Length 493, Run Queue Occupancy 493, Usage 491; disk performance statistics 514; displaying process information 524; exporting 486; file system capacity 518; formatting Elements 485; Graph Editor 482; Graphing Tool 481; instance motion 523; Java 481; LPAR entitlement 510; memory usage 498: Cache Hit Rate 498, Free Swap 499, generating 500, Paging Statistics 499, Used 498; Multi CPU Usage 495; network 511: errors 511, generating 512, I/O 511, NetFlow 512; Novell NRM 521; overview 12, 480, 488; process 501: creation rate 502, generating 502, number of processes 501, running, blocked, waiting 501; Quick Snapshot 489; setting date and time ranges 22; TCP retransmits 503; top 10 disks 516; trend lines 484; user activity 504; viewing quick snapshot 490; viewing system status 489; VXVM stats 519; workload 505; workload top 10 508); groups (adding 105; adding nested 106; Global Scan 118). H: host check 4
128. GES RESULTING FROM ANY CLAIMS, DEMANDS OR ACTIONS ARISING OUT OF OR RELATING TO THIS AGREEMENT, INCLUDING WITHOUT LIMITATION UPTIME'S INTELLECTUAL PROPERTY INDEMNIFICATION OBLIGATIONS, SHALL BE LIMITED TO THE AMOUNT OF LICENSE FEES PAID TO UPTIME BY YOU UNDER THIS AGREEMENT, BUT IN NO EVENT SHALL SUCH LIABILITY EXCEED CDN $2,000.00 IN THE AGGREGATE FOR ALL OCCURRENCES. THIS LIMITATION APPLIES TO ALL CAUSES OF ACTION OR CLAIMS IN THE AGGREGATE, INCLUDING WITHOUT LIMITATION BREACH OF CONTRACT, BREACH OF WARRANTY, INDEMNITY, NEGLIGENCE, STRICT LIABILITY, MISREPRESENTATION AND OTHER TORTS. IN NO EVENT SHALL UPTIME BE LIABLE TO YOU OR ANY PARTY RELATED TO YOU FOR ANY INDIRECT, INCIDENTAL, CONSEQUENTIAL, SPECIAL, EXEMPLARY OR PUNITIVE DAMAGES OR LOST PROFITS, EVEN IF UPTIME HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. THE FOREGOING LIMITATIONS, EXCLUSIONS AND DISCLAIMERS SHALL APPLY TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, EVEN IF ANY REMEDY FAILS ITS ESSENTIAL PURPOSE. 8. General Terms. 8.1 Governing Law and Choice of Forum. This Agreement shall be governed by and interpreted in accordance with the laws of the Province of Ontario, Canada, without regard to the conflicts of law rules thereof. Any claim or dispute arising in connection with this Agreement shall be resolved in the federal or provincial courts situated with the City of Toronto, Ontario. T
129. based on the Performance Check monitor for the application servers. A critical state occurs when CPU usage exceeds 90%. The second service level objective is based on the File System Capacity monitor. A critical state occurs when remaining disk space falls under 10%. After creating an SLA based on these objectives, the SLA is immediately shown to be in a critical state for the current Monitoring Period: one or both of the objectives have already failed to meet the defined service level. [Figure: SLA status showing 63% of compliance period, 100% of allowable downtime used, 89.47% of target 99.0%; the allowable downtime has been exceeded by 15 minutes] You can investigate outages using the SLA Detailed report. In this example, you determine that the cause of the SLA failure was a prolonged disk space-related outage that, based on the outage graph, appears to have been resolved. [Figure: SLO Disk Space, SLO Overall Outages, and File System Capacity Outages graphs, Apr 01 to Apr 23] Outage (Minutes Down): From 2008-04-20 06:18:22 to 2008-04-20 06:22:01 (3); From 2008-04-17 14:25:35 to 2008-04-17 14:42:25 (16); From 2008-04-17 13:05:43 to 2008-04-17 13:42:30 (36); From 2008-04-17 09:46:00 to 2008-04-17 10:22:48 (36); From 2008-04-17 05:46:19 to 2008-04
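The Minutes Down figures in the outage list above can be reproduced from the start and end timestamps. The sketch below assumes the timestamps are written as YYYY-MM-DD HH:MM:SS and that partial minutes are rounded down; the rounding rule is inferred from the listed values (for example, 3 minutes 39 seconds is shown as 3), not taken from the guide. The function name `outage_minutes` is hypothetical.

```python
from datetime import datetime

# Assumed timestamp layout for the outage list.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def outage_minutes(start, end):
    """Return the whole minutes between two outage timestamps, rounded down."""
    started = datetime.strptime(start, TIMESTAMP_FORMAT)
    ended = datetime.strptime(end, TIMESTAMP_FORMAT)
    return int((ended - started).total_seconds() // 60)


outages = [
    ("2008-04-20 06:18:22", "2008-04-20 06:22:01"),
    ("2008-04-17 14:25:35", "2008-04-17 14:42:25"),
    ("2008-04-17 13:05:43", "2008-04-17 13:42:30"),
    ("2008-04-17 09:46:00", "2008-04-17 10:22:48"),
]

# Total downtime across the four complete entries listed in the excerpt.
total = sum(outage_minutes(start, end) for start, end in outages)
print(total)
```

Summing per-outage minutes in this way gives the downtime figure that an SLA compares against its allowable downtime for the compliance period.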
130. I on the Windows domain. 6. In the Password field, enter the password for the WMI account. 7. Click Save to retain your changes and close the pop-up window. 8. Click Test Configuration to ensure the credentials provided are correct. Configuring a Global up.time Agent Configuration: To provide up.time Agent information that can be used to switch Windows Elements from agentless WMI-based data collection, do the following: 1. On the up.time tool bar, click Config. 2. In the Tree panel, click Global Element Settings. 3. In the up.time Agent Global Configuration subpanel, click Edit Configuration. 4. In the Edit Global Element Settings pop-up window, enter the port through which the up.time Agents communicate with the up.time Monitoring Station. The port number entered reflects what the up.time Agents are configured to use; this setting does not modify the agent-side configuration. 5. Select the Use SSL check box if the agents securely communicate with the Monitoring Station using SSL. 6. Click Save to retain your changes and close the pop-up window. 7. Click Test Configuration to ensure the credentials provided are correct. RSS Feed Settings: up.time displays a list of recent knowledge base articles in the My Portal panel. This list is fed to the My Portal panel via RSS (Really Simple Syndication), a method for delivering summaries of, and links to, Web content. Clicking the titl
131. [Figure: process information table with a Date Range selector (From and To, in YYYY-MM-DD HH:MM:SS format, here 2008-04-16 to 2008-04-16 23:59:59) and the columns ID, PPID, UID, Memory Used, RSS, CPU, Mem, Runtime, Children Runtime, and Start Time] From the dropdown list, select the date and time for which you want to view process information. CHAPTER 22: Configuring and Managing up.time. The configuration and management of up.time mainly through the Config P
132. IT infrastructure's ability to meet performance goals, particularly from the end-user perspective. Different goals can focus on different aspects of your infrastructure, from underlying network performance, to back-end database availability, to user-facing application server response time. Given this broad coverage, a performance goal encompasses anything from a handful of monitored systems to an entire production center. Defining and working toward fulfilling SLAs provides you with more insight into the performance and planning of your infrastructure: • measure the performance of your infrastructure from the end-user perspective: An SLA can measure the success of your IT infrastructure by using end-user-focused service monitors such as the Web Application Transaction monitor and the Email Delivery monitor. • translate IT infrastructure demands into quantifiable and reportable goals: Use SLAs to methodically set expectations on all, or the most critical, aspects of your infrastructure. SLAs provide you with metrics with which you can gauge the success of your network administration. • use trends to anticipate new infrastructure requirements: Trend lines in SLA reports can give you an estimate for when your current hardware deployment will require augmentation. • generate SLA reports that demonstrate compliance and break down objectives: Compliance reports quantify the value of the IT department's efforts, and objective-based reports exi
133. Info tab, then click Info & Rescan. Click the Edit Collection Method link found beside the Collection Method setting, as shown below. [Figure: Collection Method setting, showing up.time Agent version 5.3.0.0 and the Edit Collection Method link] The Edit Data Collection Method window appears. 4. Select the up.time Agent data collection option. 5. Select the Use up.time Agent Global Configuration check box if it has been configured and you would like to use it (see Configuring a Global up.time Agent Configuration on page 537 for more information); otherwise, complete the following options: • Port: The port through which the up.time Agents communicate with the up.time Monitoring Station. • Use SSL: Select this check box if the agent securely communicates with the Monitoring Station using SSL. 6. Click Save to retain your changes and close the pop-up window. Converting Multiple Elements to WMI Data Collection: To change multiple agent-based Elements to use WMI for data collection, do the following: 1. Ensure the global settings for WMI credentials have been set (see Configuring Global WMI Credentials on page 536 for more information). 2. On the up.time tool bar, click Config. 3. In the tree panel, click Bulk Element Conversion. 4. In the Windows Agent Elements section, select the check boxes that correspond to the age
134. • where applicable, the name of the file system on the disk • the total amount of data, measured in megabytes, that is being read from and written to the disk. Using Regular Expressions: You can use regular expressions to include or exclude disks and file systems when generating a Disk I/O Bandwidth Report or a File System Service Time Summary Report, as shown below. [Figure: the Exclude Disks and Exclude File Systems fields, each with an Exceptions field. Note: you can enter regular expressions into these fields.] Using regular expressions, you can focus on particular disks or file systems on a server, and also decrease the length of your report. The regular expression syntax used with the Disk I/O Bandwidth Report or the File System Service Time Summary Report is similar to that used with the File System Capacity Growth report. For example, if you are generating a report on an Oracle volume and only want to focus on five specific file systems, you can enter the regular expression u 0 4 in the Exceptions field. If, on the other hand, you are working with a UNIX system with multiple disks and want to focus on disks whose names start with md1, but ignore those whose names start with md2, you can enter the regular expression md1 in the Exceptions field and md2 in the Exclude Disks
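The interplay between an exclusion pattern and an exceptions pattern can be illustrated with Python's `re` module. This is a sketch of one plausible reading of the two fields (a name is dropped when it matches the exclusion pattern, unless it also matches the exceptions pattern), not a description of how up.time evaluates them. The function name `filter_disks`, the sample disk names, and the explicit `^` anchors are assumptions made for the example.

```python
import re


def filter_disks(names, exclude=None, exceptions=None):
    """Keep names that are not excluded, or that match the exceptions pattern.

    Both patterns are optional regular expressions. Matching uses
    re.search, so "starts with" must be spelled out with a ^ anchor.
    """
    exclude_re = re.compile(exclude) if exclude else None
    exceptions_re = re.compile(exceptions) if exceptions else None
    kept = []
    for name in names:
        excluded = bool(exclude_re and exclude_re.search(name))
        rescued = bool(exceptions_re and exceptions_re.search(name))
        if not excluded or rescued:
            kept.append(name)
    return kept


# Hypothetical disk names on a UNIX system.
disks = ["md10", "md11", "md20", "md21", "sda"]

# Ignore disks starting with md2; disks starting with md1 are always kept.
print(filter_disks(disks, exclude="^md2", exceptions="^md1"))
```

With a broader exclusion such as `^md`, the exceptions pattern `^md1` is what keeps `md10` and `md11` in the report while `md20` and `md21` are dropped.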
135. Lists on page 344. Working with Notification Groups: When up.time detects a problem with a system or service in your environment, it can issue alerts to specific users. If a group of users in your enterprise should receive certain notifications, you can ensure that they do by defining Notification Groups and adding those users to the group. A Notification Group specifies the users who will receive the notifications, as well as the Alert Profile that will be used to react to the problems. See the section Alert Profiles on page 381 for more information. Users can only view the Notification Groups of which they are members. While users can see the members of Notification Groups to which they belong, they can only view detailed user information for users that belong to the same user groups. Adding Notification Groups: To add Notification Groups, do the following: 1. Click Users on the up.time tool bar. 2. In the Tree panel, click Add New Notification Group. 3. Type a descriptive name in the Name of Notification Group field. You will select this name when defining Alert Profiles. For more information on Alert Profiles, see Alert Profiles on page 381. 4. Optionally, type a description of the group in the Description of Notification Group field. 5. Select one or more Alert Profiles to a
136. M are not supported. up.time also supports agentless monitors on any operating system, which do not require you to install software on a system or device. See Using Agentless Monitors on page 138. Installing the up.time Monitoring Station: The Monitoring Station is installed in a single directory: • /usr/local/uptime on Linux • /opt/uptime on Solaris • C:\Program Files\uptime software\uptime on Windows. On Windows, the up.time Monitoring Station is installed using a graphical installer that guides you through the steps of the installation process. On Solaris or Linux, the installer is a console application. Before installing up.time, you must be logged in as a local (i.e., non-domain) administrator in Windows, or as root in Solaris or Linux. In addition to the included MySQL database, up.time can also use either an Oracle or MS SQL Server database as its DataStore. If you plan to use either of these databases, refer to our Knowledge Base for the additional steps required to enable up.time to work with these databases. Before You Begin: There are three ways in which to install the up.time Monitoring Station: 1. From an archive downloaded from the uptime software Web site. If you have downloaded the up.time distribution from the uptime software Web site, copy the archive to a temporary directory on
137. Configuring Global Data Collection Methods: A Windows-based Element can retrieve metric data either through the up.time Agent or via WMI. Initially set when the Element is added to up.time, the data collection method can be switched from an agent-based to an agentless method, or vice versa. This change can be made on a per-Element basis, or multiple Elements can be switched in a single batch. See Agentless WMI Systems on page 81 for more information. In order to use the latter option, you must configure up.time so that it is aware of a data collection source that will be used for bulk conversions. For configuration, you can provide information for either the up.time Agent or your organization's WMI credentials, or both. Note that multiple Windows-based Elements can only be converted to a particular data collection source when it has been configured in the Global Element Settings panel. Configuring Global WMI Credentials: To provide WMI credentials that can be used to switch Windows Elements from agent-based data collection, do the following: 1. On the up.time tool bar, click Config. 2. In the Tree panel, click Global Element Settings. 3. In the WMI Agentless Global Credentials subpanel, click Edit Configuration. 4. In the Edit Global Element Settings pop-up window, enter the Windows Domain in which WMI has been implemented. 5. In the Username field, enter the user ID that has administrative access to WM
138. Monitor Timing Settings Information on page 148 for more information • Alert Settings: see Monitor Alert Settings on page 148 for more information • Monitoring Period settings: see Monitor Timing Settings on page 146 for more information • Alert Profile settings: see Alert Profiles on page 381 for more information • Action Profile settings: see Action Profiles on page 389 for more information. Click Finish. NNTP (Network News): NNTP is a protocol for distributing, searching, retrieving, and posting of messages and news articles from USENET, a global collection of online discussion groups. NNTP stores content in a central database, enabling subscribers to select only the messages and articles that they want to read. The NNTP (Network News) monitor measures the performance of your NNTP server. It can also determine the server status in terms of the following: • Command Implementation • Response Category • Response Codes. Command Implementation: Status reports from the server indicate the response to the last command that was received from the client. Status response lines begin with a three-digit numeric code, which is used to distinguish between all responses. The first digit of the response broadly indicates the success, failure, or progress of the previous command: • 1xx: an informative message • 2xx: the command is OK 2
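The first-digit rule for NNTP status lines can be expressed as a small lookup. The excerpt above lists only the 1xx and 2xx classes before it breaks off; the 3xx, 4xx, and 5xx descriptions in this sketch follow the NNTP specification (RFC 977) rather than the guide. The function name `classify_nntp_status` and the sample status lines are hypothetical.

```python
# Meaning of the first digit of an NNTP status code. The 1xx and 2xx
# entries match the guide; the others are taken from RFC 977.
NNTP_CATEGORIES = {
    "1": "informative message",
    "2": "command OK",
    "3": "command OK so far; send the rest of it",
    "4": "command was correct but could not be performed",
    "5": "command unimplemented, incorrect, or a serious program error occurred",
}


def classify_nntp_status(line):
    """Return (numeric code, category) for an NNTP status response line."""
    code = line[:3]
    valid = len(code) == 3 and code.isascii() and code.isdigit()
    if not valid or code[0] not in NNTP_CATEGORIES:
        raise ValueError(f"not an NNTP status line: {line!r}")
    return int(code), NNTP_CATEGORIES[code[0]]


print(classify_nntp_status("200 news.example.com ready - posting allowed"))
```

A monitor that checks server status would typically treat a 2xx greeting as healthy and anything in the 4xx or 5xx ranges as a candidate for a warning or critical state.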
139. NMP protocol, which contains additional protocol operations as well as improved security and data authentication. • v3: The latest implementation of the SNMP protocol, which adds security and privacy features that are missing in versions 1 and 2 of the protocol. Using the SNMP MIB Browser: The SNMP MIB Browser is a Java applet that enables you to locate MIBs and their OIDs (object identifiers) on your local file system or your network. Use the SNMP MIB Browser to do the following: • Loading MIBs from a File or a Server • Adding OIDs • Deleting OIDs. The MIB Browser requires version 1.5 of the Java Web browser plugin. up.time will install the newer Java plugin if it detects that your computer has version 1.4.2 or earlier of the plugin installed. Loading MIBs from a File or a Server: You can load MIBs and their associated OIDs into up.time from your computer or from a server. Once you have loaded the MIBs, you can select the OIDs that you want monitored by the SNMP service monitor. To load MIBs from a file or a server, do the following: 1. From the up.time tool bar, select Services. 2. In the Tree panel, click Add Service Instance. 3. In the Add Service Monitor window, click List agentless up.time monitors, then click SNMP, and then click Continue. The SNMP MIB browser applet appears. applet is loading click Always or Accept depending on 2 If a Java security warning dialog b
140. [Figure: Customer Service SLA at 63% of compliance period, 38% of allowable downtime used, 99.39% of target 99.0%; OK: The SLA is performing within its target] The Downtime progress bar allows you to gauge how close the SLA is to reaching a critical state: • an SLA whose allowable downtime exceeds 100% reaches a critical state, is highlighted with red, and is accompanied by the critical state icon • an SLA whose allowable downtime, at the current rate of use, will be depleted before the compliance period has ended enters a warning-level state, is highlighted with yellow, and is accompanied by the warning state icon • an SLA whose graphed allowable downtime does not exceed the graphed progress through the compliance period is in a compliant state. Note that once an SLA reaches a critical state, it will remain in that state until the compliance period has restarted the following week or month; an SLA that enters a warning-level state can be downgraded to a normal state if the rate at which allowable downtime is used decreases to a safer value. Generating an SLA Detailed Report: Clicking an SLA's corresponding Detailed Report button instantly generates an SLA Detailed report for the last 24 hours. See Reports for Service Level Agreements on page 453 for more information. SLA View Types: The Service Level Agreements subpanel provides two types of views
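The three states described above can be summarized as a comparison between the percentage of allowable downtime used and the percentage of the compliance period that has elapsed. The following is a minimal sketch of that reading, not up.time's implementation. The function name `sla_state` is hypothetical, and treating exactly 100% of downtime used as critical is an assumption (the text says "exceeds 100").

```python
def sla_state(period_elapsed_pct, downtime_used_pct):
    """Classify an SLA as CRIT, WARN, or OK from two progress percentages.

    CRIT: the allowable downtime is used up (assumed to include exactly 100%).
    WARN: downtime is being consumed faster than the compliance period
          elapses, so it would be depleted before the period ends.
    OK:   downtime used does not exceed progress through the period.
    """
    if downtime_used_pct >= 100:
        return "CRIT"
    if downtime_used_pct > period_elapsed_pct:
        return "WARN"
    return "OK"


# 63% of the period elapsed with 38% of allowable downtime used.
print(sla_state(63, 38))
```

The warning test works because projecting the current rate to the end of the period gives `downtime_used_pct / period_elapsed_pct * 100`, which exceeds 100 exactly when the downtime percentage is ahead of the period percentage.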
141. [illegible] 74; Adding VMware Instances to up.time 79; Adding Individual LPARs to up.time 81; Agentless WMI Systems 81; Novell NRM Systems 86; Adding Multiple Systems 92; Editing a System Profile 99; Working with Applications 101; Adding Applications 101; Viewing Details About Applications 103; Editing Applications 103; Working with SLAs 104; Working with Groups 105; Adding Groups 105; Adding Nested Groups 106; Editing Groups 107; Working with Views 108; Adding Views 108; Adding Nested Views 109; Editing Views 110; Deleting Elements, Applications, and Views 111; Acknowledging Alerts 112. Overseeing Your Infrastructure: Overview 116; Viewing More Information
142. [illegible] 391; Viewing Action Profiles 395; Editing Action Profiles 395; Monitoring Periods 397; Adding Monitoring Periods 397. Understanding Report Options: Overview 400; Generating Reports 401; Report Generation Options 402; Saving Reports 404; Saving Reports to the File System 404; Viewing Saved Reports 405; Scheduling Reports 407; The Report Log 410; Viewing Report Logs 411; Deleting Report Log Entries 412. Using Reports: Reports for Performance and Analysis 414; Resource Usage Report 414; Multi System CPU Report 418; CPU Utilization Summary Report 419; CPU Utilization Ratio Report 422; [illegible] 423; Service Monitor Metrics
143. Portal where you can download the update • The status of your license, including the type of license and the numbers remaining before the license expires. My Alerts: The Current Issues section contains a list of systems that are in a warning or critical state. Saved Reports: The Saved Reports tab lists the reports that you have scheduled and saved. For more information on scheduling reports, see Scheduling Reports on page 407. This section contains the following information about the reports: • the name of the report • an optional description of the report • whether or not the report is scheduled to run at a specific time • whether or not the report will be saved to a directory on the Monitoring Station or on another server • the time at which the report will next be run, in the following format: Wed Oct 12 14:30:00 EDT 2005. The My Portal panel only displays the reports and graphs that you have defined. However, a system administrator or a user with administrator privileges can view all saved reports. Custom Dashboards: A custom dashboard tab displays the contents of an external Web page that is referenced by URL. Creating one or more custom tabs allows up.time users to view customized content through My Portal. Custom dashboards are visible to members of specific dashboard-related User Groups. For information on configuring a custom dashboard, see Custom Das
144. Response: Enter the response from the server as a string that determines whether or not a connection is made to the POP service. Then set the Warning and Critical thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. The expected server response is the same for Windows, Solaris, and Linux. For example, if the POP service is available, then the following is an expected response: "+OK POP3 <server name> v2002.81 server ready". If the POP service is not available, the following is an expected response: "-ERR Null command". Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. 4. Complete the following settings: • Timing Settings: see Adding Monitor Timing Settings Information on page 148 for more information • Alert Settings: see Monitor Alert Settings on page 148 for more information • Monitoring Period settings: see Monitor Timing Settings on page 146 for more information • Alert Profile settings: see Alert Profiles on page 381 for more information • Action Profile settings se
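POP3 servers begin positive replies with the status indicator +OK and negative replies with -ERR, so checking a server response against an expected string reduces to a prefix test. The sketch below is illustrative only: the function name `pop_service_available` and the sample host name are hypothetical, and the guide excerpt does not say whether up.time compares the whole string, a prefix, or a substring.

```python
def pop_service_available(response, expected="+OK"):
    """Return True if the POP server response begins with the expected string.

    The default expected value is the POP3 positive status indicator.
    Leading and trailing whitespace (such as the CRLF line ending) is
    ignored before the comparison.
    """
    return response.strip().startswith(expected)


# A greeting in the style of the available-service example.
print(pop_service_available("+OK POP3 mail.example.com v2002.81 server ready\r\n"))
```

Supplying a longer expected string, for example `"+OK POP3"`, makes the check stricter without changing the logic.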
145. S ..... 251
Configuring MySQL Basic Checks Monitors ..... 251
Oracle Advanced Metrics ..... 253
Configuring Oracle Advanced Metrics Monitors ..... 253
Oracle Basic Checks ..... 256
Configuring Oracle Basic Checks Monitors ..... 256
Oracle Tablespace Check ..... 259
Configuring Oracle Tablespace Check Monitors ..... 259
SQL Server Basic Checks ..... 262
Configuring SQL Server Basic Checks Monitors ..... 262
SQL Server Advanced Metrics ..... 266
Using Multiple SQL Server Advanced Metrics Monitors ..... 266
Configuring SQL Server Advanced Metrics Monitors ..... 267
SQL Server Tablespace Check ..... 270
Structure of a SQL Server Database ..... 270
Configuring SQL Server Tablespace Check Monitors ..... 271
Sybase ..... 275
Configuring Sybase Monitors ..... 275
Network Service Monitors
DNS ..... 280
Before You Begin ..... 280
Configuring DNS
146. Server Advanced Metrics
• Connection Memory (KB): The total amount of dynamic memory, in kilobytes, that the server is using to maintain connections.
• SQL Cache Memory (KB): The amount of memory, in kilobytes, that the server is using for the dynamic SQL cache.
• Total Server Memory (KB): The total amount of committed memory from the buffer pool, in kilobytes, that the server is using.

Response Time
Enter the Warning and Critical Response Time thresholds. If the amount of time taken to perform a check exceeds the defined thresholds, it could indicate a problem that requires investigation.

To save the data from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside each of the metrics that you selected in step 3.
Complete the following settings:
• Timing Settings: see Adding Monitor Timing Settings Information on page 148 for more information
• Alert Settings: see Monitor Alert Settings on page 148 for more information
• Monitoring Period settings: see Monitor Timing Settings on page 146 for more information
• Alert Profile settings: see Alert Profiles on page 381 for more information
• Action Profile settings: see Action Profiles on page 389 for more information
Click Finish.

SQL Server Tablespace Check
The SQL Server Tablespace Check monit
147. If IMAP is not available, then the following is an expected response:
BAD Null command
By making string comparisons on the returned values to the monitor, you can check:
• The version of IMAP that is running to support your network routing
• The system on which IMAP is or is not running

Response Time
Enter the Warning and Critical Response Time thresholds for the length of time a service check takes to complete. For more information, see Configuring Warning and Critical Thresholds on page 144.

3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
4. Complete the following settings:
• Timing Settings: see Adding Monitor Timing Settings Information on page 148 for more information
• Alert Settings: see Monitor Alert Settings on page 148 for more information
• Monitoring Period settings: see Monitor Timing Settings on page 146 for more information
• Alert Profile settings: see Alert Profiles on page 381 for more information
• Action Profile settings: see Action Profiles on page 389 for more information
5. Click Finish.

LDAP
LDAP (Lightweight Directory Access Protocol) is a protocol that organizes directory hierarchies and enables communication with directory servers. Individuals in an organization can use LDAP to search fo
148. Set Cookie String field
• expires is an optional attribute that specifies the expiration date and time for the cookie.

HTTP Header Settings
The HTTP header settings for the response. The HTTP header settings define the syntax and semantics of all standard HTTP/1.1 header fields. For entity-header fields, both sender and recipient refer to either the client or the server, depending on who sends and who receives the entity.

Response Time
Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144.

Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
Complete the following settings:
• Timing Settings: see Adding Monitor Timing Settings Information on page 148 for more information
• Alert Settings: see Monitor Alert Settings on page 148 for more information
• Monitoring Period settings: see Monitor Timing Settings on page 146 for more information
• Alert Profile settings: see Alert Profiles on page 381 for more information
• Action Profile settings: see Action Profiles on page 389 for more information
Click Finish.

IMAP Email Retrieval
IMAP Email R
149. Splunk server, you must do the following:
1. Edit the liveSplunkHandler_v2.py script to point to the up.time Monitoring Station:
• Navigate to the scripts directory on the Monitoring Station.
• Open the file LiveSplunkHandler.py in a text editor.
• Find the following entry in the file:
Specify the up.time server and port by setting the following two variables
host localhost
port 9996
• Change the values for host and port to the host name and port of the Monitoring Station.
[Figure: Splunk "Run the shell script" setting that invokes liveSplunkHandler from the data/splunk/bin/scripts directory with host, port 9996, message, and status 1 options]
2. Edit the script to configure how the Live Splunk is reported on the Monitoring Station:
• For the message option, enter a diagnostic message that accompanies a Live Splunk captured by the up.time service monitor.
• For the status option, enter the status of the service being monitored.
• For the monitorName option, enter the name of the service monitor that is listening to the Live Splunk.
3. Save the file and exit the text editor.
4. Copy the liveSplunkHandler.py script from the Monitoring Station's scripts directory to the data/splunk/bin/scripts directory on the Splunk server.
5. Configure a Live Splunk. For information on configuring Live Splunks, see the Splunk user manual.
Whe
150. [Figure: Report Log table listing finished report entries]

Viewing Report Logs
To view report logs, do the following:
1. On the up.time tool bar, click Reports.
2. In the Tree panel, click Report Log.
The report log appears in the Reports subpanel. If there are no reports in the queue, up.time displays a message similar to the following ones in the Pending Reports and Running Reports sections of the Report Logs subpanel:
No reports are pending
No reports are running

Deleting Report Log Entries
Completed reports are stored in a table in the up.time DataStore. To free space in the DataStore, or to remove report log entries that you no longer need, you can delete entries in the report log from the Report Log subpanel.
To delete entries in the Report Log, do one of the following:
• Click the Delete icon beside the entry that you want to delete.
• If you want to delete all entries in the Report Log, click the Remove Completed Reports button.
When prompted to confirm whether or not you want to delete the report log entry, click OK.

CHAPTER 19 Using Reports
This chapter describes the reporting features of up.time in the following sections:
Reports for Performance and Analysis
151. TA
• SCSI
• iSCSI
• SATA
• SATA II
• Fibre
If none of the options above apply, enter the data transfer speed of the disk, measured in megabits per second, in the MBps field.
From the Network I/O dropdown list, select the type of network interface that is used on the target server:
• 10Mbit
• 100Mbit
• 1Gbit
• 10Gbit
If none of the options above apply, enter the data transfer speed of the network interface, measured in megabits per second, in the MBps field.
5. If you want to generate reports for systems in specific groups, select the groups from the List of Groups area.
6. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views.
7. If you are generating reports for specific systems in your environment, select them from the List of Systems.
8. Select a report generation option. See Report Generation Options on page 402 for details.
9. Do one of the following:
• Click the Generate Report button.
• Enter a name for the report in the Save to My Portal As field and, optionally, enter text in the Report Description field. Then click Save Report. The report parameters are saved to the My Portal panel. Doing this does not generate the report.
10. To schedule the saved report to run at a specific time or interval, click the Scheduled che
152. The script can return the following status codes:
• 0 (OK): The services are functioning properly.
• 1 (Warning): There is a potential problem with one or more of the services being monitored.
• 2 (Critical): There is a critical problem with one or more of the services being monitored.
• 3 (Unknown): There is an error in the configuration of the monitor itself, or up.time cannot execute the service check.
• monitor (in liveSplunkHandler_v2.py), monitor (in alertUptimeStatusHandler.sh): The name of the up.time monitor to which the information from the Live Splunk will be directed.
The following is an example of the script with all of its options specified:
liveSplunkHandler_v2.py message sendmail has some traffic going through new command status 2 monitorName Live Splunk
up.time captures the output from the script, which appears in the service status section of the Global Scan panel (see Understanding the Status of Services on page 21). The up.time monitoring framework picks up any error codes and triggers the appropriate monitoring action.

Before You Begin
Before you can configure a Live Splunk Listener monitor for Live Splunks generated on a Splunk server, you must first configure the correct scripts, depending on the version of Splunk you are using.

Using Splunk v2
Before you can monitor Live Splunks generated on a v2
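The four status codes listed above can be illustrated with a short Python sketch. It is not part of up.time or of the Splunk handler scripts; the names `STATUS_NAMES`, `status_name`, and `run_check` are hypothetical, and treating any unlisted code as Unknown is an assumption made for the example.

```python
import subprocess
import sys

# Status codes as listed in the text
STATUS_NAMES = {0: "OK", 1: "Warning", 2: "Critical", 3: "Unknown"}


def status_name(code):
    """Map a numeric status code to its name; unlisted codes map to Unknown (assumption)."""
    return STATUS_NAMES.get(code, "Unknown")


def run_check(command):
    """Run a check command and return (status name, captured output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return status_name(result.returncode), result.stdout.strip()


# Stand-in for a monitoring script: prints a message and exits with code 2
demo = [sys.executable, "-c",
        "import sys; print('sendmail has some traffic'); sys.exit(2)"]
print(run_check(demo))
```

The stand-in command only demonstrates the idea of pairing a diagnostic message with a status code; the real handler scripts take the status as an option, as the example in the text shows.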
153. Warning and Critical thresholds. If the thresholds that you set are exceeded, then up.time generates an alert. For more information, see Configuring Warning and Critical Thresholds on page 144.
• MySQL Port: The number of the port on which the MySQL instance is listening. The default is 3306.
• Username: The user name that is required to log into the MySQL instance.
• Password: The password that is required to log into the MySQL instance.
• Uptime: The number of seconds that MySQL has been running.
• Questions: The number of queries that have been sent to the database.
• Slow Queries: The number of queries that take longer than long_query_time to complete.
When started with the --log-slow-queries[=file_name] option, MySQL writes a log file containing all SQL statements that took more than the long_query_time to execute. The time taken to acquire the initial table locks is not counted as execution time. If the file_name value is not specified, the information is written to a file with the name of the host machine along with the suffix -slow.log. If a file name is given, but not as an absolute path name, the file is written to the default MySQL data directory.
You can use the --log-queries-not-using-indexes option to log queries that do not use indexes to the slow query log. Queries handled by the query cache are not added to the s
154. WebSphere Monitors ..... 215
ESX Workload ..... 217
Configuring ESX Workload Monitors ..... 217
ESX Advanced Metrics ..... 220
Configuring ESX Advanced Metrics Monitors ..... 220
Web Application Transactions ..... 223
Using the Web Application Transaction Monitor ..... 223
Configuring Web Application Transaction Monitors ..... 224
Viewing and Diagnosing Web Transaction Performance ..... 227
Using Web Transaction Performance in SLA Reports ..... 228
Email Delivery Monitor ..... 230
Configuring Email Delivery Monitors ..... 230
Diagnosing and Reporting Email Delivery Problems ..... 233
Splunk Query ..... 236
Configuring Splunk Query Monitors ..... 236
Live Splunk Listener ..... 238
Before You Begin ..... 239
Configuring the Live Splunk Listener Monitor ..... 242
Database Monitors
MySQL Advanced Metrics ..... 244
Configuring MySQL Advanced Metrics Monitors ..... 244
MySQL Basic Checks
155. You can also change the position of the legend and manipulate its size and format.
• Panel subtab: Enables you to add, delete, and change the graph's background, add images or color, and apply logos to customize the look of your graph.
• Paging subtab: Enables you to define the number of pages that your graph contains, choose to display a numeric index, and determine the number of data points that will be displayed on each page.
• Walls subtab: Enables you to adjust the left, right, bottom, and back walls of your graph.
• 3D subtab: Enables you to apply the following effects to graphs:
  • rotation, elevation, and zoom to adjust the depth of the graph
  • horizontal and vertical offsets
  • changes to perspective

Working with Trend Lines
A trend line is a line on a graph that indicates a statistical trend. Typically, a trend line connects multiple points on a graph. A trend line extends into the future, and you can use it to identify current and potential increases or decreases in server performance.
You can create a trend line when you need to clarify graphed information. A trend line can help you obtain a comprehensive view of the data and pinpoint any tendencies in server performance. The following image illustrates a trend line:
[Figure: graph with a trend line]
156. a particular server and not the network infrastructure. This theory can be further investigated by looking at the performance metrics for the server in question.
Use the Web Application Transaction monitor's playback script to verify which servers are being used during a problem checkpoint. In the Service Instances panel, click the monitor to view the script, then locate the system that is being accessed (e.g., with GET and POST commands). Use this as an investigative starting point: although an application or Web server is often referenced in the script, the problem may be found deeper in the application stack (e.g., a database server to which the referenced Web server makes calls during the checkpoint).

Using Web Transaction Performance in SLA Reports
Your Web applications will typically call on systems on the application and database tiers, as well as make use of internal and external-facing network devices. Since the Web Application Transaction monitor directly reports on the performance of a Web transaction, it in effect indirectly reports on the health of your IT infrastructure as a whole. This broad reporting coverage makes the Web Application Transaction monitor an ideal monitor to include in service level agreement reports. For more information on SLA reports, see Reports for Service Level Agreements on page 453.
157. a service.
The following image illustrates the Monitor Timing Settings area of the monitor template:
[Figure: Timing Settings area, showing the Monitored checkbox (selected), Timeout 60 sec, Check Interval 10 min, and the Re-Check Interval field]
The monitor timing settings enable you to set up a master service monitor that you can apply to multiple systems. You can do this when setting up a deployment where you may want to apply a service monitor to a large number of entities, or want to apply a very similar service monitor and then make further customizations to it and its children.

Timing Settings Options
The following options are available in the Timing Settings area:
• Monitored: Turns a monitor on or off. The Monitored setting is on by default.
• Timeout: How long a monitor runs before up.time issues an error message. A timeout occurs when the Monitoring Station has not received a status from the named service monitor after a period of time has passed. When a service monitor does not return data, the status of the monitor changes to Unknown. When a service monitor times out, an error message appears on the Global Scan panel.
• Check Interval: How frequently the monitor checks the status of an entity. The minimum check interval is one minute and the default is 10 minutes. There is no maximum check interval.
• Re-Check Interval: The amount of time between checks. A recheck sho
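The timing options described above can be modelled as a small settings object. The sketch below is a hypothetical Python illustration, not up.time's data model: the class and field names are invented for the example, the 60-second timeout is simply the value shown in the template illustration, and rejecting non-positive timeouts is an assumption.

```python
from dataclasses import dataclass


@dataclass
class TimingSettings:
    """Hypothetical container for a monitor's timing settings."""
    monitored: bool = True        # the Monitored setting is on by default
    timeout_sec: int = 60         # value shown in the template illustration
    check_interval_min: int = 10  # documented default check interval

    def validate(self):
        # The documented minimum check interval is one minute; there is no maximum.
        if self.check_interval_min < 1:
            raise ValueError("check interval must be at least 1 minute")
        # Assumption: a timeout must be a positive number of seconds.
        if self.timeout_sec <= 0:
            raise ValueError("timeout must be positive")


settings = TimingSettings(check_interval_min=5)
settings.validate()
print(settings)
```

Keeping the defaults in one place mirrors the idea of a master service monitor whose settings are applied to many systems and then customized.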
158. a waiver by that party as to subsequent enforcement of rights or subsequent actions in the event of future breaches.
8.6 Amendment. Uptime reserves the right, in its sole discretion, to amend this Agreement from time to time. If there is a conflict between this Agreement and the most current version of this Agreement posted at www.uptimesoftware.com, the most current version will prevail. If you do not accept amendments made to this Agreement, then this license will be immediately terminated pursuant to Section 4.
8.7 Taxes. You shall, in addition to the license fees required under this Agreement, pay all applicable sales, use, transfer, or other taxes and all duties, whether national, provincial, or local, however designated, that are levied or imposed by reason of the transaction contemplated under this Agreement, excluding income taxes on the net profits of Uptime. You shall reimburse Uptime for the amount of any such taxes or duties paid or incurred directly by Uptime as a result of this transaction.

Index
A
acknowledging alerts 112
Action Profiles 389
action profiles
  creating 391
  editing 395
  viewing 395
Active Directory
  authentication 349
  monitor 187
adding
  alert settings 149
  Applications 101
  Distribution Lists 344
  groups 105
  monitor information 142
  multiple systems 92
  nested groups 106
  nested views 109
  Notificatio
159. abelled ACK appears in the Service Status section of Global Scan. When the current status of a monitor is acknowledged, it appears in the ACK column instead of in the WARN or CRIT column.
You can enable or disable status acknowledgement (i.e., add or remove the ACK column from the status tables) through the following parameter (the default value is shown):
acknowledgedSeparate=false

When performance and availability graphs are generated, the Graph Editor is used to manipulate the appearance of graphed data (see Using the Graph Editor on page 482). Transformations from a three-dimensional perspective are possible if the user account permits it (see Adding Users on page 337) and the user is connecting to the Monitoring Station using Internet Explorer. This 3D presentation option can be disabled outright.
You can determine whether ActiveX graphs are displayed in 3D for users with Internet Explorer through the following parameter (the default value is shown):
default3DGraphs=true

Custom Dashboard Tabs
Custom dashboards can be added to My Portal to display custom content that is relevant to the particular user who is currently logged in. Up to 50 dashboards can be added, each of which is accessed through, and viewed in, its own tab at the top of My Portal.
A custom dashboard tab is configured by pointing up.time t
160. accept the default location (for example, /usr/local/uptime/datastore on Red Hat and SLES).
• Type a new location at the command prompt (for example, /opt/uptime/datastore), then press Enter. This should be the full path to the DataStore.
Because the DataStore can grow very large (in excess of 100 GB), you can install the DataStore in another folder on the file system if you are monitoring a large number of systems and retaining data for extended periods.
9. Do one of the following to specify the basic up.time configuration information:
• Press Enter to accept the default for each option that is listed below.
• Type new information for each of the following options:
  • Web Server Name: The name of the computer that is hosting the Web server. This name is written to the file httpd.conf, which contains configuration information for the Web server used by up.time.
  • Web Server Port: The number of the port on which the Web server for the Monitoring Station will listen for requests. The port number is written to the file httpd.conf.
  • up.time email address: The email address from which the Monitoring Station will send alerts and reports to users.
  • DataStore Port: The number of the port on which the DataStore (the up.time database) will listen for requests. The port number is written to the file uptime.conf.
10. On the Install Summary pa
161. ache hits to determine the number of query results taken directly from the cache instead of executing them. When this number is exceeded, up.time generates an alert.
This metric shows the number of query results taken directly from the query cache instead of executing them. You should compare the value of QCache Hits to the total number of your SELECT queries to determine the current hit rate. Then you can increase or decrease the query cache size to find the value which provides optimal performance.
• QCache Lowmem Prunes: The number of QCache_lowmem_prunes that can be deleted from the cache because of low memory.
This variable counts the number of queries that have been removed from the cache to free up memory for caching new queries. The query cache removes the least recently used queries from the cache.
• QCache Not Cached: The maximum number of queries that are not cached.
• QCache Free Memory: The amount of free memory for the query cache.
• QCache Free Blocks: The number of free memory blocks in query cache.
• QCache Table Blocks: The amount of query cache memory fragmentation.

Response Time
Enter the Warning and Critical Response Time thresholds for the overall time required to perform a service check. For more information, see Configuring Warning and Critical Thresholds on page 144.
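The hit-rate comparison suggested for QCache Hits can be worked through with a small calculation. The Python sketch below is an illustration only; the function name is hypothetical, and it assumes that the MySQL status variable Com_select counts the SELECT statements that were executed rather than served from the query cache, so that the total number of SELECT queries is approximately Qcache_hits + Com_select.

```python
def query_cache_hit_rate(qcache_hits, com_select):
    """Return the fraction of SELECT queries answered from the query cache.

    Assumes total SELECTs = qcache_hits + com_select (see the lead-in).
    """
    total_selects = qcache_hits + com_select
    if total_selects == 0:
        return 0.0  # no SELECT queries yet, so no meaningful hit rate
    return qcache_hits / total_selects


# Example values (invented for illustration): 750 cache hits, 250 executed SELECTs
print(query_cache_hit_rate(750, 250))
```

Tracking this ratio before and after changing the query cache size is one way to judge whether the change improved performance.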
162. achine

Installing Agents on Windows
The installer for Windows up.time agents uses a wizard that guides you through the installation process.
Note: If the Windows installer requires unavailable service packs (for example, SiteServer or Terminal Server), send an email to support@uptimesoftware.com and request the extracted agent, which can be installed without using the Windows installer.
To install an agent on Windows, do the following:
1. Copy the installer (setup.exe) for the Windows agent to the system on which you want to install the agent.
2. Log in to the Monitoring Station as the local administrator.
up.time may not function properly if the Monitoring Station is installed when you are logged in as a domain or non-local administrator.
3. In Windows Explorer, double-click the file setup.exe.
4. On the installer Welcome screen, click Next.
5. On the Select Installation Folder screen, type the path to the folder in which you want to install the agent in the Folder field. Alternatively, click the Browse button and use the dialog box that appears to search for the folder.
6. Select the checkbox Make available for Everyone op
163. acle Tablespace Check Monitors
To configure Oracle Tablespace Check monitors, do the following:
1. In the Oracle Tablespace Check monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141.
2. Complete the following fields:
• Port: The number of the port on which the Oracle service is listening. The default is 1521.
• Username: The user name that is required to log in to the Oracle database.
• Password: The password that is required to log in to the Oracle database.
• SID: The Oracle System Identifier (SID) that identifies the Oracle instance. The SID defaults to the database name.
The SID is a unique name for an Oracle instance, used to switch between Oracle databases. The SID is included in the CONNECT_DATA paths of the connect descriptors in the tnsnames.ora file. As well, the SID is in the definition of the TNS listener in the listener.ora file.
If you do not complete the Username, Password, and SID fields, up.time will attempt to connect to the database. If the connection fails, the database returns a SQL exception error.
• Full Warning Threshold (Mandatory): Enter a value that will change the status of the Oracle Tablespace Check from OK to Warning. The warning threshold should be a perce
164. administrators. See Reports for Service Level Agreements on page 453 for more information.

Adding and Editing SLA Definitions
Adding and using an SLA requires that you first define the SLA, then add one or more SLOs to it.
When you create an SLA, it will be inserted into the current compliance period. For example, a newly created SLA that reports over a monthly compliance period will, if created on the 15th of the month, already be around 50% through the period.

Adding a Service Level Agreement
To add a service level agreement to up.time, do the following:
1. In the My Infrastructure panel, click Add Service Level Agreement.
The Add Service Level Agreement window appears:
[Figure: Add Service Level Agreement window, with the fields Name of Service Level Agreement, Description of Service Level Agreement, Parent Group, Monitoring Period, Target Percentage (99.0), Compliance Period Type (Weekly), and Scheduled Maintenance]
Enter a descriptive name for the SLA in the Name of Service Level Agreem
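The compliance-period example above (an SLA created on the 15th of a monthly period is already about halfway through it) can be checked with a short calculation. This Python sketch is illustrative only; the function name is hypothetical, and counting only the full days completed before the creation date is an assumption about how elapsed time might be measured.

```python
import calendar
from datetime import date


def fraction_of_month_elapsed(day):
    """Fraction of the calendar month completed before the given date."""
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    # Count the full days that have passed before `day` (assumption)
    return (day.day - 1) / days_in_month


# An SLA created on 15 June: 14 of 30 days have already passed
print(round(fraction_of_month_elapsed(date(2023, 6, 15)), 3))
```

For a 30-day month the result is a little under one half, which matches the "around 50%" figure in the text.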
165. ail Delivery Monitors
Define the Email Delivery monitor by providing information about the outgoing and incoming mail servers.
1. Complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141.
Once created, the Email Delivery monitor service can be included with status reports for the system or group you select. If this monitor is reporting outgoing mail delivery times, the system should be a monitored SMTP server; if incoming mail delivery times are being measured, the system should be a monitored POP3/IMAP mail server.
2. Complete the Outgoing Email Settings:
• SMTP Hostname: Provide the name or IP address of the SMTP server.
• SMTP Port: Provide the port used to communicate with the SMTP server. Leave this field blank to use the default SMTP port (25).
• SMTP Username: Provide the authenticated SMTP user name.
• SMTP Password: Provide the authenticated SMTP user password.
• SMTP Uses SSL: Specify whether the SMTP server sends and receives encrypted communication using SSL.
• Destination Email Address: Enter the test email address used by the monitor. The monitor sends an email to this address, and this address is checked for receipt of the test email.
Although the Email Delivery monitor att
166. [Figure: lists of groups and views available for selection]

up.time Tool Bar
The up.time tool bar provides quick access to the following panels:
• Global Scan
• My Portal
• My Infrastructure
• Services
• Users
• Reports
• Config

Global Scan
The Global Scan panel provides information about the status of your resources. You can drill down by system group, system, or alert status to manage the resources in your infrastructure. For more information about using the Global Scan panel, see Overseeing Your Infrastructure on page 115.

My Portal
When you log into up.time, the first screen you see is the My Portal panel. The My Portal panel gives quick access to basic up.time functions and to saved reports. The My Portal panel is divided into the following sections:
• Assistance
• My Preferences
• Latest News
• My Reports
For more information about using the My Portal panel, see Using My Portal on page 61.

My Infrastructure
The My Infrastructure panel provides an inventory of your network resources. You can view information about systems and their monitoring status. From the My Infrastructure panel, you can add and view:
• Systems
167. all Elements that are a part of your infrastructure. Some users may, for example, only need to be interested in five to 10 of the available servers. You can limit the servers that one or more users will see by creating specific views, which are subsets of the servers in your environment. By creating views, it becomes easier for users to not only monitor systems, but to also browse and compare historical data.
Views appear in the Views section on the Infrastructure panel, as well as the Global Scan panel.

Adding Views
To add a view, do the following:
1. In the Infrastructure panel, click Add View.
2. In the Add View window, enter a descriptive name in the View Name field. This name will appear when listing views in the Infrastructure panel.
3. Optionally, enter a description in the View Description field.
4. To make this view a child of an existing one, select it from the Parent View dropdown list. If this is the first view that you have defined, only My Infrastructure will appear in the dropdown list.
5. To give this view its own child views, select one or more entries from the Available Element Views list, then click Add.
6. Select one or more Elements from the Available Elements list, then click Add. If you have combined your Elements into groups, select a group from the dropdown at the top of the list. Or select All from the dropdown to display all of the Elements in your environment.
7. Select one or more users from the Availab
168. ame. For example, you can assign the display name Toronto Mail Server to a system with the host name 10.1.1.6.

Variable: Definition
$DATETIME$: The date and time at which the alert was generated. This appears in the subject line of the message.
$SERVICENAME$: The name of the service, along with the name of the host, for which the alert was generated. For example, if the alert was generated by the ping check for the server MailHub, then PING MailHub appears in the alert. This appears in the subject line of the message.
$SERVICESTATE$: One of the following: OK, WARN, CRIT, MAINT, UNKNOWN. This appears in the subject line of the message.
$DATE$: The date on which the alert was generated.
$TIME$: The time at which the alert was generated.
$HOSTNAME$: The name of the host, as saved in up.time, for which this alert was generated.
$HOSTSTATE$: The status of the host, which can be one of the following: OK, WARN, CRIT, MAINT, UNKNOWN.
$TYPE$: The type of notification, which can be one of the following: Problem, Recovery.
$OUTPUT$: The output of the monitor that generated the alert. For example: Ping completed 1 sent 100 0 loss
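How such variables might be expanded in a custom alert format can be sketched in a few lines of Python. This is an illustration, not up.time's implementation: the function name `render_alert` is hypothetical, the sample template and values are invented, and the sketch assumes that variables are written between dollar signs (for example `$SERVICENAME$`). Variables with no supplied value are left untouched.

```python
import re


def render_alert(template, values):
    """Replace $NAME$ placeholders in a template with values from a dictionary."""
    def substitute(match):
        name = match.group(1)
        # Leave unknown variables as they are so the gap stays visible
        return str(values.get(name, match.group(0)))

    return re.sub(r"\$([A-Z]+)\$", substitute, template)


# Invented sample values modelled on the definitions above
values = {
    "TYPE": "Problem",
    "SERVICENAME": "PING MailHub",
    "SERVICESTATE": "CRIT",
    "HOSTNAME": "MailHub",
}
subject = "$TYPE$: $SERVICENAME$ is $SERVICESTATE$ on $HOSTNAME$ at $TIME$"
print(render_alert(subject, values))
```

Because `$TIME$` has no value in the sample dictionary, it appears unchanged in the rendered subject line.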
169. ame> is the name of the archive that contains the agent that you are installing (e.g., uptmagnt AIX <version>.tar). • Run the following command to install the agent: INSTALL.sh. Installing the agent on a pSeries server without an HMC: Before you can monitor the logical partitions on an IBM pSeries server, you must install an agent on each partition. Use the following instructions to install the agent on an IBM pSeries LPAR that is not managed by an HMC, but whose partitions are managed by the Integrated Virtualization Manager (IVM). To install the agent, do the following: 1. If Linux is running on the LPAR, do the following: • Log into the LPAR as root. • Copy the RPM file containing the agent to the LPAR. • Run the following command: rpm -i <agent name>.rpm Where <agent name> is the name of the rpm file for the agent that you are installing (e.g., UptimeAgent Linux <version>.rpm). Note: If you are using SUSE Linux Enterprise Server 9, you must update the kernel to the latest version using the YaST package manager. If you do not upgrade the kernel, the agent will not be able to gather workload data. 2. If AIX is running on the LPAR, do the following: • Log into the LPAR as root. • Copy the archive containing the agent to the LPAR. • Extract the contents of the archive using the following command: tar xvf <agent name> Where <
170. an influence the response time, including network connectivity, the type of information that is being collected, and the availability and performance of the service. Configuring Response Time: To configure response time, do the following: 1. For each threshold, select an option from the Select a comparison method dropdown list, as illustrated below. [Figure: Response time fields, with a Select a comparison method dropdown list and a value in ms for each of the Warning and Critical thresholds.] 2. Enter a Warning threshold in milliseconds. For information on configuring Warning thresholds, see Configuring Warning and Critical Thresholds on page 144. 3. Enter a Critical threshold in milliseconds. For information on configuring Critical thresholds, see Configuring Warning and Critical Thresholds on page 144. Note: If you select a comparison method, you must enter a value in the corresponding field for the threshold. Monitor Timing Settings: Monitor timing settings determine: • whether or not the monitor is active • the length of time, in seconds, to wait before determining that a monitor has timed out • the interval, in minutes, at which the monitor will perform a service check • the interval, in minutes, at which the monitor will recheck the status of a service • the maximum number of times that the monitor will recheck
171. anced Metrics • Uptime Agent • Windows Event Log Scanner • Windows Service Check. Using Agentless Monitors: Agentless monitors do not require an up.time agent to be installed and running on the system that you want to monitor. Your Monitoring Station communicates with the remote system to: • determine the status of the service that is being monitored • collect information from the service that is being monitored. The monitors that do not require an agent are: • Active Directory • POP (Email Retrieval) • DNS • SSH (Secure Shell) • FTP • SMTP (Email Delivery) • HTTP (Web Services) • SNMP • IMAP (Email Retrieval) • SQL Server Advanced Metrics • LDAP • MySQL Advanced Metrics • MySQL Basic Checks • SQL Server Basic Checks • SQL Server Tablespace Check • NFS • Sybase • NIS/YP • TCP • WebLogic • NNTP (Network News) • Oracle Advanced Metrics • WebSphere • ESX Workload • ESX Advanced Metrics • Windows File Shares (SMB) • Oracle Basic Checks • Oracle Tablespace Check • Ping. Using Advanced Monitors: You can configure monitors to carry out service or performance checks that may be specific to your environment. Using advanced monitors, you can: • monitor any service that does not have an up.time service monitor • monitor the perfo
172. anel and uptime.conf file is described in the following sections: Overview 528; Interfacing with up.time 532; Archiving the DataStore 545; up.time Diagnosis 551; up.time Measurement Tuning 554; Report Storage Options 558; Resource Usage Report Generation 560; Monitoring Station Interface Changes 561; License Information 563. Overview: up.time includes user-definable parameters that can control some aspects of its behavior, including the following: • Database Settings • Mail Server Settings • Global Scan threshold settings • Resource Scan threshold settings • Proxy settings • Remote reporting settings • RSS feed settings • Splunk integration settings • Web monitor settings. From a configuration perspective, there are two types of parameters: • parameters whose modification does not require a restart of the Core service (also known as the up.time Data Collector service); these parameters can be modified in up.time on the Config panel • parameters whose modification requires a restart of the Core service; these paramet
173. appears in the subpanel. Click the Edit icon beside the name of the server whose host check you want to change. A list of the available host checks appears in a new window. Select a host check and then click Save. The Platform Performance Gatherer: The Platform Performance Gatherer is a host check that collects basic performance metrics (for example, CPU performance and disk statistics) from a system in order to determine whether or not that system is functioning. You can edit the following monitor settings for the Platform Performance Gatherer from the Info & Rescan subpanel. Editing the Platform Performance Gatherer: To edit the Platform Performance Gatherer settings, do the following: 1. In the Global Scan or My Infrastructure panels, click the name of a server. 2. Click the Info tab and then click Info & Rescan. 3. Click the Edit Performance Monitor link that is beside the Monitoring Interval setting, as shown below. [Figure: Info & Rescan subpanel showing a Monitoring Interval of 5 min with the Edit Performance Monitor link.] The Edit Service Monitor window appears. 4. Edit the settings for the Platform Performance Gatherer. While you can edit any setting, the settings that you are most likely to change are: • Port Number: The number of the port on which the Platform Performance Gatherer is collecting data from a host. For most systems, this setting is labell
174. application • SLO 3 email service. Viewing Service Level Agreements: Service level agreements and the type of information displayed are viewed in the Global Scan panel from a monitoring perspective, and in My Infrastructure from a configuration perspective. Viewing SLA Status: You can view the status of all your SLAs in the Service Level Agreements subpanel, which can be accessed by clicking the View SLAs tab when you are in the Global Scan panel. [Figure: Service Level Agreements subpanel in the Global Scan panel. The first SLA is at 63% of its compliance period, with 100% of allowable downtime used and 89.47% of target 99.0% (CRIT: The allowable downtime has been exceeded by 15 minutes). Enterprise Application SLA is at 63% of its compliance period, with 66% of allowable downtime used and 98.96% of target 99.0% (WARN: At the current rate this SLA will breach after 10 more minutes of downtime). Customer Service SLA is at 63% of its compliance period.]
175. are issued. The default is 120 minutes. • Alert on Critical: Sends an alert when a monitor reaches a Critical status threshold. • Alert on Warning: Sends an alert when a monitor reaches a Warning status threshold. • Alert on Recovery: Sends an alert when a monitor recovers from a Warning or Critical status. • Alert on Unknown: Sends an alert if any metric or time value for a monitor returns a status of Unknown. Adding Monitor Alert Settings Information: To add monitor alert settings information, do the following: 1. Click the Notification check box to turn on alert notifications. If you do not click the Notification check box, none of the remaining boxes in the monitor alert settings template are active. 2. Enter an amount of time, in minutes, in the Alert Interval field. The alert interval is the frequency at which an alert is repeated if a monitor does not have an OK status. 3. Click one or more of the following checkboxes: • Alert on Critical • Alert on Warning • Alert on Recovery • Alert on Unknown. Monitoring Period Settings: The Monitoring Period settings determine the time periods at which up.time sends alerts. For more information, see Alerts and Actions on page 377. To set the Monitoring Period, do the following: 1. Select one of the following option
176. as a UI instance must have the same database settings as the data-collecting Monitoring Station. See Database Settings on page 532 for more information. Scrutinizer Settings: Scrutinizer is a NetFlow analyzer that can be installed to monitor network traffic managed by compatible switches and routers. Scrutinizer can be integrated with Global Scan, as well as up.time's graph generation for node-type Elements and other hosts that are also monitored with Scrutinizer. In order to access Scrutinizer, up.time needs to be pointed to your installation. Modifying the Scrutinizer Settings: You can configure Scrutinizer's integration with up.time through the following parameters: • netflow.enabled: Determines whether Scrutinizer is integrated with the Monitoring Station. • netflow.hostname: The host name or IP address of your Scrutinizer installation. • netflow.port: The HTTP port through which Scrutinizer sends and receives communication. • netflow.username: The user name required to log in to Scrutinizer. • netflow.password: The password required to log in to Scrutinizer. Splunk Settings: Splunk is a third-party search engine that indexes log files and data from the devices, servers, and applications in your network. Using Splunk, you can quickly analyze your logs to pinpoint problems on a server or in a network, or ensure that you are in comp
177. asured (i.e., weekly or monthly) • Service Level Objectives: a listing of the SLOs into which the SLA's services have been organized. For more information about system information in general, see Viewing System Information on page 50. You can view information about the services that make up the SLA by clicking the Services tab in the Tree panel. The options available in the Tree panel are summarized in Viewing Service Information on page 52. Clicking the Graphing tab in the Tree panel, then clicking Current Status, displays a verbose status summary of the SLA that includes the following: • Trend Analysis: SLA status indicator for the current compliance period • Compliance Period and Allowable Downtime Used: the current progress through the compliance period, and how close the SLA is getting to reaching a critical state • Achieving SLA: how close the SLA is to its performance target; how recoverable a failing SLA is, based on how far it is from its target • Achieving SLOs: an SLO-level breakdown of how well or poorly each SLO is meeting its performance target; how recoverable failing SLOs are, based on how far they are from their targets. See A Note About SLOs and Compliance on page 365 for more information about SLOs and the
178. ata from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside each of the metrics that you selected in step 2. 4. Complete the following settings: • Timing Settings: see Adding Monitor Timing Settings Information on page 148 for more information. • Alert Settings: see Monitor Alert Settings on page 148 for more information. • Monitoring Period settings: see Monitor Timing Settings on page 146 for more information. • Alert Profile settings: see Alert Profiles on page 381 for more information. • Action Profile settings: see Action Profiles on page 389 for more information. 5. Click Finish. Configuring Exchange Monitors: To configure an Exchange monitor for your Microsoft Exchange 2007 or 2010 server, do the following: 1. Complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following settings by clicking the checkbox beside each option and then specifying a warning and critical threshold. If the thresholds that you set are exceeded, then up.time generates an alert. For more information, see Configuring Warning and Critical Thresholds on page 144. • SMTP Bytes Sent Per Second: The total number of bytes sent per second by the Exchange SMTP server.
179. atform Performance Gatherer 157; Topological Dependencies 159; Adding Topological Dependencies 160; Viewing Topological Dependencies 160; Scheduling Maintenance 161; Creating Scheduled Maintenance Profiles 161; Viewing Scheduled Maintenance Profiles 162; Scheduling Maintenance for a Host 162; Scheduling Maintenance for a Service 163; Agent Monitors 166; File System Capacity 167; Configuring File System Capacity Monitors 167; Performance Check 170; Configuring Performance Check Monitors 170; Process Count Check 174; Configuring Process Count Check Monitors 174; Microsoft Windows Monitors; Windows Event Log Scanner 178; Configuring Windows Event Log Scanner Monitors 178; Windows Service Check 182; Configuring W
180. ating an extensive report for a large group of Elements can take several minutes. If exhaustive report generation is necessary but taking too long, you can increase the number of report images (the default being 6) that up.time concurrently generates for this type of report. Note that the default number is optimal in most cases; increasing the amount may improve performance, but the law of diminishing returns applies, as too many concurrent threads can tax the PDF generation process overall. This is configured through the following uptime.conf parameter: reporting.prefetch.images.threads=6. Monitoring Station Interface Changes: Some configuration options affect the Monitoring Station interface. These can be modified by manually inputting settings in the up.time Configuration panel, as outlined in Modifying up.time Config Panel Settings on page 529. Status Alert Acknowledgement: When services reach a warning or critical state, administrators can flag an alert as acknowledged, which prevents subsequent alerts from being broadcast, giving them time to investigate the issue. See Acknowledging Alerts on page 112 for more information. Service status alert acknowledgements can be reported in the status tables on the Global Scan panel. By default, status alert acknowledgement counts are not shown; if enabled, a new column l
181. ation on page 148 for more information. • Alert Settings: see Monitor Alert Settings on page 148 for more information. • Monitoring Period settings: see Monitor Timing Settings on page 146 for more information. • Alert Profile settings: see Alert Profiles on page 381 for more information. • Action Profile settings: see Action Profiles on page 389 for more information. Click Finish. Web Application Transactions: A Web transaction is a series of Web pages that together fulfill a specific function for end users. A common Web transaction example is the checkout process on an e-commerce site, during which end users select a shipping option, pay for their items, and have their credit card verified. During this transaction, many calls are made to the application and data layers as the end user provides, and the servers process, information. Although the type of Web application that is monitored by up.time users is typically different (e.g., intranet applications), the structure of the transaction is the same: an end user steps through a sequence of Web pages that take inputted information and initiate appropriate actions with application or database servers. The up.time Web Application Transaction monitor tests the speed and availability of an end-user Web transaction. Specifically, the Web Application Transaction monitor performs two roles: it
182. ation Enabled check box. All user synchronization configuration options appear. 3. In the Synchronize Users field, enter the frequency at which up.time user information will be synchronized with the Active Directory listing. By default, synchronization occurs every hour. 4. In the AD Group Distinguished Name field, enter the name of the AD group of up.time users (e.g., CN=uptime users,CN=Groups,DC=yourdomain,DC=com). 5. If required, enter an appropriate administrative AD Username and AD Password required to access the directory. 6. In the User Name field, provide the name attribute used to retrieve the user name (e.g., sAMAccountName). For AD synchronization, a user name is the minimum amount of directory information up.time needs to map to a user profile. 7. For the remaining Field Mappings, provide attributes for other user details you would like to synchronize with the up.time user profile: i. First Name (e.g., givenName) ii. Last Name (e.g., sn) iii. Location (e.g., physicalDeliveryOfficeName) iv. Email Address (e.g., userPrincipalName) v. Pager/Cellphone vi. User's Windows Desktop Host Name vii. User's Windows Desktop Workgroup. Any user attributes chosen to be synchronized with the directory will not be editable in up.time. 8. Select a User Role to which any newly detected users wi
183. ation travelling between an SNMP instance that is using version 3 of SNMP and up.time. 5. Complete the following fields: • Warning and Critical Thresholds: Enter the Warning and Critical thresholds for each OID that you added using the SNMP MIB Browser. For more information, see Configuring Warning and Critical Thresholds on page 144. Each OID has one or more settings associated with it, as shown in the following image. [Figure: OIDs 1.3.6.1.6.3.1.1.4.1 (snmpTrapOID), 1.3.6.1.6.3.1.1.6.1 (snmpSetSerialNo), and 1.3.6.1.6.3.1.2.2.5 (snmpSetGroup), each with Warning and Critical Select a comparison method dropdown lists.] • Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. Complete the following settings: • Timing Settings: see Adding Monitor Timing Settings Information on page 148 for more information. • Alert Settings: see Monitor Alert Settings on page 148 for more information. • Monitoring Period settings: see Monitor Timing Settings on page 146 for more information.
184. ations that are installed on the systems that you are monitoring. Agents do the following: • collect information from a remote server • send the collected service data to the Monitoring Station. Certain up.time monitors poll the agents for data at a frequency that you can configure. The data collector component of the Monitoring Station then stores the results in the up.time DataStore for use in a report or graph. Agents enable you to collect very detailed information about a system, such as information about processes and low-level system statistics. The level of granularity of the information collected by agents is greater than that of the information collected by agentless monitors. Each up.time agent is configured by default to collect and return performance information for every up.time agent service monitor. You do not need to configure the agent to collect information for a service. On Windows, an agent is installed with the up.time Monitoring Station. However, you will need to deploy the agent on the systems you are monitoring. On other operating systems, you must download the agent from the uptime software Web site and manually install it. Understanding Major and Minor Versions: When you install up.time, you install a Monitoring Station and one or more up.time agents. You could have different versions of Monitoring Stations and agents. For example, you could have different platforms and different up.time agent versions running on eac
185. atistic returns a Good status, then the storage system is experiencing reads or writes and there are no pending disk I/Os. If the status is Suspect, the storage system has disk I/Os pending, no reads or writes have occurred, and fewer than four samples have been taken. If the status is Bad, the storage system has disk I/Os pending, no reads or writes have occurred, and four or more samples have been taken. Adding Multiple Systems: It can be time consuming to add large numbers of systems to up.time using the Web interface. You can, however, add multiple systems to up.time using the addsystem command line tool and a text file. A text file, called a hosts file, contains entries which mirror the fields in the Add System window of the up.time Web interface. These fields contain information about the systems that you want to add. See and for more information. You can find examples of entries in a hosts file in the section Examples of Hosts File Entries on page 97. In the hosts file: • The information for each host consists of name-value pairs. Each name-value pair is on a separate line, and the name and value are separated by a colon. For example: Group: Solaris Servers • The information for each host is separated by two percentage signs on a new line. Creating a Hosts File: There are a number of ways in which you can create a hosts file. The simplest way is to use a text editor to type the entries in a file. If you have a large number o
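The hosts file layout described above (one name-value pair per line, split on a colon, with hosts separated by a line of two percentage signs) can be sketched in Python as follows. This is only an illustration of the format, not the addsystem tool itself: `parse_hosts_file` is a hypothetical helper, and the `Host Name` field and sample host names are assumed for the example; only the `Group: Solaris Servers` pair comes from the guide.

```python
def parse_hosts_file(text: str) -> list[dict[str, str]]:
    """Parse hosts-file text into one dict of name-value pairs per host."""
    hosts: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line == "%%":
            # Two percentage signs on their own line end the current host.
            if current:
                hosts.append(current)
            current = {}
        elif line:
            # Split on the first colon only, so values may contain colons.
            name, _, value = line.partition(":")
            current[name.strip()] = value.strip()
    if current:
        hosts.append(current)
    return hosts

# Hypothetical sample; "Host Name" is an assumed field name.
sample = """Host Name: mailhub
Group: Solaris Servers
%%
Host Name: webserver01
Group: Solaris Servers
"""

hosts = parse_hosts_file(sample)
print(hosts)
```

Generating such a file from an existing inventory is the reverse operation: write each pair as `name: value` and emit `%%` between hosts.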
186. bLogic 8: In order for up.time to collect information from a WebLogic 8.1 server, the file weblogic.jar must be deployed on the Monitoring Station. To deploy the weblogic.jar file, do the following: 1. Locate the weblogic.jar file on the WebLogic server. The file is located in the lib folder in the directory in which WebLogic is installed. For example, on Windows the default folder is C:\bea\weblogic81\server\lib. 2. Copy the file to the externaljar directory on the Monitoring Station. For example, on Windows copy the file to the following directory: C:\Program Files\uptime software\uptime\externaljar. Note: Users who deployed WebLogic monitors from up.time 5.0 or earlier for their WebLogic 8.1 server applications should note that the monitor was renamed to WebLogic 8 starting with up.time 5.1. The WebLogic monitor is used to monitor WebLogic 9, 10, or 11. Configuring WebLogic 8 Monitors: To configure WebLogic 8 monitors, do the following: 1. In the WebLogic 8 monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: • WebLogic Port: The number of the port on which the WebLogic server is listening. The default is 7001. • Username: The user name that is required to log into the WebLogic server. • Pass
187. base to allow logins with a user name and password, and you specify the script file but no login information, the script will fail and an error message appears in the Global Scan panel. The script will run if you have configured your database to allow logins without a user name and password. • Script: Click the Script checkbox and then type or copy the script that you want up.time to run against the database into this text box. Use this option if you do not have access to the file system on the Monitoring Station, or if your script is short or will not regularly change. • Match: The value to match the script results against, which can be either a string or a regular expression. For more information, see Comparison Methods on page 143. For example, you can enter the following in the Match text box: ^OK.* Where: • ^ means start the match at the beginning of the line • OK is the pattern to match • .* is the pattern to match anywhere on the line. The value that your script returns can be a string that you can match to. If the result matches the value that you checked for, the status of the service monitor is OK. Otherwise, the status of the service monitor is Critical. • Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Click the Save for Graphing checkb
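The Match behavior described above (a match gives an OK status, anything else gives Critical) can be sketched with Python's `re` module. This is a stand-in for illustration only: the guide does not say which regular expression engine up.time uses, and `status_from_output` and the sample script outputs are hypothetical.

```python
import re

def status_from_output(output: str, pattern: str) -> str:
    """Return "OK" if the pattern matches any line of the output, else "CRIT"."""
    # re.MULTILINE lets ^ anchor at the start of each line of the output.
    return "OK" if re.search(pattern, output, re.MULTILINE) else "CRIT"

# Hypothetical script outputs checked against a pattern anchored at line start.
print(status_from_output("OK 42 rows returned", r"^OK"))
print(status_from_output("ERROR: login failed", r"^OK"))
```

Anchoring the pattern at the start of the line avoids a false OK when the letters appear inside another word, such as "NOT OK" or "BROKEN".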
188. bed in the section Adding Systems or Network Devices on page 69. Otherwise, click Save in the Add System/Network Device window. 7. Repeat steps 5 and 6 for any other systems that you want to add. Using Auto Discovery to Add pSeries Servers Managed by an HMC: The Hardware Management Console (HMC) is an interface for managing and configuring pSeries servers that are hosting multiple logical partitions (LPARs). When an HMC is attached to one or more pSeries servers with LPARs, the servers are considered managed servers. In this configuration, the HMC manages all I/O requests from the LPARs. Use the Auto Discovery feature to detect the managed servers and add them to up.time. Monitoring Through the HMC: up.time polls the agents installed on the VIO and the LPARs on a pSeries server for workload and other data, as illustrated below. [Figure: Auto Discovery between the up.time Monitoring Station and the Hardware Management Console, which manages servers that each host a VIO and LPARs.] In order to monitor the managed servers and their LPARs, up.time must communicate with the HMC. Before up.time can communicate with an HMC, you must enable SSH on the latter. See the uptime software Knowledge Base article entitled Enabling SSH on the Hardware
189. being met. Although the main summary displays the status of the SLA definition as a whole, you can also expand the view to verify how well component service level objectives (SLOs) are meeting targets. SLOs are made up of monitored services that, as a group, are used to measure a specific performance goal. In the Service Level Agreements subpanel (accessed by clicking the View SLAs tab), the following SLA information is provided in the default view: • the list of SLAs, and whether any are in a critical or warning-level state • headway into the time period during which compliance is measured • the percentage of allowable downtime used, after which the SLA's status becomes critical. SLA Status Indicators: The color coding used in the Service Level Agreements subpanel indicates at a glance whether the SLAs' respective limits are in danger of being, or have already been, exceeded. [Figure: Service Level Agreement Status. Email SLA: 63% of compliance period, 100% of allowable downtime used, 89.47% of target 99.0% (CRIT: The allowable downtime has been exceeded by 15 minutes). A second SLA: 66% of allowable downtime used, 98.97% of target 99.0% (WARN: At the current rate this SLA will breach after 10 more minutes of downtime).]
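As a rough worked example of the "percentage of allowable downtime used" figure mentioned above, the following Python sketch assumes that allowable downtime is simply the share of the compliance period left over by the target percentage. The guide does not state how up.time computes this value, so the formula, the function names, and the weekly period are all assumptions made for illustration.

```python
def allowable_downtime_minutes(period_minutes: float, target_pct: float) -> float:
    """Downtime budget for a compliance period, assuming budget = period * (100 - target) / 100."""
    return period_minutes * (100.0 - target_pct) / 100.0

def downtime_used_pct(downtime_minutes: float, period_minutes: float, target_pct: float) -> float:
    """Share of the downtime budget already consumed, as a percentage."""
    allowed = allowable_downtime_minutes(period_minutes, target_pct)
    if allowed == 0:
        raise ValueError("a 100% target leaves no allowable downtime")
    return 100.0 * downtime_minutes / allowed

WEEK_MINUTES = 7 * 24 * 60  # hypothetical weekly compliance period

allowed = allowable_downtime_minutes(WEEK_MINUTES, 99.0)
used = downtime_used_pct(50.4, WEEK_MINUTES, 99.0)
print(allowed, used)
```

Under these assumptions, a 99.0% target over one week leaves roughly 100.8 minutes of budget, so 50.4 minutes of downtime would consume about half of it.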
190. ber of unique users currently logged in to Outlook Web Access. This counter decreases when users manually log out or their sessions time out. • Webmail User Logons Per Second: The number of Outlook Web Access logins or login attempts per second. • RPC Averaged Latency: The average time, in milliseconds, it takes for the last 1,024 packets to be processed. • RPC Operations Per Second: The rate at which RPC operations occur and, implicitly, how many RPC requests are outstanding. • RPC Requests: The number of client requests that are currently being processed by the Exchange store. • Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. To save the data from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside each of the metrics that you selected in step 2. Complete the following settings: • Timing Settings: see Adding Monitor Timing Settings Information on page 148 for more information. • Alert Settings: see Monitor Alert Settings on page 148 for more information. • Monitoring Period settings: see Monitor Timing Settings on page 146 for more information. • Alert Profile settings: see Alert Profiles on page 381 for more information. • Action Profile settings: see Action Profiles on page 389 for more information.
191. bility. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. 3. Click the Show Details option to generate a full listing of information about the availability of the Applications, which is broken down by individual Applications. If you do not select this option, then a summary of the status of all Applications appears on a single line, as shown below. [Figure: Application Availability Report for the date range 2008-04-17 00:00:00 to 2008-04-17 13:44:49, showing the Application Availability Summary.] 4. If you want to generate reports for groups of systems, select the groups from the List of Groups area. 5. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. 6. If you are generating reports for specific Applications in your environment, select them from the List of Applications. 7. Select a report generation option. See Report Generation Options on page 402 for details. 8. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. Incident Priority Report: The Incident Priority report provides
192. ble also lists all changes to the states and substates for services and host checks (for example, from OK to CRIT and then from CRIT to OK). As well, up.time displays a message describing the outage, for example: Socket error has occurred connecting to elinux. Error text: Connection timed out: connect. If you are using the Splunk IT search engine with up.time, the Splunk icon appears beside the names of services that are in WARN or CRIT states. You can click the icon to check the Splunk logs for information about the outage. Availability: Lists the state (OK, WARN, CRIT, MAINT, UNKNOWN) of the monitors that are associated with a specific host or device, as well as: • the amount of time that the services have been in each state, and the total of all times • the percentage of time each service has been in each state. The Availability table is shown below. [Table: Availability As Time, with the columns Monitor, Time OK, Time WARN, Time CRIT, Time MAINT, Time UNKNOWN, and Total Time, and sample rows for the File System Capacity, PING, Plants Response, UPTIME, and WebSphere monitors. Availability As Percent shows the same monitors with each state as a percentage of time.]
193. bpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. CPU Utilization Ratio Report: The CPU Utilization Ratio report charts, in a table, the ratio of the percentage of CPU usage over a specified period of time. The ratio is derived by dividing the percentage of system time that is being used by the percentage of user time. For example, if the amount of system time that is being used is 22.12% and the amount of user time is 5.2%, then the CPU utilization ratio is 4.25. This report contains the following information: • the names of the hosts for which the report has been generated • the percentage of CPU time that is being used to carry out user processes (USR) • the percentage of CPU time that is being used to carry out system processes (SYS) • the CPU utilization ratio for each host, which is derived by dividing SYS by USR. Creating a CPU Utilization Ratio Report: To generate a CPU Utilization Ratio report, do the following: 1. In the Reports Tree panel, click CPU Utilization Ratio. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. 3. If you want the report to only include data from certain hours during the day, select those hours from the dropdown lists in the Daily Hours section, as shown b
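The ratio described above (SYS divided by USR) can be checked with a few lines of code. This is a minimal sketch; the function name is hypothetical, and returning None when there is no user time is an assumption made here to avoid a division by zero, not documented up.time behaviour.

```python
def cpu_utilization_ratio(sys_pct, usr_pct):
    """Return the CPU utilization ratio: system time divided by user time.

    Both arguments are percentages of CPU time. None is returned when
    usr_pct is zero, because the ratio is undefined in that case.
    """
    if usr_pct == 0:
        return None
    return sys_pct / usr_pct


# The worked example from the text: 22.12% system time, 5.2% user time.
ratio = round(cpu_utilization_ratio(22.12, 5.2), 2)
print(ratio)  # 4.25
```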
194. cal Partitions (LPARs): • VMware Workload Report • VMware Infrastructure Density Report • LPAR Workload Report. VMware Workload Report: VMware ESX enables you to consolidate several servers or applications in a virtual environment. Using VMware ESX, you can run multiple servers or applications on a single system, but without using as much hardware. Each server or application runs in its own VMware instance. Virtual Infrastructure 3 (VI3), or VirtualCenter, is a software suite that manages multiple physical VMware ESX v3 servers. The latest version that supports ESX 4 is called vSphere 4, or vCenter. VI3 or vSphere 4 enables you to manage and monitor virtual servers, as well as allocate resources among virtual machines. A VMware server often slows down because an instance on the server is consuming large amounts of system resources such as CPU, disk I/O, and memory. The problem could lie with an instance that is currently slow, or with another instance on the same server. The VMware Workload report charts the workload of both the server on which VI3 or vSphere 4 is running and the ESX servers that it is managing. It does this by graphing the key performance counters that up.time collects from VI3 or vSphere 4. You can also use the VMware Workload report to determine whether or not you are using a particular VMware server to its optimal capacity. The VMware Workload report can be a useful tool for determining whether or not a VMware server
195. can generate reports on the status of the servers in your environment based on criteria that you specify. A report uses data that up.time has collected from a system over a period of time that you specify. You can configure reports to run between certain hours of the day. Reports are useful when you need to pinpoint the source of a problem within your environment. With a report, you can visually analyze how individual critical resources, such as memory, CPU, and disk resources, are being consumed. You can dynamically generate and view reports, and schedule and email reports to other up.time users. This chapter looks at the options that you can set to generate, save, and schedule reports. For more information about the individual reports and how to configure them, see Using Reports on page 413. Generating Reports: You can generate reports either dynamically or in the background. Dynamic reports are reports that up.time displays in a new Web browser window. Dynamic reports appear within several seconds or several minutes, depending on the type of report that you are generating and on the information that the report collects. Background reports are reports that you schedule to be run at specific intervals using the up.time report queue. When it is time for a scheduled report to run, up.time puts the report into the report queue and determines the status of the report based on the f
196. can use the graphs to collect and display information for entities, services, and configurations. You have different graphing options depending on the operating system that is running on a host. The metrics that up.time agents capture and return to the Monitoring Station differ from operating system to operating system. If a graph is not available in the Tree panel for a given host, the host does not provide the metric that the graph requires. Also, if you add a node or a virtual node (such as a router or IP address), you can only see it in the Config and Services tabs; other metrics, such as CPU and disk usage, are not available from the node. UNIX vs. Windows Performance Monitoring: In most cases, you can interpret performance data from different platforms (such as Windows, UNIX, and Linux) in similar ways. When the interpretation of the data is different, the up.time interface displays operating system-specific information, such as the performance counters being used, as necessary. Viewing the Status of a System: You can view the status of a system in your environment using a Quick Snapshot. The Quick Snapshot summarizes key hardware and process information for a system for the last 24 hours. If there is not 24 hours' worth of data available, then up.time uses data from as far back as possible to generate charts. The Quick Snapshot is ty
197. ... 414; Reports for Capacity Planning 428; Reports for Service Level Agreements 428; Reports for Availability 456; Reports for J2EE Applications 463; Reports for Virtual Environments 470. Reports for Performance and Analysis: The following reports enable you to visualize the overall performance of a system in the up.time environment, as well as analyze the information to determine the cause of problems with those systems: • Resource Usage Report • Multi-System CPU Report • File System Capacity Growth Report • CPU Utilization Ratio Report • Wait I/O Report • Service Monitor Metrics Report. Resource Usage Report: The Resource Usage report tracks the usage of system resources and performance information for systems over a given period of time. In addition to the usage information being reported on, the report displays the following information: • the name and description of the system • an overview of the system configuration, including architecture, memory size, operating system version, number of CPUs, and host ID. Creating a Resource Usage Report: To create a Resource Usage report, do the following: 1. In the Reports Tree panel, click Resource Usage
198. ce VMware vCenter Orchestrator Workflow Actions: If an administrator has integrated up.time with VMware vCenter Orchestrator (see VMware vCenter Orchestrator Integration on page 539), you can configure Action Profiles to initiate Orchestrator workflows. Orchestrator is a VMware vCenter Server add-on that allows its administrators to create workflows that automate vCenter management tasks. These Orchestrator workflows are open-ended: all vCenter actions are available for automation through the processing of parameters and runtime arguments. up.time Action Profiles can be configured to provide input parameters to specific workflows, thus integrating vCenter management with up.time's monitoring and alerting capabilities. For example, if up.time is monitoring memory, CPU, and hard disk use for a virtualized server, the passing of performance thresholds can trigger an Action Profile that, in turn, triggers an Orchestrator workflow that creates a new virtual machine to alleviate resource strain. In a converse example, if up.time is monitoring a virtualized server for long periods of inactivity, a triggered Action Profile can initiate an Orchestrator workflow that shuts down the instance to free up resources. By tightly integrating up.time's monitoring and alerting with VMware vCenter Orchestrator's automated virtual environment administration, you can accelerate your organi
199. cenarios in which you might not want the node to be pingable (e.g., you have a firewall in place). Before selecting this check box, you should try to contact the node using the ping utility. If you cannot ping the node, ensure the check box is left cleared. Then, change the default host check for the node. See Changing Host Checks on page 156 for more information. • Exports NetFlow Data to Scrutinizer: If Scrutinizer has been integrated with up.time and is also receiving NetFlow data from the node, select this check box. You will then be able to call a Scrutinizer instance directly from the node's Graphing tab in up.time. 11. If you selected Novell NRM in step 4, enter information in the following fields: • Username: The user name that is required to access the Novell NRM Web interface. • Password: The password that is required to access the Novell Web interface. 12. If you selected VMware ESX in step 4, enter information in the following fields: • User Name: The user name required to log into the VMware ESX server. • Password: The password required to log into the VMware ESX server. 13. If you selected WMI Agentless in step 4, enter information in the following fields: • Windows Domain: The Windows domain in which WMI has been implemented. • User Name: The name of the account with access to WMI on the Windows domain. • Password: The password for the acco
200. ckbox. See Scheduling Reports on page 407 for more information on configuring a scheduled report. Using the Server Virtualization Report: The results of a Server Virtualization report can help you to determine which physical servers to combine on a single virtual server. In order to effectively use the report, you must analyze the results in more depth. First, look at the average number of power units used by the systems that you want to consolidate on a virtual server. That figure should be less than the total number of power units available on the target system. Next, look at the disk I/O for the individual systems. If a system is running an application that has high levels of disk usage (for example, a database), that system might not benefit from virtualization. If, however, the target system has a very fast disk, you can still consider moving the candidate system to it. Also, consider the geographical locations of the systems for which you are generating the report. For example, the report states that four systems of a similar type are good candidates for virtualization. However, two of those systems are in different parts of the country or the world. In this case, adding them to a virtual server is not a viable option. Solaris Mutex Exception Report: Solaris systems with two or more CPUs can suffer from mutex (mutual exclusion) locks when two or more th
201. cking the checkbox beside each option, then specifying a warning and critical threshold. If the thresholds that you set are exceeded, then up.time generates an alert. For more information, see Configuring Warning and Critical Thresholds on page 144. • Lock Wait/Sec: The amount of time, in seconds, to wait for a database lock. For more information about locks, see the Knowledge Base article "SQL Server Locks". • Lock Requests/Sec: The number of new database locks and lock conversions that are requested from the lock manager every second. For more information about locks, see the Knowledge Base article "SQL Server Locks". • Average Lock Wait Time: The average time, in milliseconds, that you must wait for database locks to clear before up.time sends an alert. • User Connections: The number of user connections that are allowed before up.time sends an alert. For example, a single host is running two databases. There are five users logged on to the first database and three users logged on to the second database. The total number of user connections is eight. • Transactions/Sec: In the Warning and Critical threshold fields, enter the number of transactions started for the databases across the host per second. • Data File(s) Size (KB): The cumulative size of all the files in a
202. cknowledge and agree that, as between you and Uptime, Uptime owns and shall continue to own all right, title, and interest in and to the Software and Documentation, including associated intellectual property rights under copyright, trade secret, patent, or trade mark laws. This Agreement does not grant you any ownership interest in or to the Software or the Documentation, but only a limited right of use that is revocable in accordance with the terms of this Agreement. Any and all trade marks or service marks that Uptime uses in connection with the Software or with services rendered by Uptime are marks owned by Uptime. This Agreement does not grant you any right, license, or interest in such marks, and you shall not assert any right, license, or interest in such marks or any words or designs that are confusingly similar to such marks. 2.4 Confidentiality. You shall permit only authorized users who possess rightfully obtained license keys to use the Software or to view the Documentation. Except as expressly authorized by this Agreement, you shall not make the Software, Documentation, or any license key available to any third party. You will use your best efforts to co-operate with and assist Uptime in identifying and preventing any unauthorized use, copying, or disclosure of the Software, Documentation, or any part thereof. 3. License Fees. The Software will be available to you for use upon your receipt of one or more license keys. Upon acceptance of this Agre
203. close, display, sublicense, lease, rent, or lend your rights in the Software, Documentation, or license keys as granted by this Agreement, for any purpose or in any manner. 1.5 Licenses Required for Third-party Software. The Software enables you to monitor multiple instances of third-party operating systems and application programs. You are responsible for obtaining and complying with any licenses necessary to operate any such third-party software, including Operating Systems and/or application programs. 2. Intellectual Property and Confidentiality. 2.1 Use Reporting, License Violations, and Remedies. Uptime reserves the right to gather data on key usage, including license key numbers, server IP addresses, domain counts, and other information deemed relevant, to ensure that its products are being used in accordance with the terms of this Agreement. Uptime expressly prohibits simultaneous multiple installations of its licensed products and domain count overrides without its prior written approval. Any unauthorized use shall be considered by Uptime to be a violation of this Agreement. Uptime reserves the right to remedy violations immediately upon discovery by charging the then-current list price of unauthorized keys to the end user, or by any other means necessary. You agree not to block, electronically or otherwise, the transmission of data required for compliance with this Agreemen
204. complete the following fields: • Windows Host: The name of the host on which the service is running. • Agent Port: The port on which the up.time agent that is installed on the system is listening. The default is 9998. • Use SSL: Select this option if up.time will securely communicate with the host using SSL (Secure Sockets Layer). • Agent Password: Enter the password that is required to access the agent that is running on the system that is being monitored. For information on setting the agent password, see the uptime software Knowledge Base article entitled "What is the password for the Windows agent?" • Windows Service: The name of the specific Windows service to which the Action Profile will apply. • Action: Select one of the following actions: None, Start, Stop, Restart. 7. If you are setting up an Action Profile for a Windows server that is using a WMI implementation, you can also select the Windows Service as WMI and complete the following fields: • WMI Host: The name of the host on which the service is running. • Windows Domain: The Windows domain in which WMI has been implemented. • Username: The name of the account with access to WMI on the Windows domain. • Password: The password for the account with access to WMI on the Windows domain. • Windows Service: The na
205. confirms the general availability of an end-user Web transaction by executing a previously recorded script, then reporting whether all pages that make up the Web transaction were successfully processed • it reports on the speed of the Web transaction, both as a whole and broken down by previously defined stages. Both the availability and speed of Web transactions can be used in reports and as triggers for alerts. Using the Web Application Transaction Monitor: Use the Web Application Transaction monitor to record a series of URLs that, together, make up a transaction. This recording should be of a transaction that acts as a suitable test of your Web application delivery infrastructure. During the recording process, declare checkpoints that demarcate significant stages in the Web transaction. Isolating the different stages in an end-user transaction allows you to view stage-specific speed tests in reports, which ultimately helps you identify where problem areas exist. For example, if a transaction relies on processing on the application layer, makes multiple calls to the data layer, and is accessible worldwide, creating checkpoints during the recording phase helps you ascertain whether the application server, database management server, or network may be the reason behind a poorly performing transaction. The f
206. ct this option if the user is working with Internet Explorer. ActiveX graphs are only available to users accessing up.time with Internet Explorer. Click the Show Tips option to disable graphical tool tips on pages like View Notification Groups. Select a role for the user from the User Role dropdown list. For more information on user roles, see the section Working with User Roles on page 334. 19. In the Available User Groups field, select the user group to which this user will belong, and then click Add. For more information on user groups, see the section Working with User Groups on page 341. 20. Click Save. Viewing Users: To view users, do the following: 1. In the Tree panel, click View Users. A list of users appears in the Users subpanel. Editing User Information: To edit user information, do the following: 1. Do one of the following: • Click the Edit icon beside the name of the user. • Click the name of the user whose information you want to edit, and then click Edit User on the User Information page. The Edit User window appears. 2. Edit the information as described in the section Adding Users on page 337. Working with User Groups: User groups are sets of up.time users who have been assigned sim
207. ction: • Follow: Follow any redirection. • Warning: Return a Warning status for any redirection. • Critical: Return a Critical status for any redirection. POST String: The URL-encoded POST string to be sent to the server. This string simulates what a Web browser sends to a Web server CGI script or binary. You can use the POST string to, for example, simulate logging into a Web application. For example, if you define the POST string as userid=bob&sku=123456, the page to request would be cgi bin sku lookup. The text "SKU count is" is the expected response. If the SKU lookup is not successful, or if the response from the application server is not fast enough, then up.time generates an alert. Set Cookie String: Enter a cookie string, which can take the following form: Set-Cookie: name=value; expires=date; path=pathname; domain=domainname; secure. Where: • name is a name by which you can later reference the cookie • value is a regular string to be stored as a cookie. The string should be encoded using URL-style %xx encoding, which converts all reserved and unsafe characters (such as tildes and spaces) to their ASCII equivalents. For example, using %xx encoding, the URL http://www.mydomain.com/~jdoe/index.html becomes http://www.mydomain.com/%7ejdoe/index.html. The name=value pair is the only required attribute of the
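The URL-encoded POST string and the %xx encoding described above can be illustrated with Python's standard library. This is a hedged sketch: the helper names are hypothetical, the field names are taken from the example in the text (read here as userid=bob&sku=123456), and nothing below is part of up.time itself.

```python
from urllib.parse import parse_qs, urlencode


def build_post_string(fields):
    """Build a URL-encoded POST string from a mapping of form fields."""
    return urlencode(fields)


def percent_encode_char(ch):
    """Return the %xx form of a single character (lowercase hex digits),
    one %xx group per UTF-8 byte."""
    return "".join("%{:02x}".format(byte) for byte in ch.encode("utf-8"))


post_string = build_post_string({"userid": "bob", "sku": "123456"})
print(post_string)               # userid=bob&sku=123456
print(percent_encode_char("~"))  # %7e
print(percent_encode_char(" "))  # %20

# Decoding the string recovers the original fields.
decoded = parse_qs(post_string)
```

Note that `urlencode` encodes a space as `+`, which is the usual convention for form data, while a space inside a URL path is written as `%20`.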
208. ctly Matches from the dropdown list and type UNKNOWN in the field. For more information on comparison methods, see Comparison Methods on page 143. • Response Time: Optionally, enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Click the Save for Graphing option to save the output in the DataStore. You can later use the retained data to generate a report or a graph. 4. Complete the following settings: • Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information) • Alert Settings (see Monitor Alert Settings on page 148 for more information) • Monitoring Period settings (see Monitor Timing Settings on page 146 for more information) • Alert Profile settings (see Alert Profiles on page 381 for more information) • Action Profile settings (see Action Profiles on page 389 for more information). 5. Click Finish. Custom with Retained Data: Custom monitors with Retained Data return the following information: • up to 10 values that you can save and evaluate • a return status of 0 to 3 (see Overview on page 322 for more information). As well, you can specify that the monitor writes any returned data to the up.time DataStore. You can use the retained data to later generate a Servic
209. d Defining and Managing Your Infrastructure on page 65. Configuring the Monitoring Station to Use Oracle: If this Monitoring Station installation is for a standalone up.time instance that is not part of a multi-datacenter deployment, skip this section and use the default bundled MySQL implementation; otherwise, you must configure the Monitoring Station to write to an Oracle database instance instead of MySQL. To switch the database used by the Monitoring Station, edit the uptime.conf file. To edit the uptime.conf file to use an Oracle database instance instead of MySQL, do the following: 1. Remove or comment out the default MySQL settings, as shown below: dbType=mysql; dbDriver=com.mysql.jdbc.Driver; dbHostname=localhost; dbPort=3308; dbName=uptime; dbUsername=uptime; dbPassword=uptime. 2. Show (i.e., uncomment) the Oracle database settings. 3. For the dbHostname and dbPort settings, enter the address and port for your Oracle database server. 4. For the dbName setting, provide a name for the Enterprise Monitoring Station's Oracle database instance. 5. In the dbUsername and dbPassword fields, enter the authentication details to access and write to the database. 6. Save your changes. 7. Use the resetdb utility with the really option to delete, then recreate, the database str
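Step 1 above (commenting out the default MySQL settings) can be sketched as a small text transformation. This is an illustration under stated assumptions, not a supported tool: it assumes the settings in uptime.conf are key=value lines and that a leading # marks a comment, and the function name and sample text are hypothetical. Apply it only to the MySQL block, since the Oracle settings reuse the same key names.

```python
# Setting names as they appear in the MySQL block described above.
MYSQL_KEYS = (
    "dbType", "dbDriver", "dbHostname", "dbPort",
    "dbName", "dbUsername", "dbPassword",
)


def comment_out(conf_text, keys=MYSQL_KEYS):
    """Prefix every active 'key=value' line whose key is listed with '#'.

    Lines that are already commented out, or whose key is not listed,
    are returned unchanged.
    """
    result = []
    for line in conf_text.splitlines():
        key = line.split("=", 1)[0].strip()
        if key in keys:
            result.append("#" + line)
        else:
            result.append(line)
    return "\n".join(result)


# Sample text modelled on the default MySQL settings listed in the steps.
mysql_block = "\n".join([
    "dbType=mysql",
    "dbDriver=com.mysql.jdbc.Driver",
    "dbHostname=localhost",
    "dbPort=3308",
    "dbName=uptime",
    "dbUsername=uptime",
    "dbPassword=uptime",
])
commented = comment_out(mysql_block)
```

Back up the configuration file before changing it by hand or by script.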
210. d as only having one CPU. up.time can also collect and chart information for systems running Net-SNMP that have two or more CPUs. However, if the system was recently added to up.time, or if the HOST-RESOURCES-MIB (which is used to collect data from the system) has not been properly installed and configured, up.time cannot collect CPU performance data. You must either wait until up.time is able to collect performance data, or check whether the HOST-RESOURCES-MIB is properly installed and configured on the system that is being monitored. Generating a Multi-CPU Usage Graph: To generate a Multi-CPU Usage graph, do the following: 1. In the Global Scan or My Infrastructure panel, click the name of the system whose information you want to graph. 2. In the Tree panel, click the Graphing tab. 3. Click Multi-CPU Usage. 4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22. 5. Click one of the following options: • User: The percentage of CPU user processes that are in use. For Windows systems, this option is User Time. • System: The percentage of CPU kernel processes that are in use. For Windows systems, this option is System Time. • Privileged Time: On Windows systems, the percentage of time that the CPU spends executing kern
211. d in the HTTP requests to the application server, and should be provided in case the application blocks access by scripts. • Checkpoint Times: Enter the Warning and Critical Checkpoint Time thresholds. An alert is generated with these thresholds if any of the recorded Web transaction's checkpoint times exceeds the supplied values. • Response Time: Enter the Warning and Critical Response Time thresholds. An alert is generated with this threshold if the entire transaction playback time exceeds the supplied values. For more information, see Configuring Warning and Critical Thresholds on page 144. 9. Enter Warning and Critical level thresholds for the overall response time of the monitor. Most of the monitor's Response Time consists of the Delivery Time and the Retrieve Time. Ensure the values provided for the Response Time thresholds roughly correspond with those provided for the other thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 10. Complete the following settings: • Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information) • Alert Settings (see Monitor Alert Settings on page 148 for more information) • Monitoring Period settings (see Monitor Timing Settings on page 146 for more information) • Alert Profile
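The way the Checkpoint Time and Response Time thresholds combine can be sketched as follows. This is a minimal illustration of the rule stated above (any checkpoint, or the whole playback, exceeding a threshold raises an alert); the function names, the use of seconds as the unit, and the choice to report the most severe state are assumptions, not documented up.time behaviour.

```python
# States ordered from least to most severe.
SEVERITY = {"OK": 0, "WARN": 1, "CRIT": 2}


def classify(value, warning, critical):
    """Return CRIT or WARN when the value exceeds the matching threshold."""
    if value > critical:
        return "CRIT"
    if value > warning:
        return "WARN"
    return "OK"


def transaction_status(checkpoint_times, total_time,
                       checkpoint_thresholds, response_thresholds):
    """Combine per-checkpoint and whole-transaction results.

    Each *_thresholds argument is a (warning, critical) pair. The most
    severe state found across all checkpoints and the total playback
    time is returned.
    """
    states = [classify(t, *checkpoint_thresholds) for t in checkpoint_times]
    states.append(classify(total_time, *response_thresholds))
    return max(states, key=SEVERITY.__getitem__)


# Hypothetical timings, in seconds: the second checkpoint is slow.
status = transaction_status(
    checkpoint_times=[1.2, 3.5, 0.8],
    total_time=5.5,
    checkpoint_thresholds=(3.0, 6.0),
    response_thresholds=(10.0, 20.0),
)
```

Here a single slow checkpoint is enough to move the whole transaction to WARN, even though the total playback time is within its thresholds.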
212. describes each up.time graph in the following sections: Overview 488; Viewing the Status of a System 489; Monitoring CPU Performance 491; Multi-CPU Usage 495; Graphing Memory Usage 498; Graphing Processes 501; Graphing TCP Retransmits 503; Graphing User Activity 504; Workload Graphs 505; Network Graphs 511; Disk Performance Statistics Graph 514; Top 10 Disks Graph 516; File System Capacity Graph 518; VxVM Stats Graph 519; Novell NRM Graphs 521; Instance Motion Graphs 523; Displaying Detailed Process Information 524. Overview: up.time can display the performance and availability statistics for the systems that you are monitoring in a graph. You
213. up.time 5: up.time User Guide, version 5.5. up.time software: cause downtime is not an option. Copyright © 2011 uptime software inc. uptime software inc. considers information included in this documentation to be proprietary. Your use of this information is subject to the terms and conditions of the applicable license agreement. Restricted Rights Legend: This product or document is protected by copyright and distributed under licenses (see End User License Agreement on page 575) restricting its use, copying, distribution, and decompilation. No part of this product or document may be reproduced in any form by any means without prior written authorization of uptime software and its licensors, if any. Third-party software is copyrighted and licensed from uptime software suppliers. Documentation is provided "as is" and all express or implied conditions, representations, and warranties, including any implied warranty of merchantability, are disclaimed, except to the extent that such disclaimers are held to be legally invalid. Trademarks: up.time is a registered trademark of uptime software inc. IBM is a registered trademark of International Business Machines Corporation. iText is used under the Lesser General Public License (LGPL). Oracle and Solaris are registered trademarks, and the Oracle product names are registered trademarks or trademarks, of Oracle Corporation. Microsoft Windows, Microsoft SQL Server, and other such t
214. e Action Profiles on page 389 for more information). 5. Click Finish. SSH (Secure Shell): The SSH (Secure Shell) monitor determines if the secure shell utility (SSH) is available and is running on the defined port. SSH is both a program and a network protocol for securely logging into and executing commands on a remote computer. It provides secure, encrypted communications between two untrusted hosts over an insecure network. Configuring SSH (Secure Shell) Monitors: To configure SSH (Secure Shell) monitors, do the following: 1. In the SSH (Secure Shell) monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the Secure Shell monitor settings by entering the appropriate Warning and Critical thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. • Port: The number of the port on which SSH is listening. The default is 22. • Major: The major version number of SSH. This is the number immediately to the left of the decimal in the version number. In the following example, the major version number is 2: SSH-2.0-Sun_SSH_1.0. • Minor: The minor version number of SSH. This is the number immediately to the right of the decimal in the version number. In the following example, the minor version number is 0
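The Major and Minor values described above correspond to the protocol version in the identification string that an SSH server sends when a client connects, which has the general form SSH-protoversion-softwareversion. A minimal sketch of extracting them follows; the function name and the sample banners are illustrative and are not taken from up.time.

```python
import re

# Matches the start of an SSH identification string such as
# "SSH-2.0-OpenSSH_8.9": protocol major, protocol minor, software name.
_BANNER = re.compile(r"^SSH-(\d+)\.(\d+)-(\S+)")


def parse_ssh_banner(banner):
    """Return (major, minor, software) parsed from an SSH banner line."""
    match = _BANNER.match(banner.strip())
    if match is None:
        raise ValueError("not an SSH identification string: %r" % banner)
    return int(match.group(1)), int(match.group(2)), match.group(3)


major, minor, software = parse_ssh_banner("SSH-2.0-OpenSSH_8.9\r\n")
print(major, minor, software)  # 2 0 OpenSSH_8.9
```

Servers that accept both protocol generations traditionally announce version 1.99, which this parser reports as major 1 and minor 99.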
215. e. For example, if the number of power units used is 104 and the total number of available power units is 2,346, then the enterprise CPU utilization is 4.43%. Creating an Enterprise CPU Utilization Report: To create an Enterprise CPU Utilization report, do the following: 1. In the Reports Tree panel, click Enterprise CPU Utilization. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. 3. If you want the report to only include data from certain hours during the day, select those hours from the dropdown lists in the Daily Hours section. [Figure: Daily Hours section, "Include data samples between these hours only", with Start and End dropdown lists] For example, if you want the report to cover the hours from 1:00 a.m. to 1:00 p.m., select 1:00 from the Start dropdown list and 13:00 from the End dropdown list. 4. Select one of the following options from the Sort by dropdown list to sort the results that up.time returns: • Hostname (the default) • # of CPUs • CPU Speed • Power Units Total • Power Units Used Total • Power Units Used Partial • CPU Utilization Total • CPU Utilization Partial. 5. Select Ascending or Descending from the Sort Direction dropdown list. 6. Select one or more of the following CPU statistics at which the report will look
216. e Metrics report (see Service Monitor Metrics Report on page 425) or a Service Metrics graph (see Viewing System and Service Information on page 50). Configuring Custom Monitors with Retained Data: To configure Custom monitors with Retained Data, do the following: 1. In the Custom with Retained Data monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: • Script Name: The name of, and path to, the script or program on the Monitoring Station that will collect metrics on the system. The script or program that you specify must be executable by the uptime user account on the up.time Monitoring Station. Ensure that the permissions are set correctly. • Arguments (Optional): Specify any arguments required by the script or program. • Variable 1 to Variable 10 (Optional): Specify up to 10 variables that your custom script will return to the up.time Monitoring Station. If you click the Save for Graphing checkbox, these variables will be saved to the DataStore. • Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Complete the following settings: • Timing Settings (see Adding Monitor Timing Settings Informatio
217. e describe the purpose of the monitor, for example, Ping Web Server. Optionally, enter a description of the monitor in the Description field. Assign the monitor to a system by doing one of the following: • Click the Single System option, and then select the name of the system that you want to monitor from the dropdown list. • Click Service Group to attach the monitor to multiple systems. Then select the service group from the dropdown list. For more information about service groups, see Service Groups on page 153. • Click the Unassigned option. This step is mandatory. Complete the following fields: • Port: The number of the port on which up.time is listening. • Use SSL: Select this option if the up.time agent is configured to use SSL (Secure Sockets Layer) for security. If you have configured your agent to use SSL but do not select Use SSL, up.time will not receive performance information. Monitor Settings Configuration: Each up.time service monitor has settings particular to the service that it is monitoring. The following image illustrates a setting from a MySQL Basic Checks monitor. (Image: MySQL Basic Checks Settings, showing the Port, Port Check, Username, Password, Database, Script, and Script File fields.) Comparison Methods: You can configure settings that compare the Warnin
218. e information from all four databases. It can also aggregate the information to present a single performance value for each metric. Using Multiple SQL Server Advanced Metrics Monitors: You can create several SQL Server Advanced Metrics monitors for a system if you must capture different SQL Server performance metrics separately. For example, the SQL Server Advanced Metrics monitor provides metrics for SQL Server locks, including lock requests, waits, and averages. For information about locks, see the Knowledge Base article SQL Server Locks. Lock requests do not always provide meaningful information. When you compare the length of waits with the number of lock requests, the length of the lock waits should be much lower than the number of requests. If the length of waits and the number of requests are about the same, then there is a performance problem. When the average lock wait time is high, there is a problem with SQL Server. Configuring SQL Server Advanced Metrics Monitors: To configure SQL Server Advanced Metrics monitors, do the following: 1. In the SQL Server Advanced Metrics monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. In the Instance field, type the name of the SQL Server instance to which you want to connect. 3. Complete the following options by cli
219. e 141. SQL Server Tablespace Check: 2. Complete the following fields: • SQL Server Port: The number of the port on which SQL Server is listening. SQL Server can use static or dynamic ports. For information about SQL ports and how to determine and configure port allocation, see the Knowledge Base article Configuring SQL Server Ports. • Username: The user name that is required to log in to the SQL Server database. When a user connects through a Windows user account, SQL Server revalidates the account name and password by contacting a Windows domain controller to determine the network user name. SQL Server then verifies the user's credentials and permits or denies login access. • Password: The password that is required to log in to the SQL Server database. When a user connects with a specified login name and password from a non-trusted connection, SQL Server determines whether a SQL Server login account has been set up and whether the specified password matches the one previously recorded. If SQL Server does not find a login account, authentication fails and the user receives an error message. SQL Server authentication is provided for backward compatibility, because applications written for SQL Server version 7.0 or earlier may require the use of SQL Server logins and passwords. If you do not complete the Username and Password fields, up.time will attempt to connect to the da
220. e 22. 3. If you want to generate reports for groups of systems, select the groups from the List of Groups area. 4. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. 5. If you are generating reports for specific systems in your environment, select them from the List of Systems and Nodes. 6. Select a report generation option. See Report Generation Options on page 402 for details. 7. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. Service Monitor Outages Report: The Service Monitor Outages report lists all warning or critical events for services that have occurred over a specified time period. Use this report to determine the cause of a problem by analyzing the declining availability of a server or set of servers. The Service Monitor Outages report contains the following information: • the date and time at which metrics were gathered for each service • the duration of the outage • whether or not a notification was sent or an action was taken • the status of each service • a short message about the status, for example: UPTIME filter up.time agent running on f
221. e 546; Restoring Archived Data 547; Exporting and Importing the DataStore 548; up.time Diagnosis 551; System Event Logging 551; Problem Reporting 552; up.time Measurement Tuning 554; Service Monitor Thread Counts 554; Platform Performance Gatherer Check Intervals 557; Report Storage Options 558; Changing the Number of Days Reports Are Cached 558; Changing the Published Report Location 559; Resource Usage Report Generation 560; Monitoring Station Interface Changes 561; Status Alert Acknowledgement 561; Custom Dashboard Tabs 562; License Information 563; Reference Frequency Definitions
222. e 81. managing and updating all of the systems on which an up.time Agent has been installed. WMI-based monitoring can only be performed if the Monitoring Station itself is running on Windows. An Element can be set to use WMI through the following methods: • its system type is set to WMI Agentless when it is first added to up.time • its system type was set to Agent when originally added to up.time, but is being individually modified to use WMI • it is part of a bulk agent-to-WMI conversion with other agent-based Elements. Globally defined WMI credentials can be used for the second and third methods. In the latter's case, configuring these is mandatory. Refer to Configuring Global WMI Credentials on page 536 for more information. Regardless of which method is used when changing a Windows Element's data collection method, all historical data is retained. WMI Requirements: In order to monitor agentless systems through WMI in a secure environment (e.g., through a firewall), you need to create an exception for WMI on the host end. For example, to allow WMI access through Windows Firewall, refer to the following MSDN articles: • for Windows XP or Windows Server 2003: http://msdn.microsoft.com/en-us/library/aa389286%28v=VS.85%29.aspx • for Windows Vista or Windows Server 2008: http://msdn.micr
223. e Information: To view system information, do the following: 1. In the Global Scan or My Infrastructure panels, click the name of a system. 2. Click the Services tab in the Tree panel. 3. Click one of the following options in the Tree panel: • Status: Lists the status of each service assigned to the system, for example: up.time agent running on subway up.time agent 4.0 solaris. An arrow at the end of a status message indicates that there is more text. Hold your mouse over the arrow to view the full message. When up.time issues an alert, you can acknowledge the alert in the Status subpanel. For more information, see Acknowledging Alerts on page 112. • Trends: Displays one or more graphs that chart the status of the services associated with a host. (Image: File System Capacity trend for WebSphere, charting the Critical, Warning, Okay, Unknown, and Maintenance statuses from 2008-03-04 to 2008-04-14.) (Image: Outages table with columns including Outage Time, Status From, SubStatus From, and Status To.)
224. e Monitors: There are two types of monitors for MySQL, Oracle, and SQL Server databases: • Basic Checks: These monitors determine whether or not the database is running and listening on the expected port. You can also run queries against the databases using scripts. • Advanced Metrics: These monitors collect detailed information about database processes, which you can later use for reporting and graphing. Understanding Agentless Monitors Using Net-SNMP: Net-SNMP is a suite of command-line and graphical applications that interact with SNMP agents that are installed on hosts. Net-SNMP presents a set of SNMP MIBs (Management Information Bases). A MIB is a listing that defines the variables needed by the SNMP protocol to monitor and control network equipment. The MIBs are used to collect system performance information for use by the up.time Monitoring Station. The Net-SNMP monitor uses the HOST-RESOURCES MIB to collect the following data: Configuration: • System name • Number of CPUs • The size of the system memory • The network interfaces on the system, as well as their MTU, speed, and physical address. The HOST-RESOURCES MIB can collect other configuration data, but the Monitoring Station does not use this information. Performance Data: CPU: • CPU user time • CPU system time • CPU wait I/O time. Memo
225. e Range area of the Reports and Graphing subpanels. To set dates and times for a graph or report, do one of the following: • Click the Specific Date and Time option. Then, in the Date Range area, select the start and end date and time of the report by entering the start and end times (HH:MM:SS) in the From and To text boxes, and entering the start and end dates (YYYY-MM-DD) in the From and To text boxes. You can also click the calendar icon to select dates. • Click the Last option, then select a number from 1 to 10 from the first dropdown list, and select Days, Weeks, or Months from the second dropdown list. The end date for any of these options is the current date and time. For example, if you select 1 and Days, then the graph or report will cover the 24-hour period from the previous day until the date and time on which you created the report. • Click the Quick Date option, and then select one of the following options from the dropdown list: Today; Yesterday; This Week; Last Week (Sun - Sat); This Month (from the first day of the current month to the day on which the report or graph is being generated); Last Month. The Last Month option collects information from the beginning to the end of the previous month. The This Mon
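The Last option can be illustrated with a small date calculation. This is a sketch under stated assumptions, not up.time's own logic: the function name last_range is hypothetical, and the handling of Months (same day of the month, clamped to the length of the earlier month) is an assumption that the guide does not confirm.

```python
import calendar
from datetime import datetime, timedelta


def last_range(count: int, unit: str, now: datetime) -> tuple[datetime, datetime]:
    """Return (start, end) for a "Last <count> <unit>" selection ending at `now`."""
    if not 1 <= count <= 10:
        raise ValueError("count must be between 1 and 10")
    if unit == "Days":
        start = now - timedelta(days=count)
    elif unit == "Weeks":
        start = now - timedelta(weeks=count)
    elif unit == "Months":
        # Assumed month arithmetic: step back whole calendar months and
        # clamp the day to the last valid day of the target month.
        month_index = now.year * 12 + (now.month - 1) - count
        year, month0 = divmod(month_index, 12)
        day = min(now.day, calendar.monthrange(year, month0 + 1)[1])
        start = now.replace(year=year, month=month0 + 1, day=day)
    else:
        raise ValueError("unit must be Days, Weeks, or Months")
    return start, now


if __name__ == "__main__":
    start, end = last_range(1, "Days", datetime(2008, 3, 31, 14, 4, 55))
    print(start, "to", end)
```

Selecting 1 and Days yields a range that starts exactly 24 hours before the end time, which matches the example in the text.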
226. e Report Options section, indicate whether you want to Include Charts for Individual ESX Servers by selecting or clearing the check box. When this option is enabled, a separate chart with VM counts will be created for each ESX server that is included in the report. 4. In the Report Options section, select the level of granularity at which the virtual infrastructure density information will be presented (i.e., daily, weekly, or monthly). 5. If you want to generate reports for groups of systems, select the groups from the List of Groups area. 6. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. 7. If you are generating reports for specific systems in your environment, select them from the List of Systems and Nodes. 8. Select a report generation option. See Report Generation Options on page 402 for details. 9. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. LPAR Workload Report: The LPAR Workload report charts the workload of the individual logical partitions (LPARs) on an IBM pSeries server. It does this by graphing the following workload data: • CPU • Memory
227. e Scan thresholds 556; Config panel: Problem Reporting 552; configuring response time 145; configuring user roles 334; configuring users 333; CPU Run Queue Threshold report 445; CPU Usage graph 491 (Linux/UNIX/Novell 492, Windows 491); CPU Utilization Ratio report 422; CPU Utilization Summary report 419; creating Alert Profiles 382; critical threshold 144; custom alert formats 385. D: database monitors (MySQL Advanced Metrics 244, MySQL Basic Checks 251, Oracle Advanced 253, Oracle Basic Checks 256, Oracle Tablespace Check 259, SQL Server Advanced Metrics 266, SQL Server Basic Checks 262, SQL Server Tablespace Check 270, Sybase 275); DataStore (archiving 549, restoring 549); dates and times 22; deleting (Applications 111, systems 111, views 111); Disk I/O Bandwidth report 441; Disk Performance Statistics graph 514; Distribution Lists 344 (adding 344, editing 345, viewing 345); DNS monitor 280. E: editing (Action Profiles 395, Alert Profiles 384, Distribution Lists 345, host check 156, Notification Groups 348, service groups 154, system profile 99, user groups 342, user roles 336); editing views 110; email delivery time monitor 230; enabling Windows Messaging Service 381; end user monitoring 223, 230; Enterprise CPU Utilization report 428; ESX Workload monitor 217; Exchange monitor 194; exiting up.time 49; External Check monitor 328. F: File System Capacity 167; File System Capacity graph 518; File System Capacity Growth report 431
228. e Virtual Appliance • the host or cluster on which the Virtual Appliance will run • the resource pool within which it will be run • the datastore in which the appliance's data will be kept • the network the appliance will use. Review your selections, then click Finish. Wait for the import process to complete. In the Virtual Infrastructure Client, navigate to and select the up.time appliance, and power it on. Click the Console tab for the appliance. After initialization, ensure that the appliance time is correct. The default time zone is PST. The appliance time zone must match that of your monitored infrastructure in order to correctly collect and report performance data. After the appliance configuration has been completed, you can log in to the Monitoring Station to begin setting up your monitored environment. It can take up to a minute for the up.time services to start. Wait before attempting to log in to the Monitoring Station. Post-Installation Tasks: After installing up.time, you will need to do the following: • set up the administrator account when you first log in (see Setting Up the Administrator Account on page 48) • provide the host name of the SMTP server when you first log in (see SMTP Server on page 534) • install the license for up.time (see License Information on page 563) • add users and systems (see Configuring Users on page 333 an
229. e contents of the DataStore into such tools as MySQL Query Browser, Microsoft Excel, and Crystal Reports. Before you can connect to the DataStore using ODBC, the client system that is accessing the database must have the MySQL ODBC driver installed. The ODBC driver enables the client system to communicate with the DataStore. For detailed information on installing and configuring the MySQL ODBC driver, see the uptime software Knowledge Base article Connecting to the up.time DataStore via ODBC. Understanding Service Monitors: up.time service monitors ensure the performance and availability of services in your environment. Using service monitors, you can ensure that the systems in your environment, including databases, mail servers, networking protocols, and file systems, are operating as required. up.time also captures performance metrics collected from hardware profiles of physical systems in your environment, and can present this data in a graph. up.time can track the performance of services using over 30 monitors. In addition, up.time enables you to configure custom monitors that you can use to extend your service monitoring capability. For detailed information on service monitors, see Using Service Monitors on page 135. Understanding Databas
230. e in the User Name field. 4. Enter your assigned password in the Password field. 5. Click the Login button. Exiting up.time: To exit up.time, click the Logout button in the top right corner of the screen. Viewing System and Service Information: You can view information about the following: • basic configuration of systems in your environment • services and service groups assigned to the system • user groups assigned to the system. Viewing System Information: To view system information, do the following: 1. In the Global Scan or My Infrastructure panels, click the name of a system. The general information for the system appears in the subpanel. 2. Click the Info tab, and then click one of the following options in the Tree panel: • Info & Rescan: Lists the basic information about the system, including the following: the display name of the system in up.time; the host name; the number of processes the monitors will retrieve; whether or not the system is being monitored; the name of the domain on which the system resides (e.g., uptimesoftware.com); the name and version of the operating system that is running on the system; the number of CPUs on the system; the amount of memory, in megabytes, on the system; the size of the paging file, in megabytes, on the system
231. e instance while performing a hardware upgrade. The Instance Motion graph enables you to keep track of a moving VMware instance. For a given ESX instance, the graph charts which systems it has been running on over a given time range. Generating an Instance Motion Graph: To generate an Instance Motion graph, do the following: 1. In the Global Scan or My Infrastructure panel, click the name of the ESX instance whose motion you want to graph. 2. In the Tree panel, click the Graphing tab. 3. Click Instance Motion. 4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22. 5. Click Generate Graph. Displaying Detailed Process Information: Detailed process information provides an insight into how various user and system processes are consuming system resources. The information is not presented in a graph; it is a table that contains the following information: • Process: The name of the process, which is taken from its executed path name. • PID: The number that identifies the process. • PPID: The number that identifies the parent process. The PPID can help identify possible relationships between processes. On Windows systems, the PPID is called the Creating Process ID. • UID: The ID of the user or account that has been consuming CPU time. On Windows systems, the
232. e occupancy is high and the queue is long, then there is a capacity problem. However, a system should always have some idle time. Having consistently low idle time usually means that your system is working near its maximum capacity. Generating a CPU Performance Graph: To generate a CPU performance graph, do the following: 1. In the Global Scan or My Infrastructure panel, click the name of the system whose information you want to graph. 2. In the Tree panel, click the Graphing tab. 3. Click one of the following options: • Usage (% busy) • Run Queue Length • Run Queue Occupancy 4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22. 5. Click Generate Graph. Multi-CPU Usage: The Multi-CPU Usage graph charts the performance statistics for systems with more than one CPU. These statistics indicate whether or not a system is effectively balancing tasks between CPUs, or if processes are being forced off CPUs in certain circumstances. You can also use this graph to determine whether or not there are too many system interrupts that are using a CPU or that are overloading a CPU. If there is only one CPU on the system, the following message is displayed instead of a graph: This system is currently liste
233. e of an article opens it in your Web browser. By default, RSS feeds are drawn directly from the uptime software Support Portal without the use of proxy server information. If your Monitoring Station accesses the Internet through a proxy server, feeds will most likely not be available, and a message will appear in the My Portal panel. (Image: My Portal panel, Latest up.time Articles section, showing the message about articles that could not be retrieved from the feed URL.) You can change the RSS feed settings to point to the proxy server, rather than directly to the uptime software Web site, by manually inputting settings in the up.time Configuration panel, as outlined in Modifying up.time Config Panel Settings on page 529. Changing Proxy Server Information for RSS Feeds: You can manually configure the settings for RSS feeds through the following parameters (default values, if applicable, are shown): • rssFeedUrl=http://support.uptimesoftware.com/rss/kb.xml: The URL of the RSS feed. • httpProxyHost: The host name of the proxy server that the Monitoring Station uses to access the Internet. • httpProxyPort: The port through which the Monitoring Station communicates with the proxy server. • httpProxyUsername: The user name required to use the proxy server. • httpProxyPassword: The password required to use the proxy server.
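To show how the four proxy parameters fit together, here is a minimal sketch of an HTTP client that reaches a feed through a proxy. It is an illustration only and says nothing about how up.time itself consumes these settings: the function names and the example host and credentials are hypothetical, and only the parameter names come from the text.

```python
import urllib.request
from urllib.parse import quote


def proxy_url(host: str, port: int, username: str = "", password: str = "") -> str:
    """Build a proxy URL, percent-encoding any credentials."""
    credentials = ""
    if username:
        credentials = quote(username, safe="")
        if password:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    return f"http://{credentials}{host}:{port}"


def build_feed_opener(settings: dict) -> urllib.request.OpenerDirector:
    """Return an opener that sends HTTP and HTTPS requests through the proxy."""
    url = proxy_url(
        settings["httpProxyHost"],
        int(settings["httpProxyPort"]),
        settings.get("httpProxyUsername", ""),
        settings.get("httpProxyPassword", ""),
    )
    handler = urllib.request.ProxyHandler({"http": url, "https": url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    # Hypothetical values; replace them with your own proxy details.
    print(proxy_url("proxy.example.com", 3128, "alice", "p@ss"))
```

Calling the opener's open method with the rssFeedUrl value would then fetch the feed through the proxy rather than directly.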
234. e panel, click Auto Discovery. The Auto Discovery window appears. 2. To scan for agents, go to the Agent Check section and, in the Network Address field, type the range of IP addresses that you want up.time to scan. For example, typing 10.1.1 will scan all systems on your network that have an IP address starting with 10.1.1. 3. If you would like to scan for systems using WMI to collect metrics, enter the login information for an administrative Windows account in the following fields: • Windows Domain (optional): The Windows domain in which WMI has been implemented. • User Name: The name of the account with access to WMI on the Windows domain. • Password: The password for the account with access to WMI on the Windows domain. Note that this option is only available on Monitoring Stations running on the Windows platform. For the Default SNMP read community field, which contains a string that acts like a user ID or password, giving you access to the Net-SNMP instance, do one of the following: • accept the default value (public) • enter a new value (e.g., private). Click Continue. up.time returns a list of the systems that have an IP address within the range that you specified. Click the Add button beside the system that you want to add. The Add System/Network Device window appears. If nec
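The way a partial address such as 10.1.1 selects a range of hosts can be sketched with the standard ipaddress module. This assumes that each octet typed fixes 8 bits of the address, so 10.1.1 corresponds to the network 10.1.1.0/24; how up.time itself expands the value is not described here, and the function name is hypothetical.

```python
import ipaddress


def network_for_prefix(prefix: str) -> ipaddress.IPv4Network:
    """Map a dotted partial address (1 to 3 octets) to the network it covers."""
    octets = [part for part in prefix.strip().strip(".").split(".") if part]
    if not 1 <= len(octets) <= 3:
        raise ValueError("expected between one and three octets, e.g. 10.1.1")
    # Pad the missing octets with zeros; each supplied octet fixes 8 bits.
    padded = octets + ["0"] * (4 - len(octets))
    return ipaddress.IPv4Network(".".join(padded) + f"/{8 * len(octets)}")


if __name__ == "__main__":
    network = network_for_prefix("10.1.1")
    print(network, "covers", network.num_addresses, "addresses")
    print(ipaddress.ip_address("10.1.1.42") in network)
```

Under this assumption, 10.1.1.42 falls inside the scanned range and 10.1.2.1 does not.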
235. e port value from the SID of the Oracle instance. • Port Check (Optional): Select this option to open a socket connection that determines whether or not the database is listening on the defined port. • Username: The user name that is required to log in to the MySQL database. • Password: The password that is required to log in to the MySQL database. • Database: The name of the MySQL database instance. • Script: Type or copy the script that you want up.time to match against the database. Use this option if your script is short or will not regularly change. This option is required if you do not have access to the file system on the Monitoring Station. • Script File: As an alternative to directly entering a script, enter the full path on the Monitoring Station to the script that this monitor will run against the database. • Match: Enter a string that you want to match against the return value from the script. • Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. 4. Complete the following settings: • Timing Settings (see Adding Monitor Timing Settings Informatio
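The Port Check option is described as opening a socket connection to see whether the database is listening. A minimal sketch of that kind of check follows. It is not up.time's own code: the function name is_listening is hypothetical, and the MySQL port 3306 is used only as an example value.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("listening" if is_listening("127.0.0.1", 3306) else "not listening")
```

A successful connection only shows that something accepts connections on the port; the Username, Password, and Script settings cover whether the database actually answers queries.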
236. e posting permission. Code 400 is sent when the NNTP server discontinues service, for example, by request of the operator. The 5xx codes indicate that the command could not be performed for some unusual reason. Configuring NNTP (Network News) Monitors: To configure NNTP (Network News) monitors, do the following: 1. In the NNTP (Network News) monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: • Port: The number of the port on which the NNTP server is listening. The default is 119. • Server Response: The server response, according to the value that you want to measure. For information on command implementation, see Command Implementation on page 299. For information on response categories, see Response Category on page 300. For information on general responses, see Response Codes on page 300. • Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
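The response codes mentioned above follow the NNTP convention that the first digit of the three-digit code gives the general category. The following sketch classifies a status line on that basis. The category wording paraphrases the NNTP specification rather than this guide, and the function name is hypothetical.

```python
# First-digit categories, paraphrased from the NNTP specification.
CATEGORIES = {
    1: "informative message",
    2: "command completed",
    3: "command accepted so far, send the rest",
    4: "command was correct but could not be performed",
    5: "command unimplemented, incorrect, or a serious error occurred",
}


def classify_nntp_response(status_line: str) -> tuple[int, str]:
    """Return (code, category) for a status line such as '200 server ready'."""
    token = status_line.strip().split(" ", 1)[0]
    if len(token) != 3 or not token.isdigit():
        raise ValueError(f"not an NNTP status line: {status_line!r}")
    code = int(token)
    category = CATEGORIES.get(code // 100)
    if category is None:
        raise ValueError(f"unknown response category for code {code}")
    return code, category


if __name__ == "__main__":
    print(classify_nntp_response("400 service discontinued"))
```

A monitor could compare the returned code with the value entered in the Server Response field.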
237. e the loadpluginmonitor script, which is found in your up.time scripts directory. 3. In a command-line shell, change to the <UP.TIME HOME>/scripts directory and locate the loadpluginmonitor script. 4. Run the loadpluginmonitor script with a single argument that points to the location and name of the plug-in monitor you downloaded. The plug-in monitor will be installed in a subdirectory under the scripts directory. The installation directory is determined by the plug-in monitor's XML file. 5. Run the up.time GUI. 6. Click Services on the up.time tool bar. 7. Click Add Service Instance in the Tree panel. The Add Service Monitor window appears. 8. In the Advanced Monitors section, you will see the plug-in monitor you added to up.time. You can now select and configure the plug-in monitor. CHAPTER 15: Configuring Users. This chapter describes the up.time user management functions in the following sections: Working with User Roles 334; Working with Users 337; Working with User Groups 341; Managing Distribution Lists
238. ecks: The SQL Server Basic Checks monitor compares the performance of SQL Server databases and instances running on a system to the thresholds that you define. The SQL Server Basic Checks monitor does the following: • determines whether or not SQL Server is running on your system • checks whether or not SQL Server is listening on a specific port • determines whether or not SQL Server can process queries • checks for values in base and computed tables. You can use regular expressions to identify a wide range of responses and to detect problems after they occur. You can also run scripts through up.time to alert you when a database component that is being monitored is not performing as required. To properly configure this monitor, you should have a strong knowledge of regular expressions, Transact-SQL, and SQL Server. Configuring SQL Server Basic Checks Monitors: To configure SQL Server monitors, do the following: 1. In the SQL Server Basic Checks monitor template, complete the monitor information fields. To learn about monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: • SQL Server Port: The number of the port on which SQL Server is listening. SQL Server uses Static Port Allocation or Dynamic Port Allocation ports. For more information, see the Knowledge Base article SQL Server Ports. • P
239. ecting data every 30 seconds. The graph displays a 10 second history. If the status is Suspect or Bad, determine which thread or module is using the most CPU cycles and take appropriate action, including the following: • unloading and reloading the module • reporting problems to the vendor of the module • loading an updated module. To determine which thread or module is using the most CPU cycles, do the following: 1. In Novell NRM, click Profile / Debug. 2. Do one of the following: • View the Execution Profile Data by Thread data. • Click Profile CPU Execution by NLM. Connection Usage: up.time monitors connections on a per-server basis. NRM displays only the following metrics: • the number of connections that are being used • the peak number of connections used on this server. Available Memory: This statistic enables you to view the amount of memory that is not allocated to any service. Most, if not all, of this memory is used by the file system cache. When available memory gets too low, modules might not be able to load, or file system access might become sluggish. DS Thread Usage: This statistic enables you to view the number of server threads that Novell eDirectory uses. The server thread limit ensures that threads are available for other functions as needed, for example, when large nu
240. ed Agent Port Number. For systems running Net-SNMP, this setting is labelled SNMP Port, and for Novell NRM version 6.5 systems, this setting is labelled Novell NRM Port Number.
• User Name and Password: For Novell NRM systems, the user name and password that are required to access the system.
• Username: The name that is required to connect to the instance of Net-SNMP v3.
• Authentication Password: The password that is required to connect to the instance of Net-SNMP v3.
• Authentication Method: The method by which encrypted information travelling between the Net-SNMP instance and up.time will be authenticated.
• Privacy Password: The password that will be used to encrypt information travelling between the instance of Net-SNMP v3 and up.time.
• Privacy Type: The method by which information travelling between the instance of Net-SNMP v3 and up.time will be encrypted.
• Use SSL (HTTPS): Select this option if the Platform Performance Gatherer will securely communicate with the host using SSL (Secure Sockets Layer).
• Check Interval: The frequency, in minutes, at which the host will be checked. If the Check Interval is longer than the Alert Interval, the following message appears:
  Warning: The alert interval is less than the check interval. up.time will only send alerts after performing checks.
5. Click Save.
241. ed Maintenance Profiles do the following 1 On the up time tool bar click Services 2 Inthe Tree panel click View Maintenance Profiles 3 Inthe Services subpanel click the name of the Maintenance Profile that you want to view The scheduled Maintenance Profile appears in the Services subpanel and contains the following information e the name of the profile e the time period over which the profile is applied to a system or service the names of the systems and services if any to which the profile has been applied Scheduling Maintenance for a Host 162 To schedule maintenance for a host do the following 1 On the up time tool bar click Services In the Tree panel click Host Maintenance Windows Click the Assign Maintenance to Host tab in the subpanel RR WO N In the Host Maintenance window select the Maintenance Profile to use from the Maintenance profile dropdown list If you have not created a Maintenance Profile the message No profiles exist appears in the dropdown list 5 Select one or more systems from the Available Host list The hosts that you select will be the hosts to which the Maintenance Profile applies 6 Click Add and then click Save up time 5 User Guide f up time Scheduling Maintenance Scheduling Maintenance for a Service To schedule maintenance for a service do the following 1 RR OO N On the up time tool bar click Services In the Tree panel click Service Maintena
242. ed Process Information 525

Configuring and Managing up.time
[...] 528
Modifying up.time Config Panel Settings 529
Modifying uptime.conf File Settings 529
Stopping and Restarting up.time Services 530
Interfacing with up.time 532
Database Settings 532
Monitoring Station Web Server 534
SMTP Server 534
Configuring Global Data Collection Methods 536
RSS Feed Settings 537
VMware vCenter Orchestrator Integration 539
Web Application Monitor Proxy Settings 540
Remote Reporting Settings 541
User Interface Instance Settings 542
Scrutinizer Settings 542
[...] 543
Archiving the DataStore 545
Archive Categories 546
Configuring an Archive Policy
243. ed but which are not currently being used If the available ECB count is zero the server will become sluggish until enough ECBs are created to fill the demand The server will recover as long as the number of Packet Receive Buffers does not increase to the maximum that can be allocated LAN Traffic This statistic shows whether or not your server can transmit and receive packets If this statistic returns a Good status the server is able to accept or transmit packets through the network board If the status is Bad the network board is not transmitting or receiving packets All servers should be able to transmit or receive packets If your server is not transmitting your LAN is not functioning properly Check the drivers and protocol bindings for the network board on the server If the drivers and protocol bindings are functioning properly then the network board is probably faulty If the network board is functioning you should perform a diagnostic on your LAN Available Disk Space This statistic enables you to view the status of the available disk space on all mounted volumes on a server This statistic returns the following statuses Disk Throughput This statistic enables you to view the status of amount of the data that is being read from and written to the storage media on this server up time software 91 ainjonsjseajuy INOA Huibheuew pue buiuijeg FI Defining and Managing Your Infrastructure Working with Systems If this st
244. een created but which are not currently being used e LAN Traffic Whether or not the NRM system can transmit and receive packets up time software 521 Using Graphs Novell NRM Graphs Available Disk Space The status of the available disk space on a server Disk Throughput The status of amount of the data being read from and written to the storage media on the server Connection Usage The number of connections that are being used and the peak number of connections used on this server For more information about Novell NRM systems see Novell NRM Systems on page 86 Generating a Novell NRM Graph To generate a Novell NRM graph do the following 1 522 In the Global Scan or My Infrastructure panel click the name of the Novell NRM system whose information you want to graph Novell NRM systems are denoted by this icon N In the Tree panel click the Graphing tab and then click one of the metrics in the list Select the start and end dates and times for which the graph will chart data For more information see Understanding Dates and Times on page 22 Click Generate Graph up time 5 User Guide hy up time Instance Motion Graphs Instance Motion Graphs The VMware VMotion tool enables you to move ESX instances from one server to another without any downtime or loss of data You would use VMotion to for example move an instance to newer and faster hardware or to temporarily relocate th
245. efault, generated reports are cached on the Monitoring Station for 30 days; additionally, the location for published reports is also on the local Monitoring Station file system. Both options can be modified. In the latter case, automatically publishing reports to a publicly accessed directory on the network is an ideal way for non-IT staff to view them. See Saving Reports to the File System on page 404 for more information.

Changing the Number of Days Reports Are Cached

You can change a report's expiry time limit by manually inputting settings in the up.time Configuration panel, as outlined in Modifying up.time Config Panel Settings on page 529. Change the expiry limit through the following parameter (the default value is shown):

reportCacheExpiryDays=30

Changing the Published Report Location

The published report location can be modified with the following uptime.conf parameter:

publishedReportRoot=<location>

If the intended published report directory is on a system other than the Monitoring Station, the provided location should be a full network path to the system, in addition to the directory path on that system.

Resource Usage Report Generation

Due to the large number of options available for the Resource Usage report gener
246. efining the report Generating a File System Service Time Summary Report To generate a File System Service Time Summary report do the following 1 In the Reports Tree panel click File System Service Time Summary 2 Inthe Date and Time Range area select the dates and times on which to report For more information see Understanding Dates and Times on page 22 If no data available for the date range the report displays a message indicating that there is no data for the time period up time software 449 Using Reports 450 Reports for Capacity Planning To only include data from certain hours during the day select those hours from the dropdown lists in the Daily Hours section as shown below Daily Hours Include data samples between these hours only End 21 00 z For example if you want to report to cover the hours from 8 00 a m to 6 00 p m select 8 00 from the Start dropdown list and 18 00 from the End dropdown list Select one of the following options from the Primary Sort by dropdown list to sort the results that up time returns e System Name e Disk e High Service Time the default e Low Service Time e Average Service Time e High Percentile Select Ascending or Descending from the associated dropdown list Optionally do the following e Select another sort criteria from the Secondary Sort by dropdown list e Select Ascending or Descending from the associated dropdow
247. el commands Wait I O The percentage of time that a process which can be run must wait for a device to perform an I O operation SMTX The number of read or write locks that a thread was not able to acquire on the first attempt as reported by the mpstat command While it is trying to acquire locks the thread is active but is not performing any tasks XCAL The number of interprocess cross calls In a multi processor environment one processor sends cross calls to another processor to get that processor to do work Cross calls can also be used to ensure consistency in virtual memory Heavy file system activity such as NFS can result in a high number of cross calls Interrupts The number of CPU interrupts For Windows systems this option is Interrupt Time Interrupts are a mechanism that a device uses to signal to the kernel that it needs attention and that immediate processing is required on its behalf up time 5 User Guide f up time Multi CPU Usage e Interrupts sec On Windows systems rate at which CPU handles interrupts from applications or hardware each second If the value for Interrupts sec is high there could be problems with the hardware on the system e Total On Solaris systems the total amount of User System and Wait I O On Windows systems this option is Total and is the total amount of User Time Privileged Time and Interrupt Time 6 Select the CPUs to graph f
248. elect one of the following e Show all non ranged metrics on one chart This option combines all of the variables you selected in one chart Any ranged metrics will appear in their own charts 426 up time 5 User Guide hy up time Reports for Performance and Analysis e Display charts as stacked area Each chart in the report will have two or more data series stacked on top of each other rather than the line graph that usually appears in the report 9 To save the report do the following e Enter a name for the report in the Save to My Portal As field e Optionally enter text in the Description field e Click Save Report The report parameters are saved to the My Portal panel Doing this does not generate the report 10 To schedule the saved report to run at a specific time or interval click the Scheduled checkbox See Saving Reports on page 404 and Scheduling Reports on page 407 for more information up time software 427 Using Reports Reports for Capacity Planning Reports for Capacity Planning The following reports enable you to visualize the resource usage of systems in your up time environment and then use that information to better plan deploy and consolidate your server resources e Enterprise CPU Utilization Report e File System Capacity Growth Report e Server Virtualization Report e Solaris Mutex Exception Report Network Bandwidth Report e Disk I O Bandwidth Report e CPU Run Queue Thres
249. elect the cpu statistic Otherwise an error message appears when you try to generate the report 8 Optionally in the Architectures to exclude field enter either the name of a system architecture or a regular expression that up time will use to ignore certain system architectures when generating the report For example if you want to exclude all Solaris systems from the report enter SunOS in the field up time determines the architecture of a system by 3 checking the output of the uname a command on UNIX or Linux or by analyzing one or both of the following Windows registry keys HKEY_LOCAL_MACHINE Software Microsoft WindowsNT CurrentVersion HKEY_LOCAL_MACHINE Software Microsoft Windows CurrentVersion 9 If you want to generate reports for systems in specific groups select the groups from the List of Groups area 10 To generate reports for one or more views select the groups from the List of Views area See Working with Views on page 108 for more information about views 11 If you are generating reports for specific systems in your environment select them from the List of Systems up time software 421 Using Reports Reports for Performance and Analysis 12 Select a report generation option See Report Generation Options on page 402 for details 13 To save the report or schedule it to run at a specific time or interval complete the settings in the Save Reports section of the su
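The "Architectures to exclude" behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `filter_architectures` function, the `arch` dictionary key, and the sample hosts are hypothetical, and the guide does not say whether up.time matches the expression against the whole architecture string or any part of it (the sketch assumes a partial match).

```python
import re

def filter_architectures(systems, exclude_pattern):
    """Return the systems whose architecture string does not match exclude_pattern.

    systems: list of dicts with hypothetical keys "name" and "arch"
             (for example, the text reported by `uname -a`).
    exclude_pattern: a plain name such as "SunOS" or a regular expression.
    """
    if not exclude_pattern:
        return list(systems)
    pattern = re.compile(exclude_pattern)
    # Assumption: a match anywhere in the architecture string excludes the system.
    return [s for s in systems if not pattern.search(s["arch"])]

# Hypothetical sample data for illustration only.
hosts = [
    {"name": "web1", "arch": "Linux web1 2.6.18 x86_64 GNU/Linux"},
    {"name": "db1", "arch": "SunOS db1 5.10 Generic sun4u sparc"},
    {"name": "app1", "arch": "AIX app1 3 5 00C0FFEE"},
]
remaining = filter_architectures(hosts, "SunOS")
```

Entering a pattern such as `SunOS|AIX` would exclude both architectures in one expression, which is the main reason to use a regular expression instead of a single name.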
250. elow

[Screenshot: Daily Hours section, "Include data samples between these hours only"]

For example, if you want the report to cover the hours from 1:00 a.m. to 1:00 p.m., select 1:00 from the Start dropdown list and 13:00 from the End dropdown list.
4. Optionally, enter a value in the Highlight ratios over threshold field. Any ratios that exceed the value in this field will be highlighted in the report. For example, if you enter 2 and a server returns a ratio of 3.5, that ratio is highlighted.
5. If you want to generate reports for groups of systems, select the groups from the List of Groups area.
6. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views.
7. If you are generating reports for specific systems in your environment, select them from the List of Systems.
8. Select a report generation option. See Report Generation Options on page 402 for details.
9. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information.

Wait I/O Report

The Wait I/O report enables you to determine the amount of time that processes spend waiting on I/O from a system device.
251. ement Tuning

Resource Scan Threshold Settings

You can modify the Resource Scan threshold settings through the following parameters (default values are shown):

• resourcescan.cpu.warn=70
  The Warning-level range in the CPU Usage gauge begins at this value (70) and ends at the Critical-level range.
• resourcescan.cpu.crit=90
  The Critical-level range in the CPU Usage gauge is between this value (90) and 100.
• resourcescan.memory.warn=70
  The Warning-level range in the Memory Usage gauge begins at this value (70) and ends at the Critical-level range.
• resourcescan.memory.crit=90
  The Critical-level range in the Memory Usage gauge is between this value (90) and 100.
• resourcescan.diskbusy.warn=70
  The Warning-level range in the Disk Busy gauge begins at this value (70) and ends at the Critical-level range.
• resourcescan.diskbusy.crit=90
  The Critical-level range in the Disk Busy gauge is between this value (90) and 100.
• resourcescan.diskcapacity.warn=70
  The Warning-level range in the Disk Capacity gauge begins at this value (70) and ends at the Critical-level range.
• resourcescan.diskcapacity.crit=90
  The Critical-level range in the Disk Capacity gauge is between this value (90) and 100.

Platform Performance Gatherer Check Intervals

The Platform Performance Gatherer is a core performance monitor that resides on all agent based Elements
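As a rough illustration of how a Warning threshold of 70 and a Critical threshold of 90 divide a 0 to 100 gauge into ranges, here is a minimal Python sketch. The `gauge_level` function and its level names are hypothetical, and treating a reading exactly equal to a threshold as belonging to the higher range is an assumption; the guide does not specify the boundary behaviour.

```python
def gauge_level(value, warn=70, crit=90):
    """Classify a gauge reading (0-100) against Warning and Critical thresholds.

    Assumption: the Warning range is [warn, crit) and the Critical
    range is [crit, 100]; anything below warn is treated as normal.
    """
    if not 0 <= value <= 100:
        raise ValueError("gauge readings are expected to be between 0 and 100")
    if warn >= crit:
        raise ValueError("the Warning threshold must be lower than the Critical threshold")
    if value >= crit:
        return "critical"
    if value >= warn:
        return "warning"
    return "ok"

# Readings checked against the default thresholds (70 and 90).
levels = [gauge_level(v) for v in (42, 70, 89.9, 90, 100)]
```

Raising the Warning value narrows the Warning range from below, because that range always ends where the Critical range begins.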
252. ement you may obtain one or more temporary license keys and permanent license keys using the procedure set forth on Uptime s web site including but not limited to payment of Uptime s license fees The license fees paid by you are paid in up time software 579 JUudouIddJby su 17 gl NOTICE TO USER consideration of the license granted under this Agreement Uptime does not refund license fees By accepting this Agreement you fully understand that once license fee payment is made to Uptime you will have no recourse for receiving a refund of any part of the fees 4 Term and Termination This Agreement is effective upon your acceptance of the Agreement or upon your downloading accessing and using the Software even if you have not expressly accepted this Agreement This Agreement shall continue in effect until terminated Without prejudice to any other rights this Agreement will terminate automatically if you fail to comply with any of the limitations or other requirements described herein If you have a temporary key and fail to pay the applicable license fees for continuation of use the key will expire You may terminate this License Agreement at any time by 1 providing written notice of your decision to terminate the Agreement to Uptime and ii either returning the Software Documentation all copies thereof and all license keys that you have obtained to Uptime or destroying all such materials and providing written verificatio
253. emory that is being used by a user group or process On Windows systems RSS is called Working Set Workload graphs that are generated for SNMP agents only 3 chart the Memory Size metric up time software 505 Using Graphs Workload Graphs Generating a Workload Graph To generate a workload graph do the following 1 5 In the Global Scan or My Infrastructure panel click the name of the system whose information you want to graph In the Tree panel click the Graphing tab Click one of the following options e Workload User e Workload Group e Workload Process Name Select the start and end dates and times for which the graph will chart data For more information see Understanding Dates and Times on page 22 Click one of the following metrics e CPU e Memory Size or Virtual Bytes on UNIX and Windows respectively e RSS or Working Set on UNIX and Windows respectively You can only graph one metric at a time Select one or more of the available users groups or processes from the list If you are generating a workload graph by processes 1 e Workload Process Name graph enter a regular expression in the Process Selection Regex field to automatically add matching process names for graphing and avoid dealing with ungainly lists of system processes The list of available process will vary by server and by 7 506 operating system Click Add up time 5 User Guide f up ti
254. empts to promptly find and delete test emails, network issues may prevent timely cleanups. To avoid potential Inbox clutter, it is recommended that you create a dedicated test email account as the destination address.
• Delivery Time: Enter the Warning and Critical Delivery Time thresholds. The smallest unit of time used for these thresholds is seconds. Given the speed at which SMTP servers should finish processing an outgoing email, it is recommended that you set the Warning threshold to one second.

Complete the Incoming Email Settings:
• POP3/IMAP Hostname: Provide the name or IP address of the mail server.
• POP3/IMAP Port: Provide the port used to communicate with the mail server. Leave this field blank to use the default POP3 or IMAP port (110 and 143, respectively).
• POP3/IMAP Username: Provide the login name for the destination email account.
• POP3/IMAP Password: Provide the password for the destination email account.
• POP3/IMAP Uses SSL: Specify whether the mail server sends and receives encrypted communication using SSL.
• Retrieve Time: Enter the Warning and Critical retrieval time thresholds. The smallest unit of time used for these thresholds is seconds, and the monitor checks for receipt of the test mail in five-second intervals. Enter values in multiples of five.

Enter Warning and Critical level thresholds for the overall response time of the monitor. Ent
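Two of the rules above lend themselves to a short worked example: a blank port field falls back to the default POP3 or IMAP port (110 or 143), and Retrieve Time thresholds should be multiples of five because the monitor polls in five-second intervals. The Python sketch below is illustrative only; `resolve_port` and `round_up_to_poll_interval` are hypothetical helper names, not part of up.time.

```python
DEFAULT_PORTS = {"POP3": 110, "IMAP": 143}  # defaults stated in the guide

def resolve_port(protocol, port_field):
    """Return the port to use, falling back to the protocol default when blank."""
    if port_field is None or str(port_field).strip() == "":
        return DEFAULT_PORTS[protocol.upper()]
    return int(port_field)

def round_up_to_poll_interval(seconds, interval=5):
    """Round a threshold up to the next multiple of the polling interval."""
    if seconds <= 0:
        raise ValueError("thresholds must be positive")
    # Ceiling division using integer arithmetic.
    return -(-seconds // interval) * interval

pop3_port = resolve_port("pop3", "")       # blank field: use the default
imap_port = resolve_port("IMAP", "993")    # explicit value is kept
warning_threshold = round_up_to_poll_interval(12)
```

Rounding up instead of down keeps a threshold from being stricter than the value originally intended.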
255. en rolled back LiveCount The number of servlet sessions that are currently cached in memory PoolSize The average number of threads in the servlet connection thread pool TimeSinceLastActivated The difference in milliseconds between the previous and current access time stamps of a servlet session This counter does not include session time out values Before up time can start collecting performance data from a WebSphere server you must deploy the WebSphere performance servlet up time software 213 LL gt xo O o fe 5 e Oo 7 Application Monitors WebSphere Deploying the WebSphere Performance Servlet 214 The WebSphere performance servlet uses WebSphere s Performance Monitor Interface PMI infrastructure to retrieve performance information from a WebSphere Application Server The information that the servlet collects is saved to an XML file By default the PMI is enabled on the WebSphere server and is set to collect the performance metrics that up time supports Before up time can begin collecting information from a WebSphere server you must deploy the performance servlet in the WebSphere directory that contains your Web application The following steps must be completed for each Web application server that you want to monitor with up time To deploy the performance servlet do the following 1 On the WebSphere server locate the following file install_root perfS
256. [...] 347
Viewing Notification Groups 348
Editing Notification Groups 348
Changing How Users Are Authenticated 349
Active Directory Authentication 349
LDAP Authentication 352
up.time DataStore Authentication 354

Working with Service Level Agreements
Overview 358
SLAs, Service Monitors, and SLOs 359
Viewing Service Level Agreements 360
Viewing SLA Status 360
Viewing SLA Details 360
SLA Compliance Calculation 363
Reporting SLA Status 363
Handling Simultaneous Service Downtime 364
A Note About SLOs and Compliance 365
SLA Creation Strategies 366
Setting Up and Gathering Data for Monitors 366
Identifying Outages and Improvable Performance 366
Developing Baselines 368
Working with SLA Reports
257. enn 344 Working with Notification Groups rnnrmnrennnannnrnnnnr rennene 347 Changing How Users Are Authenticated ammnnrnnnnnnnaannnnnr 349 333 Configuring Users Working with User Roles Working with User Roles User roles define the following e whata user will see when they log in to the up time Monitoring Station e the items that a user can add view edit or delete when using the Monitoring Station The user roles that you create should reflect that needs of the users to whom the roles will apply For example a user who only needs to generate graphs and reports does not need to be able to view or add accounts for other up time users Adding User Roles To add user roles do the following 1 On the up time tool bar click Users 2 Inthe Tree panel click Add New User Role The Add User Role window appears 3 Type a name for this role in the Name of User Role field This name will appear in the up time Web interface 4 Optionally type a short description in the Description of User Role field 5 In the first Permissions area of the Add User Role window you assign the user permissions to View Add Edit or Delete the following items by clicking the checkbox beside each item e Users e Elements e Services Element Groups e Action Profiles e Alert Profiles 334 up time 5 User Guide f up time Working with User Roles e Time Periods e Service Level Agreements e Element Views 6 Optionall
258. ent If on the other hand the LPAR has a soft entitlement one which can use spare processing power from another CPU on the server and its CPU usage is consistently at or greater than the entitlement you can increase it up time software 477 Using Reports Reports for Virtual Environments 478 up time 5 User Guide f up time CHAPTER 20 Understanding Graphing This chapter introduces the graphing features of up time in the following sections Graphing in up timMe rrannnvnvnnavvvvvv va nn AEREN A EIEEE ARRERA i 480 Using the Graph Editor rrnnnaannnnnnnnrrnn naa na annnrrannnnnnennnner 482 479 Understanding Graphing Graphing in up time Graphing in up time You can graph performance information to learn about the behavior of a system in your environment Graphs visualize information about CPU memory and process usage as well as network disk and user activity For more information about specific graphs see Using Graphs on page 487 up time can generate performance data graphs in two ways e In Internet Explorer the graph is generated using an ActiveX graphing control as shown below lab t1 4 10 1 1 234 CPU Usage Date Range Wed Apr 16 00 00 00 EDT 2008 Wed Apr 16 16 37 1 Microsoft Internet r ioj xj Show Editor dialog Show Print Preview Export chart 7 30 upitime 5 lab t1 4 10 1 1 23 CPU Usage Date Range Wed Apr 16 00 00 00 EDT 2 ed Apr 16 16 37 19 EDT 2008 10
259. ent field This name will appear in both the My Infrastructure and Global Scan panels Optionally enter a description for the SLA in Description of Service Level Agreement field Although this step is optional this description will appear in generated SLA reports therefore it is recommended that you provide a detailed description of the SLA including what it is meant to accomplish and of which SLOs it consists Optionally select the group of systems in your up time environment with which this system will be associated from the Parent Group dropdown list By default the SLA is added to the My Infrastructure group For more information on groups see Working with Groups on page 105 If it is not continuous i e 24x7 enter a Monitoring Period during which the SLA s compliance will be measured You will need to create a time period definition e g Every Mon Sat 8AM 6PM See Monitoring Periods on page 397 and Time Period Definitions on page 567 for more information If it is not the default 99 0 enter a Target Percentage against which the SLA s compliance will be measured Ensure you have selected the correct Compliance Period Type from the dropdown list Indicate whether scheduled system maintenance will count as downtime Click Save Once saved the SLA s Service Level Agreement General Information subpanel is displayed see Viewing SLA Details on page 360 for more informati
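To make the relationship between a Monitoring Period and a Target Percentage concrete, the following Python sketch works through the arithmetic for the example period above (Monday to Saturday, 8 AM to 6 PM) and the default target of 99.0. It assumes compliance is simply the share of monitored time without downtime; the function names are hypothetical, and up.time's actual calculation is described elsewhere in the guide (SLA Compliance Calculation) and may differ.

```python
def allowed_downtime_minutes(monitored_hours, target_percentage=99.0):
    """Minutes of downtime that still meet the target over the monitored hours."""
    return monitored_hours * 60 * (1 - target_percentage / 100)

def compliance_percentage(monitored_minutes, downtime_minutes):
    """Share of monitored time, as a percentage, during which the service was up."""
    return 100 * (monitored_minutes - downtime_minutes) / monitored_minutes

# Monday to Saturday, 8 AM to 6 PM: 6 days x 10 hours = 60 monitored hours per week.
weekly_hours = 6 * 10
budget = allowed_downtime_minutes(weekly_hours)          # roughly 36 minutes
achieved = compliance_percentage(weekly_hours * 60, 45)  # 45 minutes of downtime
meets_target = achieved >= 99.0
```

Under these assumptions, 45 minutes of downtime in one week exceeds the budget of about 36 minutes, so the week would fall short of a 99.0 target.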
260. ent for AIX, use the following command:

tar xvf uptmagnt AIX <version> tar

Type the following command at the command line:

INSTALL.sh

Follow the prompts to complete the installation.

Installing Agents on Linux

You can install up.time agents for Linux using the RPM utility or the Debian package management utility (dpkg). This enables you to easily update and perform mass installations of agents. Before trying to install an agent, ensure that the RPM or dpkg utilities are installed and are in the path by typing one of the following commands at the command line:

which rpm
which dpkg

To install an agent on a Linux system, do the following:
1. Log into the system as user root.
2. Using telnet or FTP, transfer the .rpm or .deb file containing the agent to the system.
3. If you are installing the agent using the RPM utility, type the following at the command line:
   rpm -i <agent name>
   Where <agent name> is the name of the .rpm file for the agent that you are installing. For example: uptimeagent 4 0 rpm
4. If you are installing the agent using the dpkg utility, type the following at the command line:
   dpkg -i <agent name>
   Where <agent name> is the name of the .deb file for the agent that you are installing. For example: uptimeagent 4 0 deb

Installing Agents on IBM pSeries Servers
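The pre-installation check (`which rpm`, `which dpkg`) and the choice between `rpm -i` and `dpkg -i` can be sketched in Python as follows. This is a minimal illustration, not an up.time tool: the function names and the `agent.rpm` file name are hypothetical, and the sketch only builds the command without running it.

```python
import shutil

def detect_package_tool():
    """Return "rpm" or "dpkg" if one is on the PATH, otherwise None."""
    for tool in ("rpm", "dpkg"):
        if shutil.which(tool):  # same idea as `which rpm` / `which dpkg`
            return tool
    return None

def build_install_command(package_file):
    """Build the install command for a package, based on its file extension."""
    if package_file.endswith(".rpm"):
        return ["rpm", "-i", package_file]
    if package_file.endswith(".deb"):
        return ["dpkg", "-i", package_file]
    raise ValueError("expected a .rpm or .deb file: " + package_file)

available_tool = detect_package_tool()
command = build_install_command("agent.rpm")  # hypothetical file name
```

Returning the command as a list makes it straightforward to pass to `subprocess.run` when scripting a mass installation as root.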
261. enter the following in the Exclude File Systems and Exceptions fields e The name of the file system e A regular expression See Using Regular Expressions on page 442 for more information 7 To generate reports for groups of systems select the groups from the List of Groups area 8 To generate reports for one or more views select the groups from the List of Views area up time software 443 Using Reports Reports for Capacity Planning 444 See Working with Views on page 108 for more information about views 9 If you are generating reports for specific Applications in your environment select them from the List of Entities 10 Select a report generation option See Report Generation Options on page 402 for details 11 To save the report or schedule it to run at a specific time or interval complete the settings in the Save Reports section of the subpanel See Saving Reports on page 404 and Scheduling Reports on page 407 for more information Using the Disk I O Bandwidth Report The following is an example of a Disk I O Bandwidth report Disk 1 0 Bandwidth Report Date Range 2008 04 17 00 00 00 to 2008 04 17 11 27 36 between the time range 00 00 to 23 59 Output displayed in megabytes 512 bytes per block Hostname Disk Name File System I O Total AIX DEV LPAR 10 1 1 57 hdiskO 265 MB AIX QA LPAR 10 1 1 56 hdisko 1 176 MB AIXS aix5I hdisko 201 MB hdisk1 0 MB 0 MB ELinux elinux 1
262. er interface. Any Monitoring Station that is accessed by users or administrators requires a URL. The Web address used to access the Monitoring Station is configured through the following uptime.conf parameter:

httpContext=http://<hostname>:<port>

• <hostname> is the host name of the server on which up.time is running (e.g., localhost)
• <port> is the port on which the up.time Web server is listening for requests (e.g., 9999); you can optionally omit the port number

If the up.time interface is being accessed via SSL, the value for this parameter should be stated as https instead of http.

SMTP Server

up.time uses a mail server to send alerts and reports to its users. After installing up.time for the first time, the administrator was asked to enter SMTP server information. These initial values can be modified in the Mail Servers configuration panel.

Modifying the SMTP Server Used by up.time

To configure up.time's mail server, do the following:
1. On the up.time tool bar, click Config.
2. In the Tree panel, click Mail Servers.
3. In the sub panel, click Edit Configuration.
4. Type the name of the mail server in the SMTP Server field.
   This value was set the first time the up.time administrator logged in after installation; the default value is the name of the host on which the Monitoring Station was installed at that time. T
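The rules for the httpContext value (a host name, an optional port, and https when the interface is accessed via SSL) can be illustrated with a small Python helper. The `build_http_context` function is hypothetical, and the exact `name=value` and `scheme://host:port` layout is an assumption reconstructed from the parameter description, since the punctuation is not preserved in this excerpt.

```python
def build_http_context(hostname, port=None, use_ssl=False):
    """Build an httpContext line for uptime.conf (assumed name=value layout)."""
    if not hostname:
        raise ValueError("a host name is required")
    scheme = "https" if use_ssl else "http"
    # The port is optional; omit it when none is given.
    authority = hostname if port is None else "{}:{}".format(hostname, int(port))
    return "httpContext={}://{}".format(scheme, authority)

plain = build_http_context("localhost", 9999)
secure = build_http_context("localhost", use_ssl=True)
```

Keeping the scheme choice in one place avoids the common mistake of enabling SSL on the Web server while leaving the parameter stated as http.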
263. er alias name that the SQL Server version 7 0 client components can use to connect to a named instance 263 ZL U D iy Oo o 2 D Oo e 7 Database Monitors SOL Server Basic Checks A computer can concurrently run any number of named instances of SQL Server An instance name cannot exceed 16 characters Database The name of the SQL Server database that you want to monitor up time views each database along the path lt system gt lt instance gt lt database gt Each instance of SQL Server has four system databases master model tempdb and msdb and one or more user databases Depending on their permissions users can access some or all of the databases in an instance A connection to an instance is associated with a particular database on the server called the current database You can switch from one database to another using the Transact SQL USE database_name statement up time gathers information from all of the databases in all instances on a system and aggregates this information in the metrics it returns to you Unless you must identify a particular database on your system for example you have applied a name to the default instance you should leave the Database field blank Script File Click the Script File check box and then enter the full path on the Monitoring Station to the script that this monitor will run against the database you configured your data
264. er for each processor on systems with multiple CPUs. The value returned by the counter represents the sum of processor time on a specific processor. To determine the average for all processors, monitor the System Total Processor Time metric. Optionally, you can monitor the following metrics. Processor Privileged Time: The percentage of time that the CPU spends executing Windows kernel commands. If this metric is consistently high, you should consider using a faster or more efficient disk subsystem. Processor User Time: The percentage of time that the CPU spends executing user processes. Processor Interrupt Time: The time that the CPU spends managing hardware requests. This metric enables you to determine the level of device activity. System Processor Queue Length: The number of threads that are waiting for processor time. CPU Usage in UNIX and Linux: In UNIX and Linux, up.time graphs the following metrics. User Time (per CPU): The amount of time that the CPU spends in user mode. During user time, the CPU is processing application threads or threads that support tasks which are specific to applications. System Time (per CPU): The amount of time that the kernel spends processing system calls. If all of the CPU time is spent in system time, there could be a problem with the system kernel, or the system is spending too much time processing I/O int
265. er the Warning and Critical Response Time thresholds. An alert is generated with this threshold if the combined email delivery and response time exceeds the supplied values. For more information, see Configuring Warning and Critical Thresholds on page 144. 5. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148); Alert Settings (see Monitor Alert Settings on page 148); Monitoring Period settings (see Monitor Timing Settings on page 146); Alert Profile settings (see Alert Profiles on page 381); Action Profile settings (see Action Profiles on page 389). 6. Click Finish. Diagnosing and Reporting Email Delivery Problems: If the Email Delivery monitor reaches a Critical state, the first investigation step is to review the message produced by up.time. In the System Status panel, view the message belonging to the system to which the monitor is attached, which should point you in the right direction. For example, the status message below indicates the monitor reached a critical state because the retrieval time from an external POP3 server exceeded the defined threshold; your SMTP server is most likely not responsible for the delay.
266. ere host: The host name of the server that is running up.time. port: The up.time port on the server (usually 9996). status: The status of the service being monitored. See Overview on page 322 for more information. message: A human-readable diagnostic message. monitorName: The name of the service monitor to which the output will be returned. Before using an External Check monitor, contact uptime software Client Care for assistance. You will need specific instructions for configuring this monitor, depending on the nature of the applications that will be generating asynchronous events for up.time. Configuring External Check Monitors: To configure External Check monitors, do the following: 1. In the External Check monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148); Alert Settings (see Monitor Alert Settings on page 148); Monitoring Period settings (see Monitor Timing Settings on page 146); Alert Profile settings (see Alert Profiles on page 381); Action Profile settings see Action Profiles on page 389 for mo
267. errupts. Wait I/O Time (per CPU): The amount of time that a runnable process waits for a device to perform an I/O operation. Wait I/O problems are frequently related to problems with a disk. Run Queue Length: The Run Queue Length graph counts the number of processes that are not currently running and which are waiting to be served by the CPU. If several processes are trying to use CPU time, you might need to install a faster processor, or add another processor if you are using a multiprocessor system. A long queue increases the time that a request waits before it is carried out by the CPU. However, it does not affect the time that is required to process each request once the CPU starts carrying out the request. up.time counts the number of processes that are waiting in the queue at a particular point in time. If the run queue or load average is greater than four times the number of CPUs, then processes must wait too long for the CPU to process the requests. Run Queue Occupancy: The Run Queue Occupancy graph charts the percentage of time that one or more services or processes are waiting to be served by the CPU. If the run queue occupancy is close to 100% and the run queue length is considered low, the CPU is not necessarily overloaded. While there may always be services waiting to be processed, the CPU may still be able to quickly process them. If the run queu
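The rule of thumb in this excerpt (a run queue or load average greater than four times the number of CPUs means processes wait too long) can be written as a one-line check. This is only a sketch of that arithmetic; the function name run_queue_too_long is hypothetical and is not part of up.time.

```python
def run_queue_too_long(run_queue_length: float, cpu_count: int) -> bool:
    """Apply the excerpt's rule: the queue is too long when it exceeds 4 x CPUs.

    Hypothetical helper for illustration only.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    # "Greater than four times the number of CPUs" is a strict comparison.
    return run_queue_length > 4 * cpu_count


if __name__ == "__main__":
    # A two-CPU system has a limit of 8 queued processes under this rule.
    print(run_queue_too_long(9, 2))
    print(run_queue_too_long(8, 2))
```

Because the comparison is strict, a queue of exactly four times the CPU count is still treated as acceptable.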
268. ers are found in the uptime.conf file. Modifying up.time Config Panel Settings: Configuration parameters that are not directly tied to, and thus do not require a restart of, the up.time Core service can be modified directly in the up.time GUI, shown below. [Figure: up.time Configuration panel] In general, to edit these configuration settings in the up.time interface, do the following: 1. On the up.time tool bar, click Config. 2. In the Tree panel, click up.time Configuration. 3. Enter the configuration variable and new value. 4. Click Update to save your changes. Only the variables whose default values have been modified appear in up.time Configuration. Modifying uptime.conf File Settings: Configuration parameters that are directly tied to the up.time Core service are found in the uptime.conf file. uptime.conf is a text file that you can modify in any text editor, and can be found in the root up.time installation directory. In addition to the up.time database, uptime.conf parameters affect a variety of up.time behavior. Not all of the settings listed in this section will necessarily be found in your part
269. erting features, the monitoring periods when alerts can happen, as well as the configuration of post-alert actions. Understanding Alerts, 378; Alert Profiles, 381; Working with Custom Alert Formats, 385; Action Profiles, 389; Monitoring Periods, 397. Understanding Alerts: When a problem occurs at a Datacenter, Application, or SLA, the Monitoring Station can send alerts to users. Alerts are notifications that inform users who are configured to receive alerts of the problem. The notification message contains the following information: the type of notification, either Problem or Recovery; the date and time when the problem occurred; the name of the host on which the problem occurred; the status of the host (see Understanding the Status of Services on page 21 for more information); the name of the service that is experiencing the problem; the current state of the service; any output from the monitor. Whenever the status of an Element changes (for example, from Critical to Warning), up.time sends an alert. You can also configure alert escalations that occur if a warning is sent and is not acted upon. For example, if an alert is sent to a system administrator and the administrator d
270. ervletApp.ear, where install_root is the directory under which WebSphere is installed. 2. Copy the file perfServletApp.ear to the directory in which your Web application is installed. For example: install_root/installedApps/<cell_name>/DefaultApplication.ear/DefaultApplication.war/WEB-INF/classes, where install_root is the directory under which WebSphere is installed and <cell_name> is the name of the WebSphere node under which your Web application is installed. Deploying the Performance Servlet on WebSphere 6: If you are using WebSphere Application Server version 6, you will need to change two settings in the WebSphere management console to avoid an Access Denied error when up.time attempts to connect to the performance servlet to collect metrics. To make the changes, do the following: 1. In the WebSphere management console, modify the following settings: under Security > Secure administration, applications, and infrastructure, turn Application Security on; under Enterprise Applications > perfServletApp > Security role to user/group mapping, turn Everyone off. 2. Restart the server. up.time should now be able to connect to the servlet and gather performance metrics. Configuring WebSphere Monitors: To configure a WebSphere monitor, do the following: 1. On the WebSphere monitor template, complete the monitor information fields. To learn how t
271. es a critical alert. Network Bandwidth Warning Threshold: The amount of network traffic in and out of the server, measured in megabits per second (Mbit/s), that must be exceeded before up.time issues a warning. Network Bandwidth Critical Threshold: The amount of network traffic in and out of the server, measured in megabits per second (Mbit/s), that must be exceeded before up.time issues a critical alert. Disk Usage Warning Threshold: The amount of data being written to the server's hard disk, measured in kilobytes per second (kB/s), that must be exceeded before up.time issues a warning. Disk Usage Critical Threshold: The amount of data being written to the server's hard disk, measured in kilobytes per second (kB/s), that must be exceeded before up.time issues a critical alert. Memory Usage Warning Threshold: The amount of overall system memory, measured in megabytes (MB), that must be exceeded before up.time issues a warning. Memory Usage Critical Threshold: The amount of overall system memory, measured in megabytes (MB), that must be exceeded before up.time issues a critical alert. Percent Ready Warning Threshold: The percentage of time that one or more instances running on an ESX server is ready to run, but cannot run because it cannot access the processor on the ESX server. If the value returned from the server exceeds this threshold, then up.time issu
272. es a warning. Percent Ready Critical Threshold: The percentage of time that one or more instances running on an ESX server is ready to run, but cannot run because it cannot access the processor on the ESX server. If the value returned from the server exceeds this threshold, then up.time issues a critical alert. Percent Used Warning Threshold: The percentage of CPU time that an instance running on an ESX server is using. If the value returned from the server exceeds this threshold, then up.time issues a warning. Percent Used Critical Threshold: The percentage of CPU time that an instance running on an ESX server is using. If the value returned from the server exceeds this threshold, then up.time issues a critical alert. For more information about setting thresholds, see Configuring Warning and Critical Thresholds on page 144. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148); Alert Settings (see Monitor Alert Settings on page 148); Monitoring Period settings (see Monitor Timing Settings on page 146); Alert Profile settings (see Alert Profiles on page 381); Action Profile settings (see Action Profiles on page 389). Click Finish.
273. es how encrypted information travelling between the SNMP instance and up.time will be authenticated. MD5: A widely used method for creating digital signatures that are used to authenticate and verify the integrity of data. SHA: A secure method of creating digital signatures. SHA is considered the successor of MD5 and is widely used with network and Internet data transfer protocols. Ensure that the authentication method you select in up.time matches the method that is used by the system you want to monitor. v3 Auth Password: The password that is required to connect to an SNMP instance that is using version 3 of SNMP. v3 Privacy Method: If the server uses version 3 of SNMP, select one of the following options from the list. The option that you select determines how information travelling between the SNMP instance and up.time will be encrypted. DES: An older method used to encrypt information. DES is considered weak compared to more modern encryption methods. AES: The successor to DES, which is used with a variety of software that requires encryption, including SSL servers. Ensure that the privacy method that you select in up.time matches the method that is used by the system you want to monitor. v3 Privacy Password: The password that will be used to encrypt inform
274. essary, edit the details of the system as described in the section Adding Systems or Network Devices on page 69. Otherwise, click Save in the Add System/Network Device window. Repeat steps 4 and 5 for any other systems that you want to add. Using Auto Discovery to Add ESX Systems: Virtual Infrastructure 3 (VI3), also called Virtual Center, is a software suite that manages multiple physical VMware ESX 3 servers. The latest version, which supports ESX 4, is known as vSphere 4 vCenter. You cannot directly add VI3 or vCenter systems to up.time; you can, however, use the Auto Discovery feature to point up.time to a VI3 or vSphere 4 system, then add any or all of the ESX servers it is managing. To use Auto Discovery to add ESX systems, do the following: 1. In the My Infrastructure panel, click Auto Discovery. The Auto Discovery window appears. 2. Click the ESX Discovery option. 3. Complete the following fields: Virtual Center Host Name: The name of the VI3 system. User Name: The user name required to log into the VI3 system. Password: The password required to log into the VI3 system. 4. Click Continue. up.time returns a list of the ESX servers that are being managed by the VI3 or vSphere 4 system. 5. Click the Add button beside the system that you want to add. The Add System/Network Device window appears. 6. If necessary, edit the details of the system as descri
275. esystems over full field. This value is expressed as a percentage. The report displays the information for file systems whose used disk space is less than or equal to the amount you enter in this field. For example, if you set this field to 45, the report only displays file systems whose percentage-used values are less than or equal to 45. Click the Show totals for each system only checkbox to report only on the total amount by which all file systems on all disks or drives have grown, rather than displaying amounts for each file system. If you want to generate reports for systems in specific groups, select the groups from the List of Groups area. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. If you are generating reports for specific systems in your environment, select them from the List of Systems. Select a report generation option. See Report Generation Options on page 402 for details. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. Server Virtualization Report: Many organizations have a number of production servers that are not being used to their full capacity. For example, a server could be running one or two applicat
276. etrieval. The IMAP Email Retrieval monitor confirms whether an IMAP server is doing the following: listening on a defined port; running on a defined system or on a group of systems; using a particular version. Configuring IMAP Email Retrieval Monitors: To configure IMAP Email Retrieval monitors, do the following: 1. In the IMAP Email Retrieval monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. Complete the following fields. Port: The number of the port on which IMAP is listening. The default is 143. If you are applying a monitor to a service group, ensure that all of the systems use the defined port. Otherwise, create a monitor for each IMAP instance that listens on a different port. For information on service groups, see Service Groups on page 153. Server Response: Select a comparison method, and then enter the Warning and Critical thresholds for the server response. For more information, see Configuring Warning and Critical Thresholds on page 144. The server response is the same for Windows, UNIX, and Linux. For example, an expected response is: * OK [CAPABILITY IMAP4REV1 LOGIN-REFERRALS STARTTLS AUTH=LOGIN] filter IMAP4rev1 2002.336 at Thu, 2 Jun 2005 10:55:02 -0400 (EDT)
277. ettings (see Adding Monitor Timing Settings Information on page 148); Alert Settings (see Monitor Alert Settings on page 148); Monitoring Period settings (see Monitor Timing Settings on page 146); Alert Profile settings (see Alert Profiles on page 381); Action Profile settings (see Action Profiles on page 389). 5. Click Finish. SNMP: Simple Network Management Protocol (SNMP) is a widely used protocol that monitors the health of computer and network equipment. The SNMP monitor enables you to query SNMP devices or systems for a given object identifier (OID) of an SNMP Management Information Base (MIB). A MIB is a listing that defines variables needed by the SNMP protocol to monitor and control network equipment. The OIDs identify the managed variables in a system. Each OID is represented by a set of numbers separated by periods, for example .1.3.6.1.2.1.1.1.0. The period at the start of an OID indicates that the name of the OID begins at the root of its associated MIB. However, each object is also assigned a unique name (for example, sysObjectID) that makes it easier to identify that object. The SNMP monitor enables you to compare the response to a specific pattern. If the device is protected by a community password, you can specify the password
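To make the OID notation in this excerpt concrete, the following sketch splits a dotted OID into its numeric components, accepting the optional leading period that marks the root of the MIB. The function name parse_oid is hypothetical; it is a plain string parser and does not query any SNMP device.

```python
from typing import Tuple


def parse_oid(oid: str) -> Tuple[int, ...]:
    """Split a dotted OID such as .1.3.6.1.2.1.1.1.0 into integers.

    Hypothetical helper for illustration only.
    """
    text = oid.strip()
    # A leading period means the OID starts at the root; drop it before splitting.
    if text.startswith("."):
        text = text[1:]
    parts = text.split(".")
    # Every component must be a non-empty run of ASCII digits.
    if not all(part.isascii() and part.isdigit() for part in parts):
        raise ValueError(f"not a numeric OID: {oid!r}")
    return tuple(int(part) for part in parts)


if __name__ == "__main__":
    print(parse_oid(".1.3.6.1.2.1.1.1.0"))
```

Empty components, as in 1..3, and non-numeric names are rejected with a ValueError rather than silently skipped.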
278. eval, 305; Configuring POP Email Retrieval Monitors, 305; SSH (Secure Shell), 307; Configuring SSH (Secure Shell) Monitors, 307; SMTP (Email Delivery), 309; Configuring SMTP (Email Delivery) Monitors, 309; SNMP, 311; [illegible entry], 311; SNMP MIB Browser, 312; Supported Versions of SNMP, 312; Using the SNMP MIB Browser, 312; Configuring SNMP Monitors, 315; TCP, 318; Configuring TCP Monitors, 318. Advanced Monitors: Overview, 322; Before You Begin, 323; Custom Monitors, 324; Configuring Custom Monitors, 324; Custom with Retained Data, 326; Configuring Custom Monitors with Retained Data, 326; External Check, 328; Configuring
279. f systems to add, you can copy and paste an entry and modify the fields as needed. If you keep a list of all the systems in your environment in a spreadsheet, you can save the list as a text file or a comma-separated values (.csv) file. Then, you can write a script that can manipulate the text or .csv file into the proper format. Fields in the Hosts File: The following table explains the fields that you can include in the hosts file. The fields that are needed to add a system will vary depending on the type of system that you want to add. For example, to add an agent system, you only need to include the Host Name, Type, and Port fields. See Working with Systems on page 67 for more information. Field / Description. Host Name: The name or the IP address of the system that you want to add to up.time. Display Name: The name for the system that will appear in the up.time Web interface. Description: A short description of the system. This field is optional. Type: The type of system, which can be one of the following: Agent; Node; Novell NRM; Net-SNMP v2; Net-SNMP v3; pSeries LPAR Server (HMC); Virtual Node; WMI Agentless. Service Group: The name of the up.time service group which enables you to s
280. f the connection pool, the number of connections that are waiting or delayed, and the number of failures to reconnect to the server. Enterprise Beans: The report charts the number of Enterprise Java Beans (EJBs) that are active or have been moved to secondary storage, the number of times that a container can and cannot find an EJB in the cache, as well as the total number of EJBs in the cache. This report returns information for: Stateful EJBs, which hold data for a client between calls to the EJB (stateful EJBs can use a considerable amount of server resources); Stateless EJBs, which hold data for only one call to the EJB and then delete that data (stateless EJBs use fewer system resources than stateful EJBs). JVM Runtime: The report charts the heap size, in kilobytes, of the Java Virtual Machine (JVM) on the WebLogic server, as well as the amount of memory, in kilobytes, available to the JVM. Transaction Manager: The report charts the number of transactions that were committed or completed successfully, as well as the total number of transactions that were rolled back. Servlets: The report charts the number of requests that were made to the HTTP servlets that are running on the WebLogic server. Optionally, click Select All Options to use all of the options that are listed above. 4. If you want to generate reports for groups of systems, select the groups from t
281. f your ESX infrastructure. To accomplish this, virtual machine counts are tracked and reported on a daily basis, where the peak VM count for a given day is used as that day's tally. The information available in the report includes the following. Virtual Infrastructure Density: The total number of virtual machines in relation to the total number of ESX servers over a given time period. A trend line is mapped onto the totals, indicating whether VM counts and corresponding workloads are increasing or decreasing in relation to available ESX server capacity. Total Virtual Machine Count: The total number of virtual machines running on all, or a group of, ESX servers. The VM totals are separated into individual ESX server totals. ESX Server Virtual Machine Count: The total number of virtual machines running on a specific ESX server. Using this report, you can have a better understanding of virtualized workloads by seeing ESX server use and trends, and quantifying VM creation overall and on a server-by-server basis. Creating a VMware Infrastructure Density Report: To create a VMware Infrastructure Density report, do the following: 1. In the Reports Tree panel, click VMware Infrastructure Density. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. 3. In th
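The daily tally described in this excerpt (the peak VM count for a day stands in for that day) and the density figure (VMs in relation to ESX servers) can be sketched as follows. The function names, the sample layout, and the use of a plain ratio for density are assumptions made for illustration; the guide does not state how up.time computes the value internally.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Tuple


def daily_peak_vm_counts(samples: Iterable[Tuple[date, int]]) -> Dict[date, int]:
    """Keep the highest VM count observed on each day (hypothetical helper)."""
    peaks: Dict[date, int] = defaultdict(int)
    for day, vm_count in samples:
        # The peak for a day is the largest count seen among that day's samples.
        peaks[day] = max(peaks[day], vm_count)
    return dict(peaks)


def infrastructure_density(total_vms: int, esx_servers: int) -> float:
    """Express VMs per ESX server as a simple ratio (an assumed definition)."""
    if esx_servers < 1:
        raise ValueError("esx_servers must be at least 1")
    return total_vms / esx_servers


if __name__ == "__main__":
    samples = [(date(2010, 1, 1), 10), (date(2010, 1, 1), 14), (date(2010, 1, 2), 12)]
    for day, peak in sorted(daily_peak_vm_counts(samples).items()):
        print(day, peak, infrastructure_density(peak, 4))
```

The sample dates and counts are invented for the example; a rising ratio over successive days corresponds to the upward trend line the report describes.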
282. figure monitor information fields, see Monitor Identification on page 141. Even though the Web application performance is not directly tied to an Element's performance, making this selection is still required: the service based on this monitor needs to be associated with an Element in order to be viewed in areas such as Global Scan or My Infrastructure. 8. Configure the Web Application Transaction Settings. Script to play back: If desired, optimize the playback script (e.g., remove extraneous URLs such as image downloads). Text that must appear: Enter a text string that can be used to confirm the script playback was successful (e.g., a phrase that appears on the final page of the application). If the monitor does not find this text, its status changes to Critical. By providing mandatory text, you can ensure an alert is triggered in cases where a Web application is malfunctioning but checkpoint-to-checkpoint times are fast enough to fulfill response time requirements. Text that must not appear: Enter a text string that should not appear at any point during the script playback (e.g., a client or server error HTTP status code). If the monitor finds this text, its status changes to Critical. Use this feature as you would use mandatory text, to ensure a malfunctioning application triggers an alert. User Agent String: Select the Web browser and version used to record the script. This selection determines the user agent string use
283. g Save switches the authentication source to the LDAP directory. Administrators still need to create profiles for all up.time users, but will not need to set a password for each one. See Adding Users on page 337 for more information. Defining LDAP Synchronization Mapping: Before synchronizing user details, a populated up.time group must already exist in the LDAP directory; you will also need to know its distinguished group name, as it will be required during configuration. Note that all DataStore-based user profiles will be deleted when you switch to an LDAP directory for synchronization; a list of affected users will be displayed during configuration. Before continuing, you should ensure your up.time users are also in the LDAP directory. To configure user detail synchronization from the Active Directory list, do the following: 1. Click Edit Configuration to open the User Authentication Configuration pop-up window. 2. Select the Synchronization Enabled check box. All user synchronization configuration options appear. 3. In the Synchronize Users field, enter the frequency at which up.time user information will be synchronized with the LDAP listing. By default, synchronization occurs every hour. 4. In the LDAP Group Distinguished Name field, enter the name of the LDAP group of up.time users (e.g., CN=uptime users,CN=Groups,DC=yourdomain,DC=com). 5. If requi
284. g and Critical Thresholds on page 144. 3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. 4. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148); Alert Settings (see Monitor Alert Settings on page 148); Monitoring Period settings (see Monitor Timing Settings on page 146); Alert Profile settings (see Alert Profiles on page 381); Action Profile settings (see Action Profiles on page 389). 5. Click Finish. CHAPTER 13: Network Service Monitors. The network service monitors track the health and performance of the following: DNS, 280; [illegible entry], 283; HTTP (Web Services), 285; IMAP (Email Retrieval), 289; LDAP, 291; NFS, 295; NIS/YP, 297; NNTP (Network News), 299; [illegible entry]
285. g and Critical threshold values that you have set to the values that up.time captures. up.time issues an alert when these thresholds are exceeded. You choose a comparison method from the Select a comparison method dropdown list, as shown below. [Figure: Response time comparison method dropdown with Warning and Critical fields] After selecting a comparison method, you enter a value in the field beside or below the dropdown list. The following are the available comparison methods. exactly matches: The string returned by the monitor exactly matches the string that you defined. does not match: The string returned by the monitor does not match the string that you defined. regular expression: The string returned by the monitor matches the pattern of a regular expression that you define. inverse regular expression: up.time accepts any patterns that do not correspond to the regular expression you define. For example, if creating a service monitor for your Leech and Microsoft IIS FTP servers, you may want to ensure any message from them includes the FTP server name as part of the standard response. In this case, you can enter the following expression: Leech|Microsoft. A missing name means a server may have been compromised or is not working correctly, in which case up.time would generate a critical alert contai
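The four string comparison methods listed in this excerpt can be approximated with a small dispatcher. This is a sketch, not up.time's implementation: the function name compare is hypothetical, and treating a regular expression as matching when the pattern is found anywhere in the response (re.search) is an assumption, since the guide does not say whether the whole string must match.

```python
import re


def compare(method: str, returned: str, expected: str) -> bool:
    """Evaluate one of the four comparison methods named in the excerpt.

    Hypothetical helper; regular expressions are applied with re.search.
    """
    if method == "exactly matches":
        return returned == expected
    if method == "does not match":
        return returned != expected
    if method == "regular expression":
        return re.search(expected, returned) is not None
    if method == "inverse regular expression":
        return re.search(expected, returned) is None
    raise ValueError(f"unknown comparison method: {method!r}")


if __name__ == "__main__":
    # Assumed reading of the FTP example: an alternation of the two server names.
    pattern = "Leech|Microsoft"
    print(compare("regular expression", "220 Microsoft FTP Service", pattern))
    print(compare("inverse regular expression", "220 service ready", pattern))
```

The FTP banner strings are invented for the example. Under the inverse method, a response that lacks both server names satisfies the comparison, which is the case the excerpt associates with a compromised or misbehaving server.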
286. g and Managing up.time: up.time Diagnosis. OFF. The default setting, DEBUG, essentially logs all system event types. To reduce the number of log entries, you can limit logging to events with a higher level of severity, from INFO to FATAL. Note that each severity level is a subset of higher levels (e.g., setting loggingLevel to WARN means any WARN, ERROR, or FATAL level events are written to the log). Logging is configured through the following uptime.conf parameter: loggingLevel=DEBUG. Audit Logs: up.time can record changes to the application's configuration in an audit log. The details of the configuration changes are saved in the audit log file, which is found in the logs directory. There are many uses for the audit log. For example, you can use the audit log to track changes to your up.time environment for compliance with your security or local policies. You can also use the audit log to debug problems that may have been introduced into your up.time installation by a specific configuration change; the audit log enables you to determine who made the change and when it took effect. The following is an example of an audit log entry: 2006-02-23 12:28:20,082 - kdawg: ADDSYSTEM [cfgcheck=true, port=9998, number=1, use-ssl=false, systemType=1, hostname=10.1.1.241, displayName=MailMain, systemSystemGroup=1, serviceGroup=, description=, systemSubtype=1]. Audit Logging is enabled or disabled with yes o
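The severity rule in this excerpt (setting loggingLevel to WARN records WARN, ERROR, and FATAL events) amounts to taking a suffix of an ordered list. The sketch below shows that reading; the names SEVERITY_ORDER and levels_logged are hypothetical, and treating OFF as logging nothing is an assumption, since the excerpt names OFF without describing it.

```python
from typing import List

# Severity levels named in the excerpt, from least to most severe.
SEVERITY_ORDER: List[str] = ["DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def levels_logged(logging_level: str) -> List[str]:
    """Return the event levels written to the log for a given loggingLevel.

    Hypothetical helper for illustration only.
    """
    level = logging_level.upper()
    # Assumption: OFF suppresses all log entries.
    if level == "OFF":
        return []
    if level not in SEVERITY_ORDER:
        raise ValueError(f"unknown logging level: {logging_level!r}")
    # Everything at or above the configured severity is logged.
    return SEVERITY_ORDER[SEVERITY_ORDER.index(level):]


if __name__ == "__main__":
    print(levels_logged("WARN"))
```

With DEBUG, the least severe level, the function returns the full list, which matches the excerpt's remark that DEBUG essentially logs all event types.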
287. ge, review the installation options and then do one of the following: type back and then press Enter to change any of the settings; or press Enter to begin the installation process. The installation process will take several minutes. 11. When the software is installed, press Enter. The following occurs: the Web server, DataStore, and Data Collector are installed; the Web server and DataStore are started; the DataStore is populated with default data; the Data Collector is started. 12. On the Install Complete page, press Enter. It can take up to a minute for the up.time services to start. Wait before attempting to log into the Monitoring Station. Installing the Monitoring Station as a Virtual Appliance: To install the up.time Monitoring Station as a Virtual Appliance, do the following: 1. In the Virtual Infrastructure Client, start the procedure to import a virtual appliance. Select the Import from file option and locate the up.time .ovf file you downloaded from the uptime software web site. Click Next. After viewing the Virtual Appliance Details, click Next. On the License Agreement screen, review the up.time end user license agreement, click the Accept all license option, then click Next. Provide configuration information for the install: the name and location of the up tim
288. gical Dependency. The Add Topological Dependency window appears. Select a system from the Select a host to create dependencies for dropdown list. This host acts as the parent for the dependent systems or nodes. If up.time cannot communicate with the host, then the service monitors that check the dependent systems or nodes will not run host checks. Click Continue. Select one or more systems or nodes from the Available Dependent Hosts dropdown list. These systems or nodes will be the dependents of the host system that you specified in step 3. Optionally, select one or more entity groups from the Available Dependent Groups dropdown list. These groups will be the dependents of the host system that you specified in step 3. Click Finish. Viewing Topological Dependencies. To view topological dependencies, do the following: 1. On the up.time tool bar, click Services. 2. In the Tree panel, click View Topological Dependencies. The subpanel displays the following dependency information: • name of the parent • the number of dependent hosts • the number of dependent groups, if any. Scheduling Maintenance. Scheduled maintenance is a period during which the Monitoring Station does not monitor a host or service. You can schedule maintenance if, for example, you back up a system at a specific time each day or week, or if a system must be taken down for an upgrade. W
289. h system. Major and minor versions of up.time agents are shown in the following diagram. [Diagram: Windows; Monitoring Station] Major version: Regardless of operating system platform, the major version is the number to the left of the decimal. In the diagram above, the major number of the Windows agent is 3, the major number of the UNIX agent is 3, and the major number of the Linux agent is 4. Minor version: Minor version numbers follow the major version number. These numbers are used to distinguish each minor version of a major version. On UNIX and Linux, the minor version is the first number to the right of the decimal. In the diagram above, the minor version number of the UNIX agent is 8 and the minor version number of the Linux agent is 0. On Windows, the minor version is the last set of numbers in the complete version. In the diagram above, the minor version number of the Windows agent is 1061. For major version 4 and later for Windows, the minor version number is the number immediately after the decimal that follows the major number. For example, for Windows agent version 4.0, the minor number is 0. Understanding the up.time DataStore. The DataStore is a database in which up.time stores different types of information: configuration information for up.time
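The major and minor version rules in this excerpt can be expressed as a short Python sketch. It is a reading of the prose, not up.time code: the function name is hypothetical, and the dotted form of the pre-4 Windows version string ("3.7.2.1061") is an assumption, since the diagram that showed the real string is not reproduced here.

```python
def split_agent_version(platform, version):
    """Split an agent version string into (major, minor) per the rules in the text."""
    parts = version.split(".")
    major = int(parts[0])  # number to the left of the first decimal
    if platform.lower() == "windows" and major < 4:
        # Windows agents before version 4: minor is the last set of numbers.
        minor = int(parts[-1])
    else:
        # UNIX, Linux, and Windows 4 or later: first number after the decimal.
        minor = int(parts[1])
    return major, minor

print(split_agent_version("unix", "3.8"))            # (3, 8)
print(split_agent_version("linux", "4.0"))           # (4, 0)
print(split_agent_version("windows", "4.0"))         # (4, 0)
print(split_agent_version("windows", "3.7.2.1061"))  # hypothetical string: (3, 1061)
```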
290. hboard Tabs on page 562. CHAPTER 6: Defining and Managing Your Infrastructure. This chapter explains the My Infrastructure panel in the following sections: Overview; Working with Systems; Working with Applications; Working with SLAs; Working with Groups; Working with Views; Deleting Elements, Applications, and Views; Acknowledging Alerts. Overview. The My Infrastructure panel is your starting point for monitoring the systems in your environment. From the My Infrastructure panel, you can add: • systems or network devices • Applications, which provide the overall status for one or more services • service level agreements, which measure compliance to infrastructure performance goals • groups, which are sets of systems or devices that have been combined in a meaningful way • views, which enable non-administrative users to view only the systems in which they are interested
291. he 28. Examples: Every month on the 28th; Every month on the 28 6 PM 11 PM; Every month on the 28th 6PM 11PM. Monthly ordinal recurrence template: every month on the <ordinal as word> <day> <time range>. Basic example: Every month on the last Fri. Time ranges are optional: Every month on the last Fri 6 PM 11 PM; Every month on the last Fri 6PM 11PM. Note: The ordinal must be stated as a word: first, second, third, fourth, and last. Combining Expressions and Excluding Time Periods. Elaborate time period definitions are built from a combination of the basic expressions defined in the previous section: • fixed dates • fixed date ranges • weekly recurrences • monthly recurrences • monthly ordinal recurrences • yearly recurrences. Combinations. Combine basic expressions by writing each one on a new line in the Definition box when defining a Maintenance Profile or Monitoring Period. The following examples demonstrate combinations of different basic expressions used to define a maintenance window. Combining fixed dates: Dec 25 2008 12AM 12PM; Jan 1 2009 12AM 12PM. Combining a fixed date and a fixed date range: Dec 25 2008 12AM 12PM; From Dec 31 2008 11PM to Jan 1 2009 12PM. Combining weekly recurrences: Mon Fri 1AM 3AM; Sat 1AM 5 30AM; Sun. Combining yearly recurrences: Every Dec 25 12AM 12PM; Every
292. he List of Groups area. 5. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. 6. If you are generating reports for specific systems in your environment, select them from the List of Systems and Nodes. 7. Select a report generation option. See Report Generation Options on page 402 for details. 8. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. Using the WebLogic Report. Since WebLogic is large and complex, it can be difficult to pinpoint the source of a problem with the server or an application running on the server. This is especially true when that problem is intermittent. Watching for problems in real time only gives you a snapshot of the problem. The up.time WebLogic report, on the other hand, gives you a detailed historical perspective of the problem. Using the information in the report, you can find the source of the problem. For example, suppose users have trouble logging into an application that is running on the WebLogic server. Checking the Connection Pool charts section of a WebLogic report, you might see that the size of the connection pool has reached its
293. he WebLogic server's JVM heap. • HeapFreeCurrent: The current amount of free memory, in bytes, that is in the WebLogic server's JVM heap. • OpenSocketsCurrentCount: The current number of sockets on the server that are open and receiving requests. • AcceptBacklog: The number of requests that are waiting for a TCP connection. • ExecuteThreadCurrentIdleCount: The number of threads in the server's execution queue that are idle or which are not being used to process data. • PendingRequestCurrentCount: The number of pending requests that are in the server's execution queue. • TransactionCommittedTotalCount: The total number of transactions that have been processed by the WebLogic server. • TransactionRolledBackTotalCount: The total number of transactions that have been rolled back. • InvocationTotalCount: The total number of times that a servlet running on the WebLogic server was invoked. Before you can use the WebLogic monitors, you must perform additional steps outside of up.time. The steps performed depend on the version of your WebLogic server: • WebLogic 8 monitoring requires that you deploy the weblogic.jar file on the up.time Monitoring Station. • WebLogic 9 or 10 monitoring requires that you enable the Internet Inter-ORB Protocol (IIOP) on your WebLogic server. Monitoring We
294. he first user to log into up.time should be the system administrator. While the administrator account has the default user name admin, you will have to set the password and email address for the administrator account. You will only need to do this the first time that you log into up.time. To set up the administrator account, do the following: 1. Enter the following in the address bar of a Web browser: http://<uptime_hostname>:<port>, where <uptime_hostname> is the name or IP address of the server that is hosting the Enterprise Monitoring Station. For example: http://localhost:9999 The up.time log in window opens in a Web browser. 2. Enter the password for the administrator in the Password field. 3. Re-enter the password in the Confirm Password field. 4. Enter your email address in the Administrator's Email field. 5. Click the Login button. Accessing up.time. Once an administrator sets up your up.time account, you can navigate and log in to the Enterprise Monitoring Station. To start up.time, do the following: 1. Start a Web browser. 2. Enter the following in the address bar of the Web browser: http://<uptime_hostname>:<port>, where <uptime_hostname> is the name or IP address of the server that is hosting the Enterprise Monitoring Station. The up.time log in window opens in the Web browser. 3. Enter your assigned user nam
295. he monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: • Event Log Type: Choose one of the following types of event log to search: • Application: A log that records events generated by programs running on the server. • System: A log that records the activity of various components of the operating system. • Security: A log that records events such as login attempts and attempts to access files. • Other: A custom or external log whose name will be defined in the next step. • Other Windows Log to Search: When the Other event log type is selected in the previous step, this field appears. Enter the name of an additional Windows event log that you want this service monitor to use. This log may accompany an application platform you are running, or could be a custom log; regardless, the name you provide should match the name that appears in the Windows Event Viewer. • Match event type with: The type of event to search for, which can be one of the following: • Information: Describes the successful completion of a task. • Warning: Indicates that a problem may occur in the future. • Error: A problem, which may involve the loss of data or system integrity, has occurred. • Success Audit: Found in the Security log, this describes the successf
296. he monitoring station. If you have multiple license keys for the Software, each key will be activated on a designated server. For purposes of this Agreement, use of the Software means loading the Software into the temporary or permanent memory of a computer. The Software may not be used on, or distributed to, a greater number of servers than you have license keys. There is no restriction on the number of users who may access the designated servers and use the Software. 1.3 Copies and Modifications. You may not reverse engineer, decompile, disassemble, or otherwise translate the Software, or attempt to derive the source code of the Software or any license keys you have obtained. You may not modify or adapt the Software or any license keys that you have obtained in any way. You may make one (1) copy of the Software, the Documentation, and any license keys that you have obtained, solely for backup or archival purposes. Any such copies of the Software, Documentation, or license keys shall include any copyright or other proprietary notices that were included on such materials when you first received them. Except as authorized in this Section 1.3, no copies of the Software, Documentation, or license keys, or any part thereof, may be made by you or any person under your authority or control. 1.4 Assignment of Rights. The license granted under this Agreement is personal to you. You are not permitted to grant access to, distribute, sell, transfer, publish, dis
297. he name of the server could follow the smtp.<domain_name> convention, or could be its host name or IP address. Optionally, enter the port used by the mail server in the SMTP Port field. In the SMTP Sender field, enter the email address that up.time uses to send alert notifications and reports. This value was set the first time the up.time administrator logged in after installation, and should be set to your domain (e.g., admin@mail.uptimesoftware.com). A sender's name can be encapsulated with double quotes, in which case the email address is encapsulated with angled brackets: "uptime administrator" <admin@uptimesoftware.com>. In the SMTP Helo String field, enter the string that identifies the domain from which a message is being sent. For example: uptimesoftware.com. In the SMTP User field, enter the user name that is used to authenticate connections with the SMTP server. In the SMTP Password field, enter the password that is used to authenticate connections. Click Save. The edit window closes and you are returned to the Mail Server Configuration panel. To test the mail server configuration, click the Test Configuration button. The Monitoring Station will try to send an email message containing the configuration information to the email address of the up.time administrator. If an error message appears in the subpanel, edit and then re-test the configuration.
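The SMTP Sender convention in this excerpt (an optional display name in double quotes, followed by the address in angle brackets) can be illustrated with a small Python sketch. The helper format_sender is hypothetical; parseaddr from the standard library is used only to show that the result splits back into a name and an address.

```python
from email.utils import parseaddr

def format_sender(email_address, display_name=None):
    """Build an SMTP Sender value in the form described in the text."""
    if display_name:
        # Name in double quotes, address in angle brackets.
        return f'"{display_name}" <{email_address}>'
    return email_address

sender = format_sender("admin@uptimesoftware.com", "uptime administrator")
print(sender)             # "uptime administrator" <admin@uptimesoftware.com>
print(parseaddr(sender))  # ('uptime administrator', 'admin@uptimesoftware.com')
```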
298. he notification whose information you want to edit, and then click Edit Notification Group on the Notification Group Information page. The Edit Notification Group window appears. 2. Edit the group as described in Adding Notification Groups on page 347. Changing How Users Are Authenticated. By default, user management and authentication is based entirely in up.time: a profile for a User is created in up.time, and all profile information is kept in the DataStore. up.time user lists exist, and are maintained, separately from any other user management framework your organization may be using. In light of this, you can elect to use Active Directory or an LDAP-based service for authentication and user detail synchronization. If you configure up.time to authenticate users against a central AD or LDAP directory, password entry on login will refer to that directory instead of the DataStore. Additionally, if you choose to synchronize specific user attributes (e.g., email address), the up.time user profiles will draw all information from the central directory instead of the DataStore. Both measures ensure up.time access is automatically kept in sync with the current access levels in your organization; up.time administrators do not have to manually update user access to match staffing changes. If user detail synchronization with Active Directory or LDAP is enabled y
299. heck on page 259 to check Oracle tablespaces. Configuring Oracle Basic Checks Monitors. To configure Oracle Basic Checks monitors, do the following: 1. In the Oracle Basic Checks monitor template, complete the monitor information fields. To learn about monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: • Port: The number of the port on which the Oracle service is listening. If you enter a value in the SID field, up.time can capture the port value from the SID of the Oracle instance. • Port Check (Optional): Select this option to open a socket connection that determines whether or not the database is listening on the defined port. Note: If you configured your database to allow logins with a user name and password, and you specify the script file but no login information, the script will fail. The script will run properly if you have configured your database to allow logins without a user name and password. • Username: The user name that is required to log in to the Oracle database. • Password: The password that is required to log in to the Oracle database. • SID: The Oracle System Identifier (SID) that identifies the Oracle instance. The SID defaults to the database name. If you enter a value in this field, up.time can capture the number of the port on which Oracle is listening
300. heck (Optional): Select this option to open a socket connection that determines whether or not the database is listening on the defined port. • Username: The user name that is required to log in to the database. • Password: The password that is required to log in to the database. • Database: The name of the Sybase database to which you want to connect. • Script: Click the Script checkbox and then type or copy the script that you want up.time to run against the database into this text box. Use this option if you do not have access to the file system on the Monitoring Station, or if your script is short or will not regularly change. • Script File: Click the Script File check box and then enter the full path on the Monitoring Station to the script that this monitor will run against the database. Note: If you configured your database to allow logins with a user name and password, and you specify the script file but no login information, the script will fail. The script will run if you have configured your database to allow logins without a user name and password. • Match Regular Expression: Enter a regular expression that you want to match against the string returned from the database. If the string matches, the status is OK. Otherwise, the status is Critical. • Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warnin
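The Match Regular Expression rule in this excerpt (a match yields OK, anything else yields Critical) can be sketched in Python as follows. The function name and sample strings are hypothetical, and the use of re.search (match anywhere in the string) is an assumption; the excerpt does not say whether up.time matches the whole string or only part of it.

```python
import re

def regex_status(pattern, returned_string):
    """Return 'OK' if the pattern matches the string returned by the script, else 'Critical'."""
    # Assumption: a match anywhere in the returned string counts as a match.
    return "OK" if re.search(pattern, returned_string) else "Critical"

print(regex_status(r"^\d+ rows", "42 rows"))           # OK
print(regex_status(r"^\d+ rows", "connection error"))  # Critical
```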
301. hen a host or service is scheduled for maintenance, the Monitoring Station assumes that the host or service cannot be contacted but does not issue an alert. If maintenance is not scheduled, then during those periods up.time will notify you that the system or service is unavailable. Creating Scheduled Maintenance Profiles. You can schedule maintenance using profiles. A scheduled Maintenance Profile is a template that enables you to define maintenance periods and then assign the profile to multiple systems. A profile is a recurring event, for example a backup cycle that occurs every Monday between 3 a.m. and 5 a.m. To create scheduled Maintenance Profiles, do the following: 1. On the up.time tool bar, click Services. 2. In the Tree panel, click Add Maintenance Profiles. 3. Enter time period expressions in the Definition field that together make up the maintenance window. 4. Enter a descriptive name for the profile in the Profile Name field. See Time Period Definitions on page 567 for information on the types of time period expressions that are valid in up.time. 5. Click Save. Viewing Scheduled Maintenance Profiles. You can view scheduled Maintenance Profiles to ensure that they meet your needs and that they are applied to the appropriate hosts and services. To view schedul
302. hes a critical state are highlighted in red and include the critical icon. The color coding also indicates whether an Application is offline or is in scheduled maintenance: • an Application that is offline is highlighted in red and marked by the offline icon, and a message indicating that the Application is offline appears in the Applications subpanel • an Application that is in scheduled maintenance is grayed out, the message System is in scheduled maintenance is displayed in the Applications subpanel, and the Application is marked with the scheduled maintenance icon. The Applications subpanel displays the status of each Application that you have added to up.time. This subpanel has two views: Condensed View and Detailed View. Condensed View. The following image illustrates the Condensed view of the View Applications subpanel. [Image: columns Application Status, Application Name, Description, Status of Master Services, Status of Regular Services] The Condensed view is the default view for this subpanel and displays the following information: • the name of the Application • a description of the Application, if one was added when the Application was defined • the status of each service in the Application. The status of the service is denoted by a colored bar in the Status of Master Services and
303. hether or not the service is a master service monitor. The Alert Profiles section of the subpanel displays which Alert Profiles have been associated with the Application. For information about viewing more details about Applications, see Viewing System and Service Information on page 50. Editing Applications. To edit an Application, do the following: 1. In the My Infrastructure panel, right-click the name of the Application that you want to modify, then click Edit. The Edit Application window appears. 2. Edit the Application settings as described in Adding Applications on page 101. Working with SLAs. In up.time, a service level agreement (SLA) measures your organization's ability to meet pre-defined performance goals. These goals focus on various aspects of your IT infrastructure, and each can include any number of monitored systems. From the My Infrastructure panel, you can view your existing SLA details by clicking the SLA name (see Viewing SLA Details on page 360 for more information). [Screenshot: Service Level Agreement General Information, with fields Name, Display Name, Description, Target Percentage, Monitoring Period]
304. highest values returned by the instances and then compares them to the thresholds that you set. If the values exceed the thresholds, up.time issues an alert. The monitor does not pinpoint the specific instance(s) that have exceeded the defined thresholds. For example, you are monitoring an ESX server that is running three instances. You configured the ESX Workload monitor to collect data samples every 10 minutes and to issue a warning when memory usage exceeds 300 MB. The three instances are using the following amounts of memory: 110 MB, 227 MB, and 315 MB. The ESX Workload monitor focuses on the value of 315 MB and, since it exceeds the warning threshold, issues an alert. Configuring ESX Workload Monitors. To configure an ESX Workload monitor, do the following: 1. Complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: • Time Interval: The interval, in minutes, at which the monitor will collect data samples from the ESX server. • CPU Warning Threshold: The amount of processor power, measured in megahertz (MHz), that the instances on the ESX server must consume before up.time issues a warning. • CPU Critical Threshold: The amount of processor power, measured in megahertz (MHz), that the instances on the ESX server must consume before up time issu
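The worked example in this excerpt (three instances using 110 MB, 227 MB, and 315 MB against a 300 MB warning threshold) can be reproduced with a minimal Python sketch. The function name, the optional critical threshold argument, and the status strings are assumptions made for illustration; only the "take the highest value and compare it to the threshold" logic comes from the text.

```python
def esx_memory_status(instance_values_mb, warning_mb, critical_mb=None):
    """Compare the highest per-instance value against the configured thresholds."""
    highest = max(instance_values_mb)  # the monitor focuses on the highest value
    if critical_mb is not None and highest > critical_mb:
        return highest, "CRITICAL"
    if highest > warning_mb:
        return highest, "WARNING"
    return highest, "OK"

print(esx_memory_status([110, 227, 315], warning_mb=300))  # (315, 'WARNING')
```

As the text notes, the result says nothing about which instance produced the highest value; only the maximum is compared.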
305. hive command with one or more of the following options: • -f <filename>: Imports a single file (i.e., an archive category's data for a single date). You must specify the full path to the file name. • -d <date>: Imports all files with the specified date, in YYYY-MM-DD format. • -D <directory>: The directory containing the archived files. Note that you must specify this option when using the -d option. • -c <directory>: The full directory path to the file uptime.conf. For example, the command restorearchive -d 2006-09-18 -D /usr/local/uptime/archives -c /usr/local/uptime imports all of the data archived on September 18, 2006, which is located in the default directory for archived data. Exporting and Importing the DataStore. In cases where you need to perform a wholesale backup of the existing DataStore (e.g., migrating your DataStore to another database), up.time includes two command line utilities: • fulldatabasedump: Creates a compressed XML file of the contents of your DataStore. • fulldatabaseimport: Imports the archived data back into your DataStore. Both utilities work with all of the databases that up.time supports. Archiving the DataStore. To archive your DataStore, do the following: 1. Shut down the up.time Data Collector service. 2. Navigate to the scripts folder under the direct
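The option rules for the restore command in this excerpt can be sketched as a small Python helper that assembles the argument list without running anything. Several details are assumptions: the leading dashes on the options and the slashes in the paths were lost in extraction and are reconstructed here, the command name is taken to be restorearchive, and the helper name is hypothetical.

```python
import shlex

def build_restorearchive_command(conf_dir, date=None, directory=None, filename=None):
    """Assemble a restorearchive argument list from the options described in the text."""
    if date is not None and directory is None:
        # The text states that -D must be given whenever -d is used.
        raise ValueError("-d requires -D (the directory containing the archived files)")
    args = ["restorearchive"]
    if filename is not None:
        args += ["-f", filename]   # full path to a single archive file
    if date is not None:
        args += ["-d", date]       # date in YYYY-MM-DD format
    if directory is not None:
        args += ["-D", directory]
    args += ["-c", conf_dir]       # directory that holds uptime.conf
    return args

cmd = build_restorearchive_command(
    "/usr/local/uptime", date="2006-09-18", directory="/usr/local/uptime/archives"
)
print(" ".join(shlex.quote(a) for a in cmd))
```

Building the list separately from executing it makes it easy to review the command before it touches the DataStore.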
306. hold Report • File System Service Time Summary Report. Enterprise CPU Utilization Report. The Enterprise CPU Utilization report enables you to compare the processing power of different types of systems in your environment. Performing this kind of comparison is difficult because different types of systems use different processors; for example, a Windows server uses an Intel processor while a Solaris server may use a SPARC processor. The benchmarks for measuring the power of each type of processor will be different. An Enterprise CPU Utilization report offers a quick snapshot of the overall performance of the servers in your environment. Based on the information in the report, you can then determine how best to optimize CPU capacity across your enterprise. up.time measures processing power using a statistic called power units. Power units are the number of CPUs on a system multiplied by the speed of the processors in MHz. For example, a Solaris server has four CPUs and each CPU runs at 168 MHz. The total number of power units for the server is 672 (4 x 168). If you compare this to a Windows server with one CPU running at 2900 MHz (2,900 power units), then you can conclude that the Windows server has more processing power. Enterprise CPU utilization is a percentage that is derived by dividing the total number of power units used by the total number of power units availabl
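The power unit arithmetic in this excerpt is simple enough to check with a short Python sketch. The function names are hypothetical, and the utilization example (336 of 672 units in use) is an invented input chosen only to show the division.

```python
def power_units(cpu_count, cpu_mhz):
    """Power units: number of CPUs multiplied by processor speed in MHz."""
    return cpu_count * cpu_mhz

def utilization_percent(used_units, available_units):
    """Utilization: power units used divided by power units available, as a percentage."""
    return 100.0 * used_units / available_units

solaris = power_units(4, 168)    # the Solaris example from the text
windows = power_units(1, 2900)   # the Windows example from the text
print(solaris, windows)                   # 672 2900
print(utilization_percent(336, solaris))  # 50.0 (illustrative input)
```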
307. [Screenshot residue: a status history listing for services such as PING, UPTIME, Platform Performance Gatherer, and Configuration Update Gatherer, showing transitions among UNKNOWN, OK, and CRIT with timestamps from April 2008.] For more information on what each status means, see Understanding the Status of Services on page 21. • Outages: Lists, in tabular format, the services that have suffered outages, along with the time at which the outage occurred. The Outages table is shown below. The Outages ta
308. i Sat Sun Mon • day ranges and lists can be mixed; the following examples are correct: • Fri Sun Mon • Fri Sun Mon. Basic Expressions. Using the building blocks outlined in the previous section, use the following templates to create basic expressions that are used to define time periods in up.time. Note that shaded components of a template are optional. Fixed Dates. Template: <month> <date> <year> <time range>. Basic example: Oct 28 2008. Spaces are optional: Oct28 2008. Time ranges are optional: Oct 28 2008 7 PM 11 PM; Oct28 20087PM 11PM. Note: Fixed dates that do not include a time range are interpreted to include the entire day (i.e., 12:00 a.m. through 11:59 p.m.), although this will not automatically appear in the defined time period. Fixed Date Ranges. Template: from <month> <date> <year> <time range> to <month> <date> <year> <time range>. Basic example: From Oct 28 2008 to Oct 29 2008. Spaces are optional: FromOct28 2008toOct29 2008. Time ranges are optional: From Oct 28 2008 7 PM to Oct 29 2008 2 AM. Note: A fixed date without a time that is at the end of a date range is interpreted to include the first minute of the next day; e.g., up.time converts From Oct 28 2008 to Oct 29 2008 into From
309. ice Metrics graph indicates delivery and retrieval times are not exceeding defined thresholds and up.time is not sending out critical alerts, it is still an ideal investigative starting point if you are getting critical feedback from your users about email delivery times. If the Email Delivery monitor's Service Metrics graph confirms that there are delays somewhere within your network infrastructure, you can investigate further by using the service monitor you created for your mail server. Co-ordinate your Email Delivery monitor's metrics graphs or reports with those from a service monitor you have assigned to your mail server (e.g., Exchange), while focusing on metrics that may be related to outgoing or incoming mail time delays. For example, in the Exchange service monitor metrics graph below, the mail server experienced a high SMTP Local Queue Length that did not always coincide with the SMTP Messages Per Second count. [Figure: Exchange service monitor Service Metrics graph]
310. ice Monitor, User Group, System Group. An Alert Profile can send an alert via email, to a pager or cell phone, or as a Windows popup alert. You can configure any or all of these actions to occur simultaneously. For example, if a Web server process stops responding, the system administrator can be notified. Enabling the Windows Messaging Service. In order to receive popup alerts from up.time, the Windows messaging service must be enabled on the recipient's computer. To enable the Windows messaging service, do the following: 1. In Windows, select Start > Control Panel. 2. In the Control Panel, double-click Administrative Tools and then double-click Services. The Services window appears. 3. Find and then double-click Messenger in the list of services. The Messenger Properties dialog box appears. 4. In the Messenger Properties dialog box, select Automatic from the Startup type dropdown list. 5. Click Apply. Creating Alert Profiles. To create Alert Profiles, do the following: 1. On the up.time tool bar, click Services. 2. In the Tree panel, click Add Alert Profile. The Add Alert Profile window appears. Type a descriptive name for the profile in the Name of Alert Profile field. In the Start alerting on notification number field, enter the number of times an error must occur before up.time sends an alert notification. Enter the number of times
311. icular uptime.conf file and Restarting up.time Services. In addition to the Web interface, the up.time Monitoring Station consists of the following services: • DataStore • Web server • Data Collector (also called the Core). These services run in the background and start automatically after the operating system on the server hosting up.time starts. However, system administrators may need to stop the up.time services, for example before making configuration changes to the uptime.conf file, performing an upgrade, or archiving the DataStore. Stopping the up.time Services. To stop the up.time services in Windows, do the following: 1. Select Start > Control Panel. 2. Double-click Administrative Tools and then double-click Services. 3. In the Services window, find each of the following entries and click Stop the service: • up.time Web Server • up.time Data Collector • up.time Data Store. To stop the up.time services on Solaris or Linux, do the following: 1. Log into the Monitoring Station as user root. 2. Type the following command to stop the Web server: /etc/init.d/uptime_httpd stop 3. Type the following command to stop the Data Collector: /etc/init.d/uptime_core stop 4. Type the following command to stop the database: /etc/init.d/uptime_datastore stop Starting the up.time Services. To restart the up.time services in Windows, do the following: 1. Select Start > Control Pa
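The Solaris and Linux stop sequence in this excerpt (Web server, then Data Collector, then DataStore) could be scripted along the following lines. This is a sketch under assumptions: the init script paths are reconstructed from text whose punctuation was lost, the function name is hypothetical, and the default is a dry run that only prints the commands.

```python
import subprocess

# Stop order as listed in the text: Web server, Data Collector, then the database.
# Paths are reconstructed from the extracted text and should be verified locally.
STOP_COMMANDS = [
    ["/etc/init.d/uptime_httpd", "stop"],
    ["/etc/init.d/uptime_core", "stop"],
    ["/etc/init.d/uptime_datastore", "stop"],
]

def stop_uptime_services(dry_run=True):
    """Print (or, with dry_run=False, run as root) the stop commands in order."""
    issued = []
    for command in STOP_COMMANDS:
        line = " ".join(command)
        if dry_run:
            print(line)
        else:
            subprocess.run(command, check=True)  # raises if a script fails
        issued.append(line)
    return issued

stop_uptime_services()
```

Keeping the dry run as the default avoids stopping a production Monitoring Station by accident while the paths are being confirmed.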
312. iewing 119 360 service monitor File System Capacity 167 Process Count Check 174 SQL Server Advanced 266 up time Agent 192 Service Monitor Availability report 460 Service Monitor Metrics report 425 Service Monitor Outages report 461 service monitors 135 adding alert settings 149 adding information 142 adding timing settings 148 advanced 138 321 external check 328 with retained data 326 agent File System Capacity 167 overview 166 Performance Check 170 Process Count Check 174 alert settings 148 application ESX Workload 217 Exchange 194 IIS 200 Live Splunk Listener 238 Splunk query 236 up time Agent 192 WebLogic 203 WebSphere 211 cloning 151 comparisons 143 database MySQL Advanced Metrics 244 MySQL Basic Checks 251 Oracle 256 Oracle Advanced 253 Oracle tablespace 259 SQL Server Advanced Metrics 266 SQL server Basic Checks 262 SQL Server Tablespace 270 Sybase 275 getting help 150 up time software identification 141 monitor settings 142 Monitoring Period 150 network DNS 280 FTP 283 HTTP 285 IMAP 289 LDAP 291 NFS 295 NIS YP 297 NNTP 299 Ping 303 POP 305 SMTP 309 SNMP 311 SSH 307 TCP 318 overview 17 136 template 141 testing 152 timing settings 146 types 137 Windows Active Directory 187 Event Log Scanner 178 Service Check 182 SMB 185 services filtering 58 service groups 20 starting 531 stopping 530 viewing 129 Services panel 8 SMB Check monitor 185 SMTP monitor 309 SNMP 311 MIB Browse
313. ilar privileges. These privileges enable the members of a group to do the following:
• work with specific systems or network devices
• receive up.time alerts from those systems and devices
• participate in any number of defined service alert monitoring escalation paths
A member of a user group can view either individual systems or multiple systems in a system group. The following diagram illustrates how user groups work in up.time:
[Diagram: User Group, Single System, Group of Systems]
Each up.time user must belong to at least one user group. In a small installation of up.time, there may be only one user and one user group. In larger installations, you can set up such user groups as Operators, Help Desk, System Administrators, Network Administrators, DBAs, Development, QA, Operations Management, and the like.
Adding User Groups
To add user groups, do the following:
1. In the Navigation pane, click Add New User Group.
2. Enter a name for this group in the User Group Name field.
3. Optionally, type a short description in the User Group Description field.
4. Select the users to add to the group in the Available Users list, then click Add.
5. Optionally, select one of the systems or Elements from the Available Elements list, then click Add.
6. Optionally, select one of the groups from the Available Element Groups lis
314. iles. The Action Profiles subpanel appears, displaying the settings that you configured when you created the profile, as well as a list of the services that are attached to the profile.
To test whether or not the profile works, click the Test Action Profile button. A popup window appears, and the Monitoring Station tries to carry out the action defined in the profile. When the action is completed, the message "Action Profile tested" appears in the popup window. If an error message appears in the popup window, edit the profile and test it again.
Editing Action Profiles
To edit Action Profiles, do the following:
1. On the up.time tool bar, click Services.
2. In the Tree panel, click View Action Profiles.
3. Click the Edit Action Profile icon beside the name of the profile that you want to edit. The Edit Action Profile window appears.
4. Edit the Action Profile fields as described in the section "Creating Action Profiles" on page 391.
Monitoring Periods
Monitoring Periods are the times over which a service monitor will be actively monitoring a host. The Monitoring Periods also apply to the times when up.time sends alerts. up.time comes with the following Monitoring Periods:
• 24x7: Monitoring is performed 24 hours a day, seven days a week.
• 9am to 5pm weekdays: Monitoring is performed from 9 a
315. ilter up time agent 3 9 solaris 1 17
Creating a Service Monitor Outages Report
To create a Service Monitor Outages report, do the following:
1. In the Reports Tree panel, click Service Monitor Outages.
2. In the Date and Time Range area, select the dates and times on which to report. For more information, see "Understanding Dates and Times" on page 22.
3. Select one of the following options from the Sort by dropdown list:
• Sample Time by Entity
• Service Name by Entity
• All Sample Times
4. From the Sort Direction dropdown list, select Ascending or Descending.
5. If you want to generate reports for groups of systems, select the groups from the List of Groups area.
6. To generate reports for one or more views, select the views from the List of Views area. See "Working with Views" on page 108 for more information about views.
7. If you are generating reports for specific systems in your environment, select them from the List of Entities.
8. Select a report generation option. See "Report Generation Options" on page 402 for details.
9. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See "Saving Reports" on page 404 and "Scheduling Reports" on page 407 for more information.
Reports for J2EE Applications
Reports for J2EE App
316. ime Installation Plan
Installation Requirements
up.time Monitoring Station
up.time Agents
Installing the up.time Monitoring Station
Before You Begin
Installing the Monitoring Station on Windows
Installing the Monitoring Station on Solaris or Linux
Installing the Monitoring Station as a Virtual Appliance
Post-Installation Tasks
Configuring the Monitoring Station to Use Oracle
Upgrading to up.time 5
Installing Agents
Installing Agents on Windows
Installing Agents on Solaris
Installing Agents on UNIX
Installing Agents on Linux
Installing Agents on IBM pSeries Servers
Getting Started
Accessing and Exiting up.time
317. ime UNKNOWN
File System Capacity      39.16  10.18  46.67  0.00  3.99
PING lab websphere51      94.89   0.00   5.11  0.00  0.00
Plants Response           92.53   0.00   3.46  0.00  4.01
UPTIME lab websphere51    95.97   0.00   0.06  0.00  3.97
WebSphere                 95.35   0.00   0.65  0.00  4.00
Optionally, click the Generate Graph button to display pie charts that graph the status of each service, as shown below:
[Pie charts: File System Capacity; PING lab websphere51. Legend: Maintenance, Unknown, Okay, Warning, Critical]
Manage Services
Lists the following information about the services associated with a particular host:
• the name of the service
• the service group, if any, to which the service belongs
• the monitors, if any, associated with the service
If the host is part of a service group, the services for all of the hosts that are members of the group appear in the Manage Services subpanel. Click the name of the service to view information about that service. You can edit the service information, as well as the Alert Profiles and Action Profiles associated with the service, by clicking the appropriate button in the subpanel.
You can add service instances by clicking the Add Service tab in the Manage Services subpanel. The services that you add do not appear in the Manage Services subpanel, but in the Service Instances subpanel. For more inf
318. imultaneously apply common service checks to hosts that you are monitoring to which you want to add the system. This field is optional.
Port: The number of the port on which you will be connecting to the system. Leave this field blank to use the default port for the type of system that you are adding.
Community: If you are adding a Net-SNMP system to up.time, specify the read community, which acts like a user ID or password that gives you access to the system. Valid options are:
• public, which enables you to retrieve read-only information
• private, which enables you to access all information
HMC Hostname: The name or the IP address of the Hardware Management Console (HMC) that is being used to manage one or more pSeries LPAR servers in your environment.
Managed Server: The unique identifier of a pSeries LPAR server that is managed by an HMC.
Username: If you are adding a Net-SNMP or Novell NRM system to up.time, specify the user name required to access the system.
Password: If you are adding a Net-SNMP or Novell NRM system to up.time, specify the password required to access the system.
Group: The name of the entity group (a set of systems that have been combined in a meaningful way) to which you want to add this system. This field is optional.
SSL: For agent syste
319. ind the DataStore archive in the Virtual Store instead of the default location (i.e., C:\Users\uptime\AppData\Local\VirtualStore\Program Files\<uptime install directory>).
Diagnosis
The following options assist you with diagnostic steps that you may need to perform should you encounter problems with up.time. You have access to two types of logs: system logs, and audit logs that track user actions. Additionally, you can generate a problem report for up.time Customer Support if further analysis is required.
System and audit logs are written to the logs directory, and problem reports are found in the GUI directory, both of which are found in the up.time installation directory:
• Linux: /usr/local/uptime
• Solaris: /opt/uptime
• Windows: C:\Program Files\uptime software\uptime
Windows Vista users can find the audit log in the Virtual Store instead of the default location (i.e., C:\Users\uptime\AppData\Local\VirtualStore\Program Files\<uptime install directory>).
Event Logging
up.time automatically logs system events to the logs directory. These weekly logs follow the uptime log <year> <week> log naming format. You can determine the type of system information up.time writes to the log by using one of the following values:
• DEBUG
• INFO
• WARN
• ERROR
• FATAL
• ALL
320. indows Service Check Monitors
Windows File Shares (SMB)
Configuring Windows File Shares (SMB) Monitors
Active Directory
Configuring Active Directory Monitors
Application Monitors
Uptime Agent
Configuring Uptime Agent Monitors
Exchange
Configuring Exchange 2003 Monitors
Configuring Exchange Monitors
IIS
Configuring IIS Monitors
WebLogic
Monitoring WebLogic 8
Configuring WebLogic 8 Monitors
Monitoring WebLogic 9-11
Configuring WebLogic Monitors
WebSphere
Deploying the WebSphere Performance Servlet
Configuring
321. information on the frequency, duration, and recovery time of critical-level events, and the overall reliability of your monitored systems. This information is presented for services that are associated with groups of Elements, whether a pre-defined group or a manually selected list of individual Elements.
Instead of providing an auditable list of outages, as the Service Monitor Outages report does, the Incident Priority report uses a comparative approach to indicate how efficiently systems are running in relation to each other and, furthermore, how efficiently problems are dealt with.
To report this efficiency, the following building blocks are available as elements in the report:
• Incidents: The total number of outages for all service monitors associated with selected Elements. Critical-level events for multiple service monitors that are associated with a single Element will each contribute to the incident count.
• Incident Top 20: The 20 systems with the highest incident counts for the given time period, incidents being the number of times service monitors associated with selected Elements were in a critical state.
• Total Downtime: The total amount of time that all service monitors associated with selected Elements were in a critical state. Multiple service monitors in a critical state that are associated with a single Element each contribute to the down
322. ing on an ESX server is ready to run, but cannot run because it cannot access the processor on the ESX server
• Workload Profile Used: The percentage of CPU time that an instance running on an ESX server is using
4. If you want to generate reports for systems in specific groups, select the groups from the List of Groups area.
5. To generate reports for one or more views, select the views from the List of Views area. See "Working with Views" on page 108 for more information about views.
6. If you are generating reports for specific systems in your environment, select them from the List of Entities.
7. Select a report generation option. See "Report Generation Options" on page 402 for details.
8. Do one of the following:
• Click the Generate Report button.
• Enter a name for the report in the Save to My Portal As field, and optionally enter text in the Report Description field. Then click Save Report. The report parameters are saved to the My Portal panel. Doing this does not generate the report.
9. To schedule the saved report to run at a specific time or interval, click the Scheduled checkbox. See "Scheduling Reports" on page 407 for more information on configuring a scheduled report.
VMware Infrastructure Density Report
The VMware Infrastructure Density report enables you to assess the carrying capacity and workload distribution o
323. inimum Server Processes Set parameter. If the status is Bad, examine your server by doing the following:
8. In Novell NRM, click Profiling Debugging.
9. Check the information for server process functions.
10. Change the Maximum Server Processes and the Minimum Server Processes Set parameters.
Available Server Processes
This statistic enables you to view the number of available processes on your server as a graph. The graph charts the processes that are available every five seconds over a 50-second period. If the status is Suspect or Bad, you should increase the Set parameters for the Maximum Server Processes and the Minimum Server Processes settings. If the number of available server processes has not reached the maximum and is not increasing, you should add memory to your server.
Abended Thread Count
This statistic enables you to view the threads that have ended abnormally (abended) and are suspended. If the status is Suspect or Bad, your server has abended and has recovered automatically by suspending the offending thread while leaving the rest of the server processes running. As a result, some of the server's functions were compromised. You must determine which module, driver, or hardware the abended threads belong to, and then take the appropriate action.
CPU Utilization
This statistic enables you to view, as a graph, how busy any given CPU is. up.time tracks usage on a per-CPU basis, coll
324. ints visible or invisible, or displays them in two or three dimensions. You can change the following attributes of data points: style, width, height, color, border and pattern, and image.
• Apply value formatting styles and masking: Applies formats and masks to your data by value, percentages, horizontal axis, vertical axis, and cursor.
• Marks: Graphs any of the following: every data point of every statistic, every data point of any statistic, and every nth data point.
• Data Source: Lists all data points by value and time. Using Data Source, you can perform calculations on retrieved statistics and graph the result. You can import, perform calculations, perform contrasts and comparisons, and graph external data with collected statistics.
Exporting Graphs
Using the Export tab, you can send your graph by e-mail or save it to a directory on your computer or network. You can export your graph in three ways:
• As one of the following formats: Bitmap, Metafile, SVG, Postscript, PDF, PCX, GIF, PNG, or JPEG
• In the native up.time graph format
• In one of the following data formats: text, HTML table, XML, or Excel
Changing the Look and Feel of a Graph
Using the Themes tab, you can change the appearance of a graph. You can select one of eight styles for the graph, as well as specify whether the graph should be in 3D or if it should be to scale.
CHAPTER 21 Using Graphs
This chapter
325. ions and not using much of the hardware. Instead of wasting resources, you can consolidate these applications in a virtual environment, for example using VMware. This enables you to run applications on distinct servers, but without using as much hardware.
The Server Virtualization report can help you to pinpoint physical servers that can be combined on a single virtual server. The report highlights servers that are good candidates for virtualization: ones that do not fully use their CPU, memory, or disk resources. In the report, each system will have one of the following stars beside it:
• Indicates that the system is a good candidate for virtualization. The corresponding metrics are highlighted in green.
• Indicates that the system is a reasonable candidate for virtualization. The corresponding metrics are highlighted in blue.
• Indicates that the system is a poor candidate for virtualization. The corresponding metrics are not highlighted.
As well, the metrics for Average Power Units Used (Power Units measure the power of CPUs by multiplying the number of CPUs on a system by their speed), Avg Disk I/O, and Avg Network I/O for each system may be highlighted.
Creating a Server Virtualization Report
To generate a Server Virtualization report, do the following:
1. In the Reports Tree panel, click Server Virtualization.
2. In the Date and Time Range area
326. ird day of every seventh month.
Select the second option, then do the following:
• select first, second, third, fourth, or last from the first dropdown list
• select a day of the week on which the report will run from the second dropdown list
• select a number from 1 to 12 from the third dropdown list
For example, if you select second, Tuesday, and 9 from the dropdown lists, the report will be run on the second Tuesday of every ninth month.
If you are saving an existing report after editing it, or saving a new report with the name of an existing one, up.time displays a warning dialog box. Click OK on the dialog box to overwrite the report, or click Cancel on the dialog box to give the report a different name.
The Report Log
The Report Log tracks the progress and status of scheduled reports or reports that are running in the background. Using the Report Log, you can quickly determine whether or not reports have been successfully generated. If they have not, you can use the log to determine why report generation failed.
The Report Log subpanel tracks the status of reports in the following sections:
• Pending Reports: Reports that are in the report queue and are waiting to run. This section contains the following information:
• the name of the report
• the description of the report, if available
• whether or not the
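The "second Tuesday" style of schedule described in the excerpt above can be worked out with Python's standard `calendar` module. The following is only a sketch of the date arithmetic, not how up.time itself schedules reports; the function name `nth_weekday` and the `ORDINALS` table are hypothetical, and the sketch deliberately leaves open how the month number from the third dropdown list selects months.

```python
import calendar
from datetime import date

# Hypothetical mapping from the first dropdown list to a list index.
ORDINALS = {"first": 0, "second": 1, "third": 2, "fourth": 3, "last": -1}


def nth_weekday(year: int, month: int, ordinal: str, weekday: int) -> date:
    """Return the date of, e.g., the 'second' Tuesday in the given month.

    weekday follows the calendar module's numbering (calendar.MONDAY == 0).
    """
    # monthcalendar() returns weeks as lists of day numbers, using 0 for
    # days that fall outside the month, so those entries are skipped.
    days = [week[weekday]
            for week in calendar.monthcalendar(year, month)
            if week[weekday] != 0]
    return date(year, month, days[ORDINALS[ordinal]])


if __name__ == "__main__":
    # Second Tuesday of September 2008.
    print(nth_weekday(2008, 9, "second", calendar.TUESDAY))
```

Indexing the list of matching days with -1 handles "last" without needing to know whether the month contains four or five of that weekday.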
327. is being used to its optimal capacity. Consider the following example, in which the VMware Workload report returns the following information about the top ten CPU loads on the VMware server:
[Graph: VMware Workload, Top Ten CPU for vmh esx5; time axis covers 2008-05-22]
This graph indicates that, on average, the ten most CPU-intensive instances use only 20% of the server's CPU capacity. The CPU on the server can handle up to three to four times its current load.
The memory usage section of the report indicates that the instances are using roughly the same amount of memory:
[Graph: VMware Workload, Memory for vmh prod; vertical axis in gigabytes]
328. ish
Oracle Tablespace Check
The Oracle Tablespace Check monitors the size, as a percentage, of individual tablespaces within Oracle database instances. The Oracle Tablespace Check alerts you when a tablespace in your instance exceeds the defined thresholds.
Each database is logically divided into one or more tablespaces. One or more data files are explicitly created for each tablespace to physically store the data in a tablespace. The combined size of the data files in a tablespace is the total storage capacity of the tablespace. For example:
TABLESPACE NAME    Total Bytes    Bytes Free     % Free
INDX                56,623,104    56,614,912      99.99
OEM_REPOSITORY      31,465,472     3,473,408      11.04
RBS                104,857,600    75,489,280      71.99
SUPPORT             52,428,800    52,420,608      99.98
SYSTEM              56,623,104     2,250,216       5.03
TEMP                71,303,168    71,294,976      99.99
TOOLS                8,388,608     8,380,416      99.90
UNANET             314,572,800   300,539,904      95.54
USERS               20,971,520    20,963,328      99.96
In the above table, the SYSTEM tablespace is over 95% full. If you set the Warning threshold to 90% and the Critical threshold to 95%, the Oracle Tablespace Check returns a status of Critical.
Use the Oracle Basic Checks monitor to determine the availability of Oracle databases, the performance of services, and the matched response of scripts. For more information, see "Sybase" on page 275.
Configuring Or
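The threshold logic in the Oracle Tablespace Check excerpt above can be illustrated with a little arithmetic. This is a minimal sketch, not up.time's implementation: the function names are hypothetical, and it assumes that "percent full" is derived from total and free bytes and that a value equal to a threshold already triggers it.

```python
def percent_used(total_bytes: int, free_bytes: int) -> float:
    """Percentage of a tablespace's total capacity that is in use."""
    return 100.0 * (total_bytes - free_bytes) / total_bytes


def tablespace_status(total_bytes: int, free_bytes: int,
                      warning: float = 90.0, critical: float = 95.0) -> str:
    """Classify a tablespace against Warning and Critical thresholds.

    Assumption: thresholds are percentages of space used and are
    inclusive (>=); the excerpt does not say which comparison is used.
    """
    used = percent_used(total_bytes, free_bytes)
    if used >= critical:
        return "CRITICAL"
    if used >= warning:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    # OEM_REPOSITORY row from the example table: 11.04% free.
    print(round(percent_used(31_465_472, 3_473_408), 2))
    print(tablespace_status(31_465_472, 3_473_408))
    # Hypothetical tablespace with 4% free space.
    print(tablespace_status(100, 4))
```

Checking the Critical threshold before the Warning threshold ensures that a tablespace over both limits is reported at the more severe level.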
329. itor, Application, or SLA if their state changes from OK to Warning or Critical. Alert Profiles are normally associated with any of these monitored items at the time of their configuration. Alert Profile associations can also be modified with existing service monitor definitions. See Chapter 8, "Using Service Monitors"; "Working with Applications" on page 101; and "Adding and Editing SLA Definitions" on page 371 for more information about configuring Service Monitors, Applications, and SLAs, respectively.
Working with Custom Alert Formats
up.time's standard alert format is well suited for most alerting needs. However, you can modify the content of the alert. up.time comes with three custom alert templates. You can change the content of the alert by adding or removing variables from the template.
To define a custom alert format, do the following:
1. Define an Alert Profile as described on page 382.
2. In the Custom Format Options section, click Custom Formats.
3. From the dropdown list, select one of the following options:
• Small Template: Contains the date and time of the alert, as well as the names and status of the service and host for which the alert was generated. This corresponds to the template used for pager alerts.
• Medium Template: Contains the information in the small template, as well as an expanded subject line, the type of notification
330. k any value in the Busy column to open the Disk Performance Statistics report subpanel. See "Disk Performance Statistics Graph" on page 514 for more information.
Viewing All Services
Services are specific tasks, or sets of tasks, performed by an application in the up.time environment. up.time service monitors continually check the condition of services to ensure that they are providing the required functions to support your business. For more information on services, see "Services" on page 8.
You can view the services assigned to each system in your environment by clicking on the View All Services tab. This tab contains the following information:
• the name of the service
• the monitor that is associated with the service
• the status of the service
• the date and time on which the last check was performed
• the number of days, hours, and minutes since the last check
• a human-readable text message that was returned by the monitor (e.g., up.time agent running on MailServer up.time agent 3.7.2 linux)
Viewing the Resource Scan Report
Resource Scan is a dynamically updated report that charts the percentage of various resources that are being used by the s
331. kets rather than bytes:
• AIX
• FreeBSD
• IRIX
• MacOS
• Novell NRM
If you are monitoring one or more of these systems, you can specify a ratio for converting packets to bytes. Different network interfaces have a maximum packet size, called a Maximum Transmission Unit (MTU); an Ethernet interface, for example, has an MTU of 1,500 bytes. Most interfaces will not transmit packets at the MTU. The value that you specify for the bytes-per-packet conversion will be based on the observed performance of the network interface. Fifty percent of MTU is a good average to use; the default value in up.time is 750.
The report contains the following information:
• the display name, in up.time, of the system
• the names of each network interface on the system
• the total amount of data, measured in megabytes, that is moving in and out of each network interface
Generating a Network Bandwidth Report
To generate a Network Bandwidth report, do the following:
1. In the Reports Tree panel, click Network Bandwidth.
2. In the Date and Time Range area, select the dates and times on which to report. For more information, see "Understanding Dates and Times" on page 22. If no data is available for the date range, the report displays a message indicating that there is no data for the time period.
To only include data from certain hours during the day, select those hour
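The packet-to-byte conversion described in the Network Bandwidth excerpt above amounts to a single multiplication. The sketch below shows the arithmetic under stated assumptions: the function names are hypothetical, and a megabyte is taken to be 1,048,576 bytes, which the excerpt does not specify.

```python
DEFAULT_BYTES_PER_PACKET = 750  # the default quoted above: 50% of a 1,500-byte MTU


def bytes_per_packet_from_mtu(mtu_bytes: int, fraction: float = 0.5) -> int:
    """Derive a bytes-per-packet ratio as a fraction of the interface MTU."""
    return int(mtu_bytes * fraction)


def estimate_megabytes(packets: int,
                       bytes_per_packet: int = DEFAULT_BYTES_PER_PACKET) -> float:
    """Estimate traffic volume for a system that reports packets, not bytes.

    Assumption: 1 megabyte = 1024 * 1024 bytes.
    """
    return packets * bytes_per_packet / (1024 * 1024)


if __name__ == "__main__":
    ratio = bytes_per_packet_from_mtu(1500)
    print(ratio)
    print(estimate_megabytes(2_000_000, ratio))
```

Because the ratio is an average, the result is an estimate; an interface that mostly carries small packets would call for a lower value than 750.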
332. l click the name of the system whose information you want to graph 2 Inthe Tree panel click the Graphing tab 3 Click TCP Retransmits 4 Select the start and end dates and times for which the graph will chart data For more information see Understanding Dates and Times on page 22 5 Click Generate Graph up time software 503 Using Graphs Graphing User Activity Graphing User Activity up time uses the following graphs to chart the activity of users on a system e Login History The number of times or frequency at which a user has logged into a system during any 30 minute time interval e Sessions The number of sessions or number of distinct users who are logged into a system during any 30 minute time interval Using these graphs an administrator can identify user load and whether or not there is any correlation between user logins or number of sessions and problems with the performance of the system These graphs use the same input criteria but they return different data Generating a User Activity Graph To generate a user activity graph do the following 1 Inthe Global Scan or My Infrastructure panel click the name of the system whose information you want to graph 2 Inthe Tree panel click the Graphing tab 3 Click either Login History or Sessions 4 Select the start and end dates and times for which the graph will chart data For more information see Understanding Dates and Times on page 22
333. l Response Time thresholds for the length of time a service check takes to complete For more information see Configuring Warning and Critical Thresholds on page 144 3 To save the data from the thresholds for graphing or reporting click the Save for Graphing checkbox beside each of the metrics that you selected in step 3 4 Complete the following settings Timing Settings see Adding Monitor Timing Settings Information on page 148 for more information Alert Settings see Monitor Alert Settings on page 148 for more information Monitoring Period settings see Monitor Timing Settings on page 146 for more information Alert Profile settings see Alert Profiles on page 381 for more information Action Profile settings see Action Profiles on page 389 for more information 5 Click Finish 202 up time 5 User Guide hy up time WebLogic WebLogic The WebLogic 8 and WebLogic monitors collect data that enables you to determine whether or not there is a performance problem or a failure on a WebLogic application server Using the data that the WebLogic monitor collects you can determine the root cause of the issue by generating a report see Reports for J2EE Applications on page 463 for more information LL gt D T O oD e e 3 eo up time software 203 Application Monitors WebLogic The WebLogic monitors collect the following metrics fro
334. l administrator.
Type the following command:
sh up.time-5.0-<build>-<platform>.bin
where <build> is the number of the up.time build that you are installing and <platform> is the operating system on which you are installing up.time. For example:
• Linux: up.time-5.0-455-rhes4-x86.bin or up.time-5.0-455-sles9-x86-upgrade.bin
• Solaris: up.time-5.0-455-solaris-sparc.bin
It can take up to several minutes for the components of the installer to be extracted from the .bin file. Wait while this process completes.
On the Introduction page, press Enter to continue.
On the License Agreement page, carefully read the up.time end user license agreement. Press Enter to scroll through the agreement.
At the DO YOU ACCEPT THE TERMS OF THIS LICENSE AGREEMENT? (Y/N) prompt, type y and press Enter.
Do one of the following to set the directory in which up.time will be installed:
• Press Enter to accept the default location (/opt/uptime on Solaris, and /usr/local/uptime on Red Hat and SLES).
• Type a new location at the command prompt (for example, /opt/uptime on Solaris), then press Enter. The uptime user account must be able to access the directory that you specify.
8. Do one of the following to set the location where the up.time DataStore will be installed:
• Press Enter to
335. ld scales up with each additional CPU present on a monitored system.
Select any of the following statistics to include in the report:
• sys: CPU system time
• usr: CPU user time
• wio: CPU wait I/O time
The statistics that you select will be added together and compared to the threshold that you specified in step 4. For example, to see when system time and user time are over 80, select the sys and usr options, and then enter 80 in the Max CPU field.
If you want to include a list of processes that are in the run queue in the report, click Show Processes.
8. Click the Maintain Graph Scale option to keep the scale of the graphs in the reports consistent. For example, if you have three systems and one is 1,200 minutes over the threshold, then the scale of the graph is 1,200 for all of the graphs in the report.
[Graphs: Minutes over Threshold of 2.0 for 10.1.1.35; Minutes over Threshold of 2.0 for AIX 5L aix51. Vertical axis: Minutes; horizontal axis: dates in May 2007]
9 R
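The way the selected statistics are "added together and compared to the threshold" in the excerpt above can be sketched as follows. This is an illustration only: the sample layout, the function name, and the one-minute sampling interval are assumptions, and the per-CPU scaling of the threshold mentioned at the start of the excerpt is not modelled.

```python
from typing import Iterable, Mapping, Sequence


def minutes_over_threshold(samples: Iterable[Mapping[str, float]],
                           stats: Sequence[str] = ("sys", "usr"),
                           max_cpu: float = 80.0,
                           minutes_per_sample: int = 1) -> int:
    """Count the minutes in which the summed statistics exceed max_cpu.

    Each sample is a mapping with 'sys', 'usr', and 'wio' percentages.
    Assumption: one sample represents minutes_per_sample minutes.
    """
    over = 0
    for sample in samples:
        # Add together only the statistics that were selected.
        if sum(sample[name] for name in stats) > max_cpu:
            over += minutes_per_sample
    return over


if __name__ == "__main__":
    # Hypothetical CPU samples, one per minute.
    cpu_samples = [
        {"sys": 30.0, "usr": 55.0, "wio": 5.0},
        {"sys": 10.0, "usr": 20.0, "wio": 60.0},
        {"sys": 40.0, "usr": 41.0, "wio": 0.0},
    ]
    print(minutes_over_threshold(cpu_samples))
    print(minutes_over_threshold(cpu_samples, stats=("sys", "usr", "wio")))
```

Selecting wio in addition to sys and usr changes the result for the second sample, which is dominated by I/O wait rather than by system or user time.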
336. le Users for View list, then click Add.
8. To add previously defined groups of users, select one or more entries from the Available User Groups list, then click Add.
9. Click Save.
Adding Nested Views
You can also create nested views to categorize and better manage a larger set of existing views. The following can be assigned to nested views:
- existing Element views
- individual Elements
- individual users who have view access to the Elements in a view
- up.time user groups with similar privileges
You cannot assign a parent view to a child view or to any other ancestor. Before you begin, ensure that you have at least one parent view defined. For more information, see Adding Views on page 108.
Adding a Nested View
To add a nested view, do the following:
1. In the Infrastructure panel, click Add View.
2. In the Add View window, enter a descriptive name in the View Name field. This name will appear when listing views in the Infrastructure panel.
3. Optionally, enter a description in the View Description field.
4. In the Parent View dropdown list, select the view to which this nested view will be subordinate.
5. To give this nested view its own child views, select one or more entries from the Available Element Views list, then click Add.
6. Select one or more users who can view this group from the Available Users list, then click Add.
337. le remedy Uptime at its option may i procure for you the right to use the Software or ii modify the Software to render it non infringing 5 2 You will at your expense indemnify and hold Uptime and all its officers directors and employees harmless from and against any and all claims actions liabilities losses damages judgments grants costs and expenses including reasonable lawyer fees collectively Claims arising out of any use of the Software by you any party related to you or any party acting upon your authorization in a manner that is not expressly authorized by this Agreement 6 Disclaimer THE SOFTWARE DOCUMENTATION AND ANY IF ANY SUPPORT SERVICES ARE LICENSED AS IS AND UPTIME AND ITS SUPPLIERS DISCLAIM ANY AND ALL OTHER WARRANTIES EXPRESS OR IMPLIED INCLUDING WITHOUT LIMITATION ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE WITHOUT LIMITING THE GENERALITY OF THE FOREGOING UPTIME EXPRESSLY DOES NOT WARRANT THAT THE SOFTWARE WILL MEET YOUR REQUIREMENTS OR THAT OPERATION OF THE SOFTWARE WILL BE UNINTERRUPTED OR ERROR FREE YOU ASSUME RESPONSIBILITY FOR SELECTING THE SOFTWARE TO ACHIEVE YOUR INTENDED RESULTS AND FOR THE RESULTS OBTAINED FROM YOUR USE OF THE SOFTWARE YOU SHALL BEAR THE ENTIRE RISK AS TO THE QUALITY AND THE PERFORMANCE OF THE SOFTWARE 7 Limitation of Liability UPTIME S CUMULATIVE LIABILITY TO YOU OR ANY PARTY RELATED TO YOU FOR ANY LOSS OR DAMA
338. le shares on a system, leave this field blank.
- Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144.
3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
4. Complete the following settings:
- Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
- Alert Settings (see Monitor Alert Settings on page 148 for more information)
- Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
- Alert Profile settings (see Alert Profiles on page 381 for more information)
- Action Profile settings (see Action Profiles on page 389 for more information)
5. Click Finish.
Active Directory
Active Directory is a distributed network management service that is included in the Microsoft Windows Server 2003 and Windows 2000 Server operating systems. Active Directory provides a centralized location for all of the information about the services and resources within your network. Using this information, you can easily manage information about users, netw
339. liance with a regulatory mandate or Service Level Agreements. You install Splunk on a server in your datacenter. When values are provided for the Splunk settings listed below, the Splunk icon will appear in the My Portal panel beside the names of services that are in WARN or CRIT states. When you click the Splunk icon, you will be automatically logged in to your Splunk search page. You can change your up.time Splunk integration by manually inputting settings in the up.time Configuration panel, as outlined in Modifying up.time Config Panel Settings on page 529.
Changing Splunk Server Information for up.time
You can enable automatic login to the Splunk search page, or modify an existing configuration, through the following parameters:
- splunk url: The URL of the server on which your Splunk search page is hosted (e.g., http://webportal:8000).
- splunk username: The user name required to log in to your Splunk search page.
- splunk password: The password required to log in to your Splunk search page.
- splunk soapurl: The URL that points to the SOAP management port that Splunk uses to communicate with the splunk daemon (e.g., https://webportal:8089). In the URL, you must include the port on which the Splunk server listens for requests. See the Splunk Admin Manual for more information.
- splunk version: The version of Sp
340. lications. The following reports enable you to visualize any performance problems with applications that are running in J2EE environments:
- WebSphere Report
- WebLogic Report
WebSphere Report
The WebSphere report charts a set of counters that provide insight into the health and performance of a WebSphere Application Server. Depending on the number of options that you select, the report can become quite long and can take considerable time to generate. For most options, the report contains charts for two or more metrics.
Creating a WebSphere Report
To create a WebSphere report, do the following:
1. In the Reports Tree panel, click WebSphere.
2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22.
3. Select one or more of the following report options:
- Thread pool: A set of counters that report on the number of connection threads that have been created or destroyed, that are concurrently active or are hung, that are in the thread pool, or time that are in use.
- JDBC Connection Pool: A set of counters that monitor the performance of JDBC data sources.
- Enterprise Beans: A set of counters that report the following: load values, response times, and life cycle activities for enterprise Java beans.
- JVM Runtime: A set of counters that monitor the perf
341. lick Next. The Add SNMP Service Monitor window appears. See Configuring SNMP Monitors on page 315 for information on setting up the SNMP monitor.
Deleting OIDs
After adding several OIDs, there may be OIDs that you no longer want to monitor. You can use the SNMP MIB browser to delete the unwanted OIDs. To delete OIDs from the Selected OIDs panel, do the following:
1. Select the OID you want to remove in the Selected OIDs panel.
2. Click Delete Selection.
Configuring SNMP Monitors
To configure SNMP monitors, do the following:
1. In the SNMP monitor template, select the version number of an SNMP implementation from the SNMP Version dropdown list.
2. In the v1/v2 Community field, enter the community string. The community string acts like a user ID or password, giving you access to a device via SNMP. Common communities are public (enables you to retrieve read-only information from the device) and private (enables you to access all information on the device).
3. Enter the number of the port on which SNMP is listening in the SNMP Port field.
4. If you selected v3 from the SNMP Version dropdown list, complete the following settings:
- v3 Username: The user name that is required to connect to an SNMP instance that is using version 3 of SNMP.
- v3 Authentication Method: If the server uses version 3 of SNMP, select one of the following options from the list. The option that you select determin
342. ll be assigned.
9. Select a User Group to which any newly detected users will be assigned.
10. Click Save. Once saved, up.time will synchronize its list of users with the up.time group in Active Directory at the specified interval.
LDAP Authentication
To use LDAP for user management, you need to provide up.time with your organization's LDAP information. You can also define whether, and how much, user information is synchronized between LDAP and up.time's user list.
Enabling LDAP for User Authentication
To configure up.time to check an LDAP listing for user passwords, do the following:
1. On the up.time tool bar, click Config.
2. In the Tree panel, click User Authentication.
3. Click Edit Configuration.
4. Select LDAP as the authentication method. You will next need to provide access details for the Active Directory server.
5. In the LDAP URL field, enter the address for the LDAP server. If directory communication occurs through secure channels such as TLS or SSL, ensure this is reflected in the server address (e.g., ldaps instead of ldap).
6. Enter the LDAP Query that up.time will use on the LDAP server to look up a user's name.
7. Continue to the next section to enable and configure synchronization from the Active Directory listing to up.time user profiles. If you do not wish to synchronize users, click Save.
Clickin
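The advice in step 5 about reflecting a secure channel in the server address can be checked before the value is typed into the LDAP URL field. The following Python sketch uses only the standard library; the host names are placeholders and the helper name is an assumption made for illustration.

```python
from urllib.parse import urlparse


def uses_secure_ldap(url):
    """Return True when the address uses the ldaps scheme rather than ldap."""
    return urlparse(url).scheme == "ldaps"


# Placeholder addresses; substitute your own directory server.
secure = uses_secure_ldap("ldaps://directory.example.com:636")
plain = uses_secure_ldap("ldap://directory.example.com:389")
```

The check only inspects the scheme of the address. It does not contact the server or verify that TLS or SSL is actually configured there.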
343. ll of the databases on the host system. This metric is returned from the SQL Server Database object. The Database object provides such information about the database as the amount of free log space available or the number of active transactions in the database. There can be multiple instances of this object.
- Total Latch Wait Time (ms): The total time, in milliseconds, that it takes to complete the latch requests that were waiting over the last second.
- Latch Waits/Sec: The number of latch requests that were not immediately granted and which waited before being granted.
- Average Latch Wait Time (ms): The average time, in milliseconds, that latch requests had to wait before being granted.
- Maximum Workspace Memory (KB): The maximum amount of memory, in kilobytes, that the server has available to execute such processes as sort, bulk copy, hash, and index creation. This metric is returned by the SQL Server Memory Manager object, which monitors overall server memory usage. By monitoring overall server memory usage, you can determine whether or not:
  - Bottlenecks exist due to a lack of available physical memory for storing frequently accessed data in cache. If so, SQL Server must retrieve the data from the disk.
  - You can improve query performance by adding more memory, or by making more memory available to the data cache or to SQL Server internal structures.
344. llowing information:
- name of the report, taken from the My Portal panel
- date on which the report was run
- user name of the person who ran the report
The following is an example of a report file name: Service Outages 2006 01 24 rfripp pdf
To save reports to a file system, do the following:
1. In the Save Report area of the Report subpanel, enter a name for the report in the Save to My Portal As field.
2. Optionally, enter a description of the report in the Description field.
3. Select either HTML or PDF from the list of options.
4. Click the Publish Report option.
5. Click the Scheduled Report option, and then select a date and time for the report to run. For more information on scheduling reports, see Scheduling Reports on page 407.
6. Click Save Report.
Viewing Saved Reports
You can quickly view any reports that were generated on the Monitoring Station and saved to the file system. To do so, do the following:
1. On the tool bar, click Reports.
2. Click Published Reports in the Tree panel. The Report Library window appears. The Report Library window lists the reports that were generated on the Monitoring Station, in descending order by date.
Using the Search Function
The Report Library window includes a search function that enables you to find specific reports. To use the search function, do the following:
345. llowing settings:
- Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
- Alert Settings (see Monitor Alert Settings on page 148 for more information)
- Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
- Alert Profile settings (see Alert Profiles on page 381 for more information)
- Action Profile settings (see Action Profiles on page 389 for more information)
5. Click Finish.
NIS/YP
NIS/YP (Network Information Services/Yellow Pages) is a distributed database system that enables you to configure multiple hosts from a central location, as well as store and maintain common configuration information in that location. You can then propagate the information to all of the nodes in a network. The collection of network information is referred to as the NIS namespace. The NIS/YP monitor performs a lookup on the domain, table, and key, enabling you to:
- check that a Network Information Service (NIS) server for a given domain is responding
- request a specific key from a NIS table. This is useful if the contents of the NIS maps are often rebuilt.
Configuring NIS/YP Monitors
To configure NIS/YP monitors, do the following:
1. In the NIS/YP monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Ide
346. low query log, nor are queries that would not benefit from the presence of an index because a database table has no rows or just one row.
- Open Tables: The number of database tables that are opened independently by each concurrent thread. Multiple clients can simultaneously issue queries for a given table. Each table is opened independently by each concurrent thread to ensure that multiple client threads do not have different states on the same table. For each concurrent thread, the table must be opened twice if two threads access the same table, or if a thread accesses the table twice in the same query. Each concurrent open requires an entry in the table cache. The first time any table is opened, it takes file descriptors for the data file and the index file. Each additional use of the table takes only a descriptor for the data file. The index file descriptor is shared among all threads.
The cache of open tables should be at the level specified by table_cache entries. The default value is 64. MySQL may temporarily open more tables to execute queries. Unused tables are closed and removed from the table cache when any of the following occurs:
- the cache is full and a thread tries to open a table that is not in the cache
- the cache contains more than table_cache entries and a thread is no longer using a table
- a table
347. lunk you are using.
Archiving the DataStore
Depending on the amount of disk space available for the continuously growing DataStore, administrators can set an archive policy that determines how many months' worth of data is retained. Old performance data is automatically archived and removed from the DataStore. This archiving procedure works with all databases that are compatible with up.time. The existing archive policy can be viewed and modified on the Archive Policy subpanel, which is accessed from the main Config panel. Here, the main archive categories are shown, along with the number of months for which collected data is retained in the DataStore. Every month, up.time checks the DataStore's entries; data that is older than the limit set in the archive policy is written to XML files. The XML archives use the following format: <table name> <date> xml gz. The archives created reflect the database table structure used to store performance data, as well as the date that the stored data represents (for example, performance cpu 2006 09 13 xml gz). The DataStore is trimmed, and the XML files are compressed and stored in the archives directory. For example, if you installed up.time in the default location, the path to the archived data will be:
- Linux: /usr/local/uptime/archives
- Solaris: /opt/uptime/archives
- Windows: C:\Program Files\uptime software\uptime\archi
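The retention rule described above (data older than the number of months set in the archive policy is archived) can be illustrated with a short Python sketch. It assumes the limit is counted in whole calendar months; the excerpt does not say how up.time computes the boundary, and both function names are hypothetical.

```python
from datetime import date


def archive_cutoff(today, months_retained):
    """Return the first day of the month that lies months_retained months
    before the month of `today` (assumption: whole calendar months)."""
    index = today.year * 12 + (today.month - 1) - months_retained
    return date(index // 12, index % 12 + 1, 1)


def is_archivable(sample_date, today, months_retained):
    """Return True when a sample is older than the retention limit."""
    return sample_date < archive_cutoff(today, months_retained)


# With a three-month policy checked on 15 January 2007, the cutoff is
# 1 October 2006, so a sample from 13 September 2006 would be archived.
cutoff = archive_cutoff(date(2007, 1, 15), 3)
old_sample = is_archivable(date(2006, 9, 13), date(2007, 1, 15), 3)
```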
348. m. to 5 p.m., Monday to Friday.
- Never: No monitoring is carried out.
You can add Monitoring Periods that suit your needs. For example, you can create a Monitoring Period called Weekends that only monitors a host from 12:00 a.m. on Saturday to 11:59 p.m. on Sunday.
Adding Monitoring Periods
To add Monitoring Periods, do the following:
1. On the up.time tool bar, click Services.
2. In the Tree panel, click Add Monitoring Period. The Add Monitoring Periods window appears.
3. Type a name in the Monitoring Period Name field.
4. In the Definition section, enter one or more time period expressions that combine to create a full Monitoring Period definition. See Time Period Definitions on page 567 for information on the types of time period expressions that are valid in up.time.
5. Click Save.
CHAPTER 18: Understanding Report Options
This chapter is an overview of the options available for generating reports in up.time and contains the following sections:
- Overview
- Generating Reports
- Saving Reports
- Scheduling Reports
- The Report Log
Overview
349. m a WebLogic server.
Connection Pools
- FailuresToReconnectCount: The number of times that the connection pool failed to reconnect to a data store.
- ConnectionDelayTime: The average time that was required to connect to a connection pool.
- ActiveConnectionsCurrentCount: The current number of active connections in a JDBC connection pool.
- ActiveConnectionsHighCount: The highest number of active connections in a JDBC connection pool.
- LeakedConnectionsCount: The total number of connections that have been checked out of, but not returned to, the connection pool.
- CurrCapacity: The current number of database connections in the JDBC connection pool.
- NumAvailable: The number of available sessions in the session pool that are not currently being used.
- WaitingForConnectionCurrentCount: The current number of requests that are waiting for a connection to the connection pool.
Per EJB
- AccessTotalCount: The total number of times an attempt was made to get an EJB instance from the free pool.
- BeansInCurrentUseCount: The number of EJB instances in the free pool which are currently in use.
- CachedBeansCurrentCount: The total number of EJBs that are in the execution cache.
- ActivationCount: The number of EJBs that have been activated.
Other
- HeapSizeCurrent: The amount of memory, in bytes, that is in t
350. mber of users log in at the same time, eDirectory uses multiple server threads. However, its thread requirements should not cause poor performance, because eDirectory cannot use more than its allocated maximum number of threads. If this statistic returns a Good status, eDirectory is using less than 25% of the available server threads. If it returns a Suspect status, eDirectory is using between 25% and 50% of the available server threads. If the status is Bad, eDirectory is using more than 50% of the available server threads.
- Packet Receive Buffers: This statistic enables you to view the status of Packet Receive Buffers for the server. Packet Receive Buffers transmit and receive packets. You can set the maximum or minimum number of buffers to allocate using the Maximum Packet Receive Buffers or Minimum Packet Receive Buffers SET parameters. The minimum number of buffers is the number of packets that are allocated when the system is initialized. If the number of Packet Receive Buffers is increasing, the system will be sluggish. If the number of Packet Receive Buffers reaches the maximum and no Event Control Blocks (ECBs) are available, the server will become very sluggish and will not recover.
- Available Event Control Blocks (ECBs): This statistic enables you to view the status of available Event Control Blocks (ECBs). Available ECBs are Packet Receive Buffers that have been creat
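The Good, Suspect, and Bad bands given for server thread usage can be written as a small classification function. This Python sketch is illustrative only: the function name is hypothetical, and the treatment of the exact boundary values (25% and 50% both counted as Suspect) is an assumption, since the text does not say which band they fall into.

```python
def thread_usage_status(threads_in_use, threads_available):
    """Classify thread usage into the three bands described in the text."""
    percent = 100.0 * threads_in_use / threads_available
    if percent < 25:
        return "Good"      # less than 25% of the available server threads
    if percent <= 50:
        return "Suspect"   # between 25% and 50% (boundaries assumed inclusive)
    return "Bad"           # more than 50% of the available server threads


# Hypothetical server with 100 available threads, 30 of them in use.
status = thread_usage_status(30, 100)
```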
351. me.
8. Click Generate Graph.
Workload Top 10 Graphs
The three Workload Top 10 graphs chart the 10 processes that are consuming the most CPU resources. Consumption of CPU resources is tracked via one of the following: a user ID, a group ID, or the name of a process. Workload Top 10 graphs enable you to quickly determine which processes are consuming the most CPU resources over a specified time period. Each graph uses the same input criteria, but they return different data.
Generating a Workload Top 10 Graph
To generate a Workload Top 10 graph, do the following:
1. In the Global Scan or My Infrastructure panel, click the name of the system whose information you want to graph.
2. In the Tree panel, click the Graphing tab.
3. Click one of the following options:
- Workload Top 10 (User)
- Workload Top 10 (Group)
- Workload Top 10 (Process Name)
4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22.
5. Click one of the following options:
- CPU
- Memory Size
- RSS
Graphs generated for SNMP agents only chart the memory size metric.
6. Click Generate Graph.
LPAR Workload Graphs
up.time can collect workload information from logical partitions (LPARs) that are running
352. me End User License Agreement defines the rights, permissions, and limitations that you agree to by choosing up.time. The up.time End User License Agreement is detailed in the following sections:
- Notice to User
- [illegible]
- Intellectual Property and Confidentiality
- License Fees
- Term and Termination
- Remedies and Indemnification
- Disclaimer
- Limitation of Liability
- General Terms
NOTICE TO USER
1. This End User License Agreement (the "Agreement") is a legal contract between you, as either an individual or a business entity, and Uptime Software Inc. ("Uptime"). PLEASE READ THIS CONTRACT CAREFULLY BEFORE DOWNLOADING UPTIME'S PROPRIETARY SOFTWARE (the "SOFTWARE") OR OBTAINING A LICENSE KEY TO THE SOFTWARE OR USING THE SOFTWARE. BY CLICKING ON THE "I ACCEPT" BUTTON AND BY DOWNLOADING THE SOFTWARE OR OBTAINING A LICENSE KEY TO THE SOFTWARE, YOU REPRESENT AND WARRANT THAT YOU ARE EITHER THE REPRESENTATIVE OF THE COMPANY WITH THE AUTHORITY TO ENTER INTO THIS AGREEMENT AND TO BIND THE COMPANY OR
353. me of the specific Windows service to which the Action Profile will apply.
- Action: Select one of the following actions: None, Start, Stop, or Restart.
8. If you want to send SNMP traps to a particular host, complete the following fields:
- SNMP Trap Host: The name of the host that monitors SNMP traps.
- SNMP Trap Port: The port number on the trap host to which the SNMP trap is sent.
- SNMP Trap Community: The name which acts as a password for sending trap notifications to the trap host.
- SNMP Trap OID (optional): The object identifier (OID) that identifies the SNMP trap.
9. If Splunk integration has been enabled and you would like the Action Profile to write to the Splunk log, complete the following fields:
- Splunk Hostname: The host name of the server on which Splunk is running.
- Logging Port: The port on which the Splunk server is listening for logging requests. This port is configured in Splunk, and you will need to contact the Splunk administrator for this information.
Click the Use SSL option to securely access the Splunk server using SSL. For more information on Splunk integration, see Splunk Settings on page 543.
10. Click Save.
Viewing Action Profiles
To view Action Profiles, do the following:
1. On the up.time tool bar, click Services.
2. In the Tree panel, click View Action Prof
354. me user who received the last alert
- the user name and email address of the person who acknowledged the alert
- the name of the Element and service monitor involved
- a comment relating to the alert or reason for acknowledgement
The following is a sample alert acknowledgement message: up.time Administrator (jsmith@myDomain.com) acknowledged the WARN status of File System Capacity, Web Server 2, with comment: Initial check of problem. More information to come.
In the up.time Web interface, the acknowledge icon changes to
CHAPTER 7: Overseeing Your Infrastructure
This chapter explains the Global Scan panel in the following sections:
- Overview
- Viewing All SLAs
- Viewing All Applications
- Viewing All Elements
- Viewing All Services
- Viewing the Resource Scan Report
- Viewing Scrutinizer Status
- Changing Reporting Thresholds
355. mediate: The number of times that a table lock is acquired immediately. For more information on table locks, see the Knowledge Base article SQL Server Locks.
- Table Locks Waited: The number of table locks waited that must be exceeded before up.time generates an alert. For more information on table locks, see the Knowledge Base article SQL Server Locks.
- Threads Cached: The number of threads in the thread cache that must be exceeded before up.time generates an alert.
- Threads Connected: The maximum number of clients that can be connected to the database at any one time.
- Threads Running: The number of threads that are running, which can be used to determine whether or not the database is becoming overloaded. If the database is overloaded, the monitor will report an increased number of running queries. However, you can have values that exceed this limit for very short times.
- QCache Queries in Cache: The number of queries in the query cache (QCache) that must be exceeded before up.time generates an alert.
- QCache Inserts: The number of queries added to the query cache. You should compare the value of the qcache hits to the total number of select queries to determine the current hit rate. You can increase or decrease query cache size to find the value which provides optimal performance.
- QCache Hits: The number of hits to the query cache qc
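The hit-rate comparison mentioned under QCache Inserts can be made concrete with a small calculation. This Python sketch assumes that the count of select queries covers only the SELECT statements that were not answered from the cache, so that hits plus selects gives the total number of lookups; that interpretation and the function name are assumptions made for illustration, not something the excerpt states.

```python
def query_cache_hit_rate(qcache_hits, select_queries):
    """Return the fraction of SELECT lookups answered from the query cache.

    Assumes select_queries excludes statements served from the cache,
    so the total number of lookups is qcache_hits + select_queries.
    """
    total = qcache_hits + select_queries
    if total == 0:
        return 0.0  # no lookups yet, so report a zero hit rate
    return qcache_hits / total


# Hypothetical counters: 300 cache hits and 100 SELECTs that ran normally.
rate = query_cache_hit_rate(300, 100)
```

Recomputing the rate after each change to the query cache size is one way to look for the value that gives the best performance.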
356. mesoftware com
[Figure: "Email Delivery Outages" chart, with a time axis at six-hour intervals in April.]
A Note About SLOs and Compliance
It is important to note the role an SLO plays regarding SLA compliance. SLOs exist to help you conceptually separate services into logical groups that make it easier for you to monitor, diagnose, and set performance goals for them. Although the descriptions of allowable downtime in the previous section implied that service downtime affects SLA downtime, it is more accurate to say that service downtime affects SLO performance, which in turn affects SLA downtime. SLO outages affect reported SLA compliance in the same way service outages affect SLO compliance: allowable downtime is reduced when any outage is experienced. This is also pertinent if you are scanning the Achieving statistic for an SLA Summary. This statistic can be viewed in the Service Level Agreement subpanel of My Infrastructure by clicking the Graphing tab, then clicking Current Status. You can verify how well or poorly an SLA is achieving its target, but you can also view how the component SLOs are
357. mmon template to ensure that the configuration of service monitors is the same across all monitors. For more information on services, see Using Service Monitors on page 135.
Understanding Service Groups
Service groups are service monitor templates that enable you to simultaneously apply a common service check to one or more hosts. Defining and using service groups greatly simplifies the task of initially setting up and maintaining common service checks that you wish to perform across many hosts in an identical manner. For example, you can create a service group called CPU Performance Check that is associated with 50 different servers, and apply a common performance monitor check to all 50 servers. With service groups, you save time by not having to manually re-create an individual service monitor with the exact same service check and Alert Profile for each server you want to monitor. There is no practical limit to the number or complexity of your service groups and the underlying service monitors associated with them. See Service Groups on page 153 for more information.
Understanding the Status of Services
up.time monitors can return the following statuses for a service:
- 0 (OK): The services are functioning properly.
- 1 (Warning): There is a potential problem with one or more of the services.
- 2 (Critical): There is a critical problem with one or
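The numeric statuses listed above map naturally onto an enumeration. The Python sketch below covers only the three values visible in this excerpt; the class name and the `worst_status` helper are hypothetical and are included only to show how numeric statuses can be compared.

```python
from enum import IntEnum


class ServiceStatus(IntEnum):
    """Statuses shown in this excerpt; further values may exist."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


def worst_status(statuses):
    """Return the most severe status in a collection (OK if it is empty)."""
    return max(statuses, default=ServiceStatus.OK)


# Three hypothetical service checks: the combined view is CRITICAL.
combined = worst_status(
    [ServiceStatus.OK, ServiceStatus.CRITICAL, ServiceStatus.WARNING]
)
```

Because the values increase with severity, the built-in `max` is enough to find the most serious status in a group.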
358. monitor, or if up.time cannot execute the service check.
- Alert on Recovery: The user receives an alert when the service recovers from an error (for example, an application, process, or service restarts, or a server reboots).
7. Click Save.
Viewing Distribution Lists
You can view the details of a Distribution List to ensure it is properly configured. The details of a Distribution List include an email address and the conditions under which alerts will be sent. To view Distribution Lists, do the following:
1. Click Users on the up.time tool bar.
2. In the Tree panel, click View Distribution Lists. A list of Distribution Lists appears in the Distribution Lists subpanel.
3. Click the name of the Distribution List that you want to view. The details of the group appear in the Distribution Lists subpanel.
Editing Distribution Lists
If you find that a Distribution List is not properly configured, you can edit that list. To edit Distribution Lists, do the following:
1. Do one of the following:
- Click the Edit icon beside the name of the Distribution List.
- Click the name of the Distribution List you want to edit, then click Edit Distribution List on the Distribution List Information page.
The Edit Distribution List window appears.
2. Edit the group as described in Adding Distribution
359. monitor information fields, see Monitor Identification on page 141.
Complete the following fields:
- Process Name (mandatory): The exact name of the process that you want to monitor. The name is the absolute name of the process, without its path, file extension, or any parameters. For example, on UNIX systems the process /usr/bin/vmstat -p is checked as vmstat, and on Windows systems process.exe should be entered as process.
- Process Occurrences: Enter the number of process occurrences for which you want to set Warning and Critical thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144.
- Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144.
3. To save the data from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside each of the metrics that you selected in step 3.
4. Complete the following settings:
- Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
- Alert Settings (see Monitor Alert Settings on page 148 for more information)
- Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
- Alert Profile settings (see Alert Profiles on page 381 for more i
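The rule for the Process Name field (no path, no file extension, no parameters) can be sketched as a small helper that derives the name from a full command line. The function name and the sample command lines are assumptions for illustration, and the sketch splits on whitespace, so it does not handle executable paths that contain spaces.

```python
import os


def monitored_process_name(command_line):
    """Strip the path, file extension, and parameters from a command line."""
    executable = command_line.split()[0]            # drop any parameters
    base = executable.replace("\\", "/").rsplit("/", 1)[-1]  # drop the path
    return os.path.splitext(base)[0]                # drop the file extension


# A UNIX-style command line and a Windows-style executable name.
unix_name = monitored_process_name("/usr/bin/vmstat -p")
windows_name = monitored_process_name("process.exe")
```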
360. more email addresses. Click the Email Address option, and then type the email address of the person to whom you want to send the report in the field. To send the report to multiple recipients, type their email addresses in the field, separated by commas or semicolons. For example: [Screenshot: the Generate Now options (User, Group, E-mail Address) with two addresses entered in the Email address field.] Reports that are sent by email have a file name that consists of the type of report and the date and time range it covers. For example, a CPU Utilization Ratio report might be named ReportCPUUtilizationRatio_2006-01-10_00-00-2006-01-10_14-53.pdf. If you choose to output the report to the screen, a message appears while the report is being generated. When the report has been generated, it is displayed in the report window. If up.time cannot connect to a host, the following error message appears in the report window: An error occurred while running this report. Verify the configuration of up.time and try again. Saving Reports If you find that you need to generate reports on a regular or frequent basis, you can save the parameters for the report to the DataStore. A link to the report appears in the My Portal panel. Click the link to gene
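As a rough illustration of the "separated by commas or semicolons" rule for the Email Address field, the sketch below splits such a field into individual addresses. It only shows the separator handling; how up.time itself parses the field is not documented here, and split_recipients and the example.com addresses are made up for the example.

```python
import re
from typing import List


def split_recipients(field: str) -> List[str]:
    """Split a recipient field on commas or semicolons, ignoring blanks.

    Illustrative only: no address validation is attempted.
    """
    parts = re.split(r"[;,]", field)
    return [part.strip() for part in parts if part.strip()]


print(split_recipients("admin@example.com, ops@example.com;oncall@example.com"))
```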
361. more information, see Adding Groups on page 105. Adding a Nested Group To add a nested group, do the following: 1. In the My Infrastructure panel, click Add Group. 2. Enter a descriptive name for the group in the Group Name field. 3. Optionally, enter a description of the group in the Group Description field. 4. Select the group with which the new one will be associated from the Parent Group dropdown list. 5. To give this nested group its own subgroups, select one or more entries from the Available Groups list, then click Add. 6. Select the Elements that you want to add to this group from the Available Elements list, and then click Add. 7. Select one or more sets of users who can view this group from the Available User Groups list, and then click Add. 8. Click Save. Editing Groups To edit groups, do the following: 1. In the Infrastructure panel, right-click the group you want to modify, then click Edit. The Edit Element Group window appears. 2. Edit the group as described in Adding Groups on page 105. 3. Click Save. To delete a group, right-click it, then click Delete; note that only empty groups can be deleted from the My Infrastructure panel. Working with Views Not every user that accesses the Monitoring Station needs to view
362. ms use this field to determine whether or not up.time will securely communicate with an agent installed on the system using SSL. Valid options are true and false. This field is optional. Authentication Method: For Net-SNMP systems, use this field to determine how encrypted information travelling between the Net-SNMP instance and up.time will be authenticated. Valid options are MD5 (a widely used method for creating digital signatures) and SHA (a secure method of creating digital signatures). Privacy Password: For Net-SNMP systems, the password that will be used to encrypt information travelling between the Net-SNMP instance and up.time. Privacy Type: For Net-SNMP systems, how information travelling between up.time and the Net-SNMP instance is encrypted. Valid options are DES (an older method used to encrypt information) and AES (the successor to DES, which is used with a variety of software, including SSL servers). Pingable: For nodes, use this field to specify whether or not up.time can contact the node using the ping utility. Valid options are true and false. WMI Domain: The Windows domain in which WMI has been implemented. WMI Username: The name of the account with access to WMI on the Windows domain. WMI Passw
363. must be exceeded for up.time to issue a Warning alert. • Network Retransmits Critical Threshold: The number of retransmits per second that must be exceeded for up.time to issue a Critical alert. • Network Retransmits Time Interval: The time interval, in minutes, at which up.time checks retransmits. 7. Complete the following settings: • Timing Settings: see Adding Monitor Timing Settings Information on page 148 for more information. • Alert Settings: see Monitor Alert Settings on page 148 for more information. • Monitoring Period settings: see Monitor Timing Settings on page 146 for more information. • Alert Profile settings: see Alert Profiles on page 381 for more information. • Action Profile settings: see Action Profiles on page 389 for more information. 8. Click Finish. Process Count Check The Process Count monitor measures the number of identical processes that are running on a system. If there is more than one instance of a process running, the check returns an OK status. If the process is not running, the check returns a Critical status. Configuring Process Count Check Monitors To configure Process Count Check monitors, do the following: 1. In the Process Count Check monitor template, complete the monitor information fields. To learn how to configure
364. n on page 148 for more information. • Alert Settings: see Monitor Alert Settings on page 148 for more information. • Monitoring Period settings: see Monitor Timing Settings on page 146 for more information. • Alert Profile settings: see Alert Profiles on page 381 for more information. • Action Profile settings: see Action Profiles on page 389 for more information. 4. Click Finish. External Check The External Check monitor captures asynchronous events. up.time does not actively monitor these events by polling or initiating service checks. Instead, External Check monitors rely on an external event to generate the information that the monitors capture. External Check monitors enable you to determine when to collect service data for the event that you specify. After you define an External Check monitor, the monitor runs a Perl script named extevent.pl. The script extevent.pl is included with up.time in the scripts subfolder. When it is run, the script connects to the port on which the server is listening. It then triggers the application on the server that generates the external event that is sent to up.time. The script extevent.pl has the following command line syntax: extevent.pl host Hostname port PortNumber status StatusNumber message messag monitorName name Wh
365. n on page 148 for more information. • Alert Settings: see Monitor Alert Settings on page 148 for more information. • Monitoring Period settings: see Monitor Timing Settings on page 146 for more information. • Alert Profile settings: see Alert Profiles on page 381 for more information. • Action Profile settings: see Action Profiles on page 389 for more information. 5. Click Finish. Oracle Advanced Metrics The Oracle Advanced Metrics monitor captures a number of performance tuning metrics for your Oracle database. Some Oracle metrics are for tuning devices for long-term performance gains rather than avoiding outages. This applies to the following probes: Buffer Cache, Data Dictionary Cache, Disk Sort Ratio, Library Cache, and Redo Log. You should schedule the monitor to gather data less frequently, perhaps every hour or every two days. Configuring Oracle Advanced Metrics Monitors To configure Oracle Advanced Metrics monitors, do the following: 1. In the Oracle Advanced Metrics monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: • Username: The user name that is required to log in to the database. • Password: The passwo
366. n list. In the Threshold field, specify the threshold for file system service time. Disk or file system service time is considered critical when it exceeds this threshold. 8. In the Percentile field, specify the percentage of time at which the service time for systems is below the threshold. The default is 95, which is the lowest service time that is greater than at least 95% of all of the recorded values in the time range that you specified in step 2. 9. If you want to include or exclude certain disks, enter the following in the Exclude Disks and Exceptions fields: • The name of the disk. • A regular expression. See Using Regular Expressions on page 442 for more information. You can enter one name or regular expression on a single line. 10. If you want to include or exclude certain file systems, enter the following in the Exclude File Systems and Exceptions fields: • The name of the file system. • A regular expression. See Using Regular Expressions on page 442 for more information. You can enter one name or regular expression on a single line. 11. To generate reports for groups of systems, select the groups from the List of Groups area. 12. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. 13. If you are generating reports for s
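The Percentile field is described above as "the lowest service time that is greater than at least 95" percent of the recorded values. The sketch below is one literal reading of that sentence, not the report's actual implementation; the function name and the sample data are made up for illustration, and other percentile conventions (such as nearest-rank) would give slightly different results.

```python
from bisect import bisect_left
from typing import Sequence


def percentile_service_time(samples: Sequence[float], percentile: float = 95.0) -> float:
    """Return the lowest sample that is strictly greater than at least
    `percentile` percent of all samples (a literal reading of the guide text).

    Falls back to the largest sample when no value satisfies the condition,
    for example when all samples are equal.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    ordered = sorted(samples)
    total = len(ordered)
    for value in ordered:
        below = bisect_left(ordered, value)   # number of samples strictly smaller
        if below * 100 >= percentile * total:
            return value
    return ordered[-1]


service_times_ms = list(range(1, 21))         # 20 hypothetical samples: 1..20 ms
p95 = percentile_service_time(service_times_ms)
threshold_ms = 15                             # hypothetical Threshold field value
print(p95, "critical" if p95 > threshold_ms else "ok")
```

Comparing the percentile value against the Threshold field, as in the last line, follows the rule that service time is considered critical when it exceeds the threshold.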
367. n Groups 347 Novell NRM system 86 OIDs 314 service groups 153 systems 67 user groups 342 views 108 VMware instance 79 advanced monitors 138 321 choosing 140 custom 139 324 external check 139 328 guidelines 323 overview 322 types 139 with retained data 139 326 agent monitors 137 File System Capacity 167 overview 166 Performance Check 170 Process Count Check 174 agentless monitors 138 agents up time software installing 40 Linux 42 pSeries 43 with HMC 43 without HMC 45 Solaris 41 UNIX 42 Windows 40 overview 13 Alert Profiles 381 editing 384 alert profiles custom formats 385 alerts acknowledging 112 applying to Applications 384 creating profiles 382 custom formats 385 editing profiles 384 monitor settings 148 overview 378 profiles 381 Application Availability report 456 application monitors ESX Workload 217 Exchange 194 IIS 200 Live Splunk Listener 238 Splunk Query 236 up time Agent 192 WebLogic 203 WebSphere 211 Application Web monitoring 223 Applications adding 101 applying Action Profiles 389 applying Alert Profiles 384 deleting 111 editing 103 maintenance 124 offline 124 status in Global Scan 124 viewing details 103 viewing in Global Scan 124 Archive Policy 545 auto discovery 74 ESX 76 pSeries with HMC 77 585 Index using 75 Cc cloning service monitors 151 Config Panel 9 527 Archive Policy 545 Global Scan thresholds 555 License Information 563 Mail Servers 534 Resourc
368. Splunk Query Splunk is a third-party search engine that indexes log files and data from the devices, servers, and applications in your network. Using Splunk, you can quickly analyze your logs to pinpoint problems on a server or in a network, or ensure that you are in compliance with a regulatory mandate or Service Level Agreements. You install Splunk on a server in your data center. When you click the Splunk icon beside the names of services that are in WARN or CRIT states in the My Portal panel, you will be taken to your Splunk search page. You can use the Splunk Query monitor to perform Splunk queries on log files to pinpoint an error condition. Before you can use a Splunk Query monitor, you must add some settings specific to Splunk to the file uptime.conf. See Splunk Settings on page 543 for more information. Configuring Splunk Query Monitors To configure a Splunk Query monitor, do the following: 1. Complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: • Splunk query: The Splunk query string that you want to use to search log files for an error condition. For example, entering the following query string: host mailServer sendmail error hoursago 2 will search log files that were generated for the system named mailServer for the words sendmail and error that
369. n of such destruction to Uptime. Uptime reserves the right to physically verify that the Software has been removed. Uptime may terminate this License Agreement if you breach any term of the Agreement by giving you written notice of your breach and Uptime's decision to terminate the Agreement. Upon termination by Uptime, you agree to either return the Software Documentation and all copies thereof and all license keys that you have obtained to Uptime or to destroy all such materials and provide written verification of such destruction to Uptime. 5. Remedies and Indemnification 5.1 If you learn of any actual or threatened infringement or piracy of the Software, or if any infringement or piracy claim is made against you by a third party in connection with your use of the Software, you shall notify Uptime in writing of the infringement, piracy, or claim as soon as is reasonably possible. Uptime shall, in its sole discretion, determine what action, if any, to take with respect to the foregoing, and shall assume the defense or bear the expenses of any such action, except to the extent, if any, to which such dispute or costs arise from your negligence, willful misconduct, or modification of the Software. In the event that the use of the Software in accordance with the provisions of this Agreement is declared by a court of competent jurisdiction to infringe the rights of any third party as your so
370. n setting up your Live Splunk, select the Run the shell script option on the configuration page. Then enter the path to liveSplunkHandler_v2.py, along with the script options, in the field. Using Splunk v3 or v4 Before you can monitor Live Splunks generated on a v3 or v4 Splunk server, you must do the following: 1. Edit the alertUptime.py script to point to the up.time Monitoring Station: • Navigate to the scripts directory on the Monitoring Station. • Open the file alertUptime.py in a text editor. • Find the following entry in the file: host uptime host port 9996 • Change the values for host and port to the host name and port of the Monitoring Station. • Save and close the file. 2. Edit the alertUptimeStatusHandler.sh script to configure how the Live Splunk is reported on the Monitoring Station: • Open alertUptimeStatusHandler.sh in a text editor (found in the scripts directory on the Monitoring Station). • For the message option, enter a diagnostic message that accompanies a Live Splunk captured by the up.time service monitor. • For the status option, enter the status of the service being monitored. • For the monitorName option, enter the name of the service monitor that is listening to the Live Splunk. • Save and close the file. 3. Copy the alertUptimeStatusHandler.sh and alertUptime.py scripts from the Monitoring Station's scripts directory to the data splunk bin sc
371. n the up.time tool bar, click Services. In the Tree panel, click Add Service Group. The Add Service Group window appears. Enter a descriptive name for this group in the Name of Service Group field. Optionally, enter a description of the group in the Description field. Click Continue. On the second Add Service Group screen, select one of the following options from the Available Services dropdown list: • All: View all of the services that are available. • The name of a host: If you are monitoring a large number of systems, this option enables you to filter the services based on the hosts that you have added to up.time. Select one or more services from the list, and then click Add. From the Available Element Groups list, select one or more existing groups to immediately associate with the service group, then click Add. Select the Include subgroups check box to ensure any nested groups are also included. For more information, see Adding Nested Groups on page 106. Select one of the following options from the Available Elements dropdown list: • All: View all of the hosts that have been added to up.time. • The name of a group: If you have grouped your hosts, this option enables you to filter the hosts based on the groups that you have added to up.time. The names of the hosts in the group appear below the dropdown list. If you ha
372. nalysis • Workload Top 10 Memsize: The top 10 processes that consume system memory, based on the total memory size of the processes, including virtual pages and shared memory. This information appears as a graph in the report. This graph does not appear when you generate a report for a VMware ESX system. 3. Optionally, click Select All to generate a report on all of the options listed above. 4. If you selected more than one report option and plan to report on more than one system, you can optionally click the Group report options by system checkbox. Selecting this option combines the metrics for each system for which you are generating the report. 5. To generate reports for systems in specific groups, select the groups from the List of Groups area. 6. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. 7. If you are generating reports for specific systems, select the systems from the List of Systems. 8. Select a report generation option. See Report Generation Options on page 402 for details. 9. If you want to save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information.
373. nce Windows. Click the Assign Maintenance to Service tab in the subpanel. In the Service Maintenance window, select a profile from the Maintenance profile dropdown list. If you have not created a Maintenance Profile, the message No profiles exist appears in the dropdown list. Optionally, from the dropdown list above the Available Service list, select a system that contains the services for which you want to schedule maintenance. From the Available Service list, select one or more services for which you want to schedule maintenance. Click Add, and then click Save. CHAPTER 9 Agent Monitors The agent monitors track the performance and health of the following: File System Capacity 167, Performance Check 170, Process Count Check 174. Overview Agent monitors are service monitors that require an agent to be installed on the system being monitored. An agent is software that collects performance information from the system and transmits that information to the Monitoring Station. Using the information gathered by an agent, up.time can alert users to changes in an environment based on defined thresholds. F
374. nd Critical Thresholds on page 144. • Web Mail Sends Per Second: The maximum number of messages that can be sent from the Exchange server each second. • Web Mail Auths Per Second: The maximum number of authorization requests that can be sent to the Exchange server each second. • SMTP Bytes Sent Per Second: The total number of bytes sent per second by the Exchange SMTP server. • SMTP Bytes Received Per Second: The total number of bytes received per second by the Exchange SMTP server. • SMTP Bytes Total Per Second: The total number of bytes of information passing through the Exchange SMTP server each second. • SMTP Local Queue Length: The number of messages in the SMTP queue that are scheduled for local delivery. • SMTP Messages Per Second: The maximum number of messages per second that are allowed by the SMTP server. • SMTP Inbound Connections: The number of incoming connections that the SMTP server allows. • SMTP Outbound Connections: The number of outbound connections that the server allows to all remote domains. • SMTP Connection Errors Per Second: The number of connection errors that occur per second. • Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. To save the d
375. ndicates that no performance data for the last 10 minutes exists for the Element. To avoid false positives, note that recently added Elements will have this status until 10 minutes' worth of performance data has been collected; also, in cases where the up.time Data Collector service is down for more than 10 minutes, all Elements will have this status until the service has been restarted and enough data has been collected. The thresholds for the service status indicators are typically 70% for a warning state and 90% for a critical state. These thresholds can be customized (see Changing Reporting Thresholds on page 134). The bar chart at the bottom left of the panel displays the number of service monitors that have moved from a normal (OK) to a critical (CRIT) status over the past 24 hours. up.time takes a data sample from the database for any new critical status services every 15 minutes and charts it on the bar chart. The number of services in each state appears in the graph. The pie chart at the bottom right of the panel visualizes the current availability of systems or devices. The services for unmonitored systems in groups are not shown in the pie chart. Viewing More Information You can view detailed information about an Element by clicking its name. To view the details of each metric (for example, CPU usage), click the n
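To make the typical warning and critical thresholds concrete, here is a minimal sketch of how a metric value could be mapped to a status. It assumes the 70 and 90 figures are percentages and that a value at or above a threshold triggers the state; neither detail is confirmed by the excerpt, and classify_status is a hypothetical name, not an up.time function.

```python
def classify_status(value_percent: float,
                    warning: float = 70.0,
                    critical: float = 90.0) -> str:
    """Map a metric (assumed to be a percentage) to OK, WARN, or CRIT.

    Assumption: reaching a threshold is enough to trigger the state. The
    defaults are the typical values quoted in the guide and can be changed.
    """
    if value_percent >= critical:
        return "CRIT"
    if value_percent >= warning:
        return "WARN"
    return "OK"


for cpu_usage in (42.0, 75.5, 93.0):
    print(cpu_usage, classify_status(cpu_usage))
```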
376. ndicating whether up.time was able to connect to the remote reporting instance. If an error message is displayed, correct your configuration and re-test it. Note that the modification of these values is one of a series of steps performed to correctly set up a remote reporting instance. Refer to the Knowledge Base article entitled Setting up a reporting instance for more information. User Interface Instance Settings A UI instance is an up.time installation that does not perform any data collection tasks and is primarily used for real-time monitoring and report generation. UI instances can divert traffic from a standard Monitoring Station implementation and are helpful when there are many up.time users who do not need to perform full administrative tasks. You can manually configure UI instance settings with the following uptime.conf parameters: • uiOnlyInstance=true: Determines whether the Monitoring Station functions only as a user interface instance. • uiOnlyInstance.monitoringStationHost=HOSTNAME: The host name or IP address of the up.time Monitoring Station that is performing data collection and to which this UI instance will connect. • uiOnlyInstance.monitoringStationCommandPort=9996: The port through which the UI instance can communicate with the data-collecting Monitoring Station. A Monitoring Station that is acting
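The three UI instance parameters above can be read together as "connect to this host on this port if UI-only mode is on". The sketch below shows that reading under stated assumptions: that uptime.conf uses key=value lines, that the keys are dot-separated as written here, and that lines starting with # are comments. The helper names and the sample host name are hypothetical.

```python
from typing import Dict, Optional, Tuple


def parse_conf(text: str) -> Dict[str, str]:
    """Parse key=value lines; blank lines and '#' comments are skipped (assumed format)."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if separator:
            settings[key.strip()] = value.strip()
    return settings


def ui_instance_target(settings: Dict[str, str]) -> Optional[Tuple[str, int]]:
    """Return (host, port) of the data-collecting Monitoring Station, or None
    when UI-only mode is not enabled."""
    if settings.get("uiOnlyInstance", "false").lower() != "true":
        return None
    host = settings["uiOnlyInstance.monitoringStationHost"]
    port = int(settings.get("uiOnlyInstance.monitoringStationCommandPort", "9996"))
    return host, port


sample = """
# hypothetical excerpt of uptime.conf
uiOnlyInstance=true
uiOnlyInstance.monitoringStationHost=monitor01.example.com
uiOnlyInstance.monitoringStationCommandPort=9996
"""
print(ui_instance_target(parse_conf(sample)))
```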
377. nel. 2. Double-click Administrative Tools, and then double-click Services. 3. In the Services window, find the following entries and click Start the service: • up.time Data Store • up.time Data Collector • up.time Web Server To restart the up.time services on Solaris or Linux, do the following: 1. At the command line, log into the Monitoring Station as user root. 2. Type the following command to start the database: /etc/init.d/uptime_datastore start 3. Type the following command to start the Data Collector: /etc/init.d/uptime_core start 4. Type the following command to start the Web server: /etc/init.d/uptime_httpd start Interfacing with up.time Some of the Monitoring Station's features require integration with other elements that make up your infrastructure. In some cases configuration is mandatory (e.g., an SMTP server will need to have been set at the time of installation), while in others it is required only when particular up.time features are used (e.g., using the Web Application Transaction monitor requires you to provide up.time with your proxy server settings). The following sections outline how to configure up.time to communicate with servers and databases. Database Settings The database settings determine how up.time communicates with the DataStore. The following are the database settings in the uptime.conf file
378. ... 137, The Monitor Template 141, Cloning Service Monitors 151, Testing Service Monitors 152, Service Groups 153, Changing Host Checks 156, The Platform Performance Gatherer 157, Topological Dependencies 159, Scheduling Maintenance 161. Overview A service monitor is an up.time process that checks the performance and availability of services in your environment at regular intervals. If the monitor detects a problem, up.time issues an alert. Before you configure a service monitor, you should determine the following: • the host name of the system that you want to monitor • when you want alerts to be sent • the action that will be taken to fix the problem • when the monitor should be run If you have tool tips enabled (see page 339 for more information), the graphic that appears in the Service Instances panel is a clickable image map. [Image map labels: Add Service Monitors to a system; Add Alert Profiles to notify users; Group Users together to receive alerts; Add Users to receive alerts]
379. nformation. Using the Solaris Mutex Exception Report The following is an example of a Solaris Mutex Exception report: Solaris Mutex Exception. Date Range: 2008-04-16 00:00:00 to 2008-04-16 17:55:11, between the time range 00:00 to 23:59. Multi-CPU systems with an Average SMTX over 150.0 are highlighted. System Name / Number of CPUs / Average SMTX: Opteron opteron / 2 / 158.19; The Zones vmh tik a / 24 / 0.83. The number of mutex stalls for the first system in the list exceeds the threshold that was set when the report was defined. Based on this information, you can generate one of the following graphs to get a better idea of the performance of the CPUs on the system: • Multi-CPU Usage (see page 495 for more information) • Run Queue Length (see page 493 for more information) • Run Queue Occupancy (see page 493 for more information) From there, you determine how to best reduce the queue size to improve performance. Bandwidth Report The Network Bandwidth report keeps track of the amount of data moving in and out of each network interface on a system. This report helps you identify or confirm that specific systems are being overloaded based on the amount of data they are sending or receiving; such systems could become bottlenecks for the whole network. The amount of data moving through each interface is measured in megabytes. However, the following systems store data as pac
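The highlighting rule in the sample report (multi-CPU systems whose Average SMTX is over 150.0) can be sketched as a simple filter. The CPU counts and SMTX averages below are the two rows from the sample report; the system labels system-a and system-b are placeholders because the names in the excerpt are garbled, and the function is illustrative rather than the report's actual logic.

```python
from typing import List, NamedTuple


class SmtxRow(NamedTuple):
    system: str          # placeholder label, not the real system name
    cpu_count: int
    average_smtx: float


def highlighted_systems(rows: List[SmtxRow], threshold: float = 150.0) -> List[str]:
    """Return systems with more than one CPU whose average SMTX exceeds the threshold."""
    return [row.system for row in rows
            if row.cpu_count > 1 and row.average_smtx > threshold]


report_rows = [
    SmtxRow("system-a", 2, 158.19),
    SmtxRow("system-b", 24, 0.83),
]
print(highlighted_systems(report_rows))
```

Only the first row is returned, which matches the statement that the first system in the list exceeds the threshold.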
380. nformation. • Action Profile settings: see Action Profiles on page 389 for more information. 5. Click Finish. CHAPTER 10 Microsoft Windows Monitors The Microsoft Windows monitors track the performance and health of the following: Windows Event Log Scanner 178, Windows Service Check 182, Windows File Shares SMB 185, Active Directory 187. Windows Event Log Scanner The Windows Event Log Scanner alerts on specific entries in a Windows log file. This monitor searches through events based on text strings as well as the log and error type. When the monitor runs with WMI-based collection, events are retrieved in 15-minute batches; with agent-based collection, the number of events retrieved is user defined. To prevent false positives, the monitor ignores log entries that are older than when it was last run. To avoid performance degradation, the maximum number of log entries, which has a default of 1,000, is 10,000 lines. Configuring Windows Event Log Scanner Monitors To configure Windows Event Log Scanner monitors, do the following: 1. In the Windows Event Log Scanner monitor template, complete t
381. ng: 1. In the Tree panel, click View User Roles. 2. Click the name of the user role that you want to edit, and then click Edit User Role in the Users subpanel. The Edit User Roles window appears. 3. Edit the user role information as described in the section Adding User Roles on page 334. Working with Users Users are the individuals who have access to up.time and its various functions. You can grant permissions to users to do any or all of the following: • view information about specific systems in your environment • generate and save reports about specific systems • receive alerts Adding Users To add users, do the following: 1. In the Tree panel, click Add New User. The Add User window appears. Type a name for the user, which will be used to log into up.time, in the Username field. If you are using Active Directory or an LDAP directory to authenticate up.time users, the user name you input should be identical to the user's name in the central directory. If AD or LDAP is enabled for user authentication, leave the Password field blank; otherwise, enter a password that will be stored in the up.time DataStore. If using an AD or LDAP directory to authenticate users, up.time will refer to the directory for password information during user login. For more information, see Changing How Use
382. ng 334 editing 336 overview 334 viewing 335 Users adding 337 configuring 333 Distribution Lists 344 editing 340 Notification Groups 347 overview 337 roles 334 viewing 340 Users panel 8 V viewing Action Profiles 395 all Elements 127 detailed process information 524 Distribution Lists 345 up time 5 User Guide f up time Notification Groups 348 Service Check 182 Quick Snapshot 490 SMB 185 report logs 411 Windows Service Check monitor 182 Resource Scan 130 WMI 81 536 scheduled Maintenance Profiles 162 agentless system adding 82 service information 52 workflows 389 391 services 129 Workload graphs 505 system information 50 Workload Top 10 graphs 508 system status 489 user groups 342 user roles 335 users 340 views adding 108 adding nested 109 deleting 111 virtual infrastructure density 473 VM density 473 VMware 79 432 Instance Motion graph 523 VMware instance 79 VMware vCenter Orchestrator 389 391 539 VMware Workload report 470 VXVM Stats graph 519 Ww Wait I O report 423 warning threshold 144 Web application transaction monitoring 223 WebLogic monitor configuring 207 209 metrics 204 Weblogic monitor 203 WebLogic report using 468 WebSphere monitor 211 configuring 215 counters 211 WebSphere report 463 using 465 Windows Event Log Scanner monitor 178 Windows File Shares monitor 185 Windows monitors Active Directory 187 Event Log Scanner 178 up time software 593
383. ng Settings on page 146 for more information. • Alert Profile settings: see Alert Profiles on page 381 for more information. • Action Profile settings: see Action Profiles on page 389 for more information. Click Finish. Windows File Shares SMB The Windows File Shares SMB monitor can check the availability of file shares on a Windows server. If a file share is not available, the status of this monitor becomes critical and up.time sends an alert. Configuring Windows File Shares SMB Monitors To configure Windows File Shares SMB monitors, do the following: 1. In the Windows File Shares SMB monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. Complete the following fields: • Username: The user name that is required to log in to the file share. The value entered can include the file share domain if input in one of the following formats: <domain>\<username> or <domain>/<username>. • Password: The password that is required to log in to the file share. • Shares: The names of file shares that you want to monitor on a host system. Specify the name of the file share, for example, Main. To specify multiple file shares, add a comma between the names, for example, Main,home. To check all of the fi
384. Defining and Managing Your Infrastructure: Working with Views. 7. To add previously defined groups of users, select one or more entries from the Available User Groups list, then click Add. 8. Click Save. Editing Views: To view and edit views, do the following: 1. In the Infrastructure panel, right-click the View you want to modify, then click Edit. The Edit View window, which contains system and user information, appears. 2. Edit the view as described in Adding Views on page 108. 3. Click Save. Deleting Elements, Applications, and Views: If you have administrator privileges, you can delete an Element, Application, or view in the Infrastructure panel. To delete a system, network device, Application, or view, do the following: 1. Locate the system, network device, Application, or view that you want to delete in the Infrastructure panel. 2. Right-click the Element, then click Delete. 3. In the dialog box that appears, click OK. Acknowledging Alerts: When a problem occurs on a system that up.time is monitoring, the Monitoring Station sends alerts. These are notifications about the problem, sent to users who are qualified to receive them. If the user role to which they belong is configured to do so the
385. nnnnnnnnnnnnnn 48 pg Setting Up the Administrator Account vrnnnnvnnnnnnnrrennnnvrrrnnnnnvrennnnnnn 48 o Accessing Up UMa EEE EE 49 Oo Exiting UPU O sissi iren a R T REN 49 Viewing System and Service Information aarnnnnnnnnnnnnn 50 7 Viewing System Information rrrrvrnnnnrrrennnnvrrrrnnnnnvrrnnnnn nr nnnnnnv renn 50 Viewing Service Information rrrvnnnnnvrrnnnnnvrrrnnnnnrnnnnnnnnrnennnnvnnnnnnn 52 Searching and Filtering a asxnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnr 57 Using the Search BOX ser 57 Filtering Service Instances aker aebe 58 Audit Logging sissrssaissssasinaannnsnoandnnnodninadnvindaasnnenaankaanninaii 60 Enabling the Audit Log is incsesesosessadsbenosiayasnaringachynsgavucsesrusbieudeesseeduude 60 Using My Portal STE saisrrasnnnuisunnad sinnnksanunandina nannaa diasaan dinaa 62 PG SNS ANG pet 62 Why PICTOVCNCOG ua se Jises desman Se caastedete sacusdecedd ganvnctteesacdeacsabo mu 63 Latest up time ANDES scsccis ccccassscataccueteceugdasuadensdeativeseannccanceseeusicaceee 63 up time Information erannvrvennnnrrrrnnonvrrnnnnnnvrnnnnnnrrrennrrrnrnnnennvnennennn 63 MY AGS vepsens de 63 Savod Repons REE NE EN ME 64 Custom DAT DON unne 64 Defining and Managing Your Infrastructure OV GR VI EE EEE EE 66 Working with Systems arnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 67 Adding Systems or Network Devices rrrrvrnrnnrvrnnnnnvvrvrnrnnnrrvnnnnvnnr 69 up time software vii viii
386. nnnnnnnnnnnnnnnr 566 Time Period DefinitiOnS aa narnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnr 567 Building BIOCKS Hanssens 567 BASIC BESS 569 Combining Expressions and Excluding Time Periods 572 End User License Agreement NOTICE TO USER i ccicissstttcttetcctectececcccasesenseneecenenenncnnsnsass 576 Ve LICENSE ae teavebasnecelnsesndaeidices Suanteddeaente 576 2 Intellectual Property and Confidentiality oornrrvnnnnnnrrnnnnnnn 578 3 LICENSE FEGS vs iinicacccsovieactnecssnerdVousstiurdaseeatevendeatacdvansdvtecwiactinies 579 4 Term and TOSiinAatiON 0ccccccceeececcecescecencnsesceecnensceseanesceseaanes 580 5 Remedies and Indemnification vvrvrvrvnrnnrnrrvvrvrrrrnrnnnnvnrvnvvvenr 580 6 DIG ET EE 581 7 Limitation of Liability taxi ciecci te cecoteaeecaceces seneenersaezateetanensaameanonieces 581 8 General Terms crni Secsevecest eievtdaccta ges tvaseidewciens 582 up time software xxiii Index Xxiv up time 5 User Guide uptime CHAPTER 1 Welcome to up time This chapter introduces up time in the following sections Introducing up time ssssssnnannnsnnannnnnnnnnnnnnrnnnnnnnrnnnnnrnnne 2 up time Architecture rrrrrrrrrrrrrnnnnnnnnnnnnnnnnnnnnnnnennnnner 3 up time Service Monitoring Concepts araruuunnnnnnvnanannnnnner 4 Welcome to up time Introducing up time Introducing up time up time monitors manages and reports on systems network devices and applicati
387. nnnnnnvnnennnrrrnnnrnvnnnnnen 461 Reports for J2EE Applications anrnnnnnnnnnnnnnnnnnnnnnnnnnnnnnr 463 WebSphere MP ae 463 WebLogic RENO L e etan 466 Reports for Virtual Environments rannnnnnnnnnnnnnnnnnnnnnnnnr 470 VMware Workload Report rnrnnnvvnnnnnnronnnnnnvrrennnrnrvnnnnonnnnnnnnnnnen 470 VMware Infrastructure Density Report rrrrnnnnnnnnnvrvrnnnnnnnnnnvvnnnr 473 LPAR Workload Report Lupknenodedarreeadmme 475 Understanding Graphing Graphing n up tiMe ssssssssssunnsunnnunnnnnnnnnnnnnnnnnnnnnnnnnn nnn 480 Graphing Bl Lao REE EE EE EEE 481 Using the Graph Editor s asranvnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnr 482 Working with Trend Lines ccccccsccccesesscceeeesssceeeeesneeeseseeaeeeeeeeaaee 484 Formatting Individual Graph Elements evrnnnvrvernnnvrrennonvrnnnnrn 485 Exporting Graphs RE EEE EE 486 Changing the Look and Feel of a Graph rannvennnvvnnanvnnnnneverene 486 Using Graphs SE TT EEE EE ee 488 up time software xix XX UNIX vs Windows Performance Monitoring rrvrnnnnnnnnnnrrrnnnnn 488 Viewing the Status of a System rasrnnnnnnnnnnnnnnnnnnnnnnnnnnnn 489 Viewing Quick Snapshot smm v smersimernsreennssnetsnmmnsssoomnsakenenen 490 Monitoring CPU Performance naxunnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnr 491 CSAS RE EEE EE NE tees 491 Run Queue LEN ann 493 Run Queue Occupancy eee 493 Generating a CPU Performance Graph sssssesseeeieeenernnennes 494 Multi
388. nnnvnnvnnnnnnvnnvnnennnnnuvnnnnn 370 z 3 Adding and Editing SLA Definitions anrnnnnnnnnnnnnnnnnnnnnr 371 o Adding a Service Level Agreement arnrrnnnnnnnnvvnnnnnnrrnnnnnnvrnennnn 371 D Adding Service Level Objectives to an SL rrrrrrrrnnrrrnnnnnvrrnnnnnn 313 Associating Alert and Action Profiles to an SL rrrrrrrnnnrrrnrenr 374 Alerts and Actions Understanding Alerts annannnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnr 378 Understanding the Alert FIow mmerssmisusrkessnearsvndennneskersannane b nn nn 379 Alert Profil s wcccccsssscceccacccccssssssascnecacceasesenesananasdennacsoasace 381 Enabling the Windows Messaging Service rnnnnrnnnnnnrrrnnnnnvnnnnnrn 381 Creating Alert Profiles nooonnnennnnenannnnnoennnnnenenennennnnneennneneenenene 382 Viewing Alert PIOIOS soscenereasesue seecetauadanesss seeds ereaceadesdageninekesdevcadae 383 Editing Alert 0 asics cvicvateica tien hcutmsndesantechncessauceisgebneeieesudiels 384 Associating Alert Profiles to Elements rrrrnnrvvrnnrnnnrrrnnrrvvrrnnnrn 384 Working with Custom Alert Formats nnnnnnnnnnnnnnnnnnnnnnn 385 Custom Alert Format Variables rrrrrrrnnnnvrrnnnnnnnrnnrnnnrrnnnnnr renn 386 Action PROMS sicacicccasecatancctisdeacnadateaniacatanaesntusnandadeanenacs 389 VMware vCenter Orchestrator Workflow Actions rrrrrnnrrrnnnr 389 SNMP Trap Actions ai vaseegossechadeayoancsesudayendupedanindtonewedespedaiuatuadoseee 390 Creating Action
389. not a system is short of memory, up.time checks whether or not the pgscan rate and page-out statistics are consistently high. Use the following equation to calculate the scan rate threshold: scan threshold = handspreadpages / residence time. The handspreadpages variable is fixed at 8192 on UltraSPARC systems with more than 256 MB of memory. The residence time variable is generally fixed at 30 seconds. Therefore, the default scan rate threshold is 8192 / 30, or approximately 273. You should also examine the swap device for excessive activity. To identify the device, check the file /etc/vfstab for the tmpfs file system. You can also use the swap -l command to list the physical partitions that are being used for swap on the system. When a program requires more memory than is physically available, information that is not being used is written to a temporary buffer on the hard disk, called swap. The Free Swap graph charts the amount of free swap space as a percentage of the total swap space. Microsoft Windows writes data to the Windows Page File when it needs additional memory. The Windows Page File can range in size from 20 million bytes to over 200 million bytes. The Paging File Total Usage performance counter extracts page file information. On Solaris, swap space is separated into: Physical swap space: The actual space on a disk available for swapping. Virtual swap space: The amount of physical swap space and the amount of memory
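The scan rate threshold described in the excerpt above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not up.time code: the function names and the "every sample above the threshold" reading of "consistently high" are assumptions made for the example.

```python
def scan_rate_threshold(handspreadpages: int = 8192,
                        residence_time_s: float = 30.0) -> float:
    """Return handspreadpages / residence time, in pages per second.

    The defaults are the values quoted in the guide: 8192 pages and a
    30-second residence time.
    """
    if residence_time_s <= 0:
        raise ValueError("residence time must be positive")
    return handspreadpages / residence_time_s


def is_short_of_memory(scan_rates, threshold: float) -> bool:
    """Hypothetical helper: treat 'consistently high' as every sample
    exceeding the threshold. An empty sample set is never a shortage."""
    samples = list(scan_rates)
    return bool(samples) and all(rate > threshold for rate in samples)


threshold = scan_rate_threshold()
print(f"default scan rate threshold: {threshold:.0f} pages/s")
```

With the default values the threshold works out to roughly 273 pages per second. A single spike in the scan rate would not count as a shortage under this sketch; only a sustained run of high samples would.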
390. ns: The string returned by the monitor contains the string that you defined. does not contain: The string returned by the monitor does not contain the string that you defined. If you select a method from the dropdown list and either enter an incorrect value in the field or do not enter a value, an error message appears and you cannot save the monitor. If you do not want to specify a comparison value, do not select an option from the Select a comparison method dropdown list. Configuring Warning and Critical Thresholds: In many instances, you must configure Warning and Critical thresholds to determine the conditions under which up.time issues an alert. For example, if hard disk usage on a server reaches 85%, up.time issues a Warning alert. If disk usage reaches 95%, up.time issues a Critical alert. To configure Warning and Critical thresholds, do the following: 1. Enter the threshold value in the text box next to the Select a comparison method dropdown list. 2. Select an option from the Select a comparison method dropdown list. Response Time: The Response Time setting denotes the amount of time that a monitor requires to: initiate a service check; transmit a request to a local or remote system, or to a service; collect service information; return the collected information to the Monitoring Station; display the information on the Monitoring Station. Many factors c
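The Warning and Critical threshold example in the excerpt above (85% and 95% disk usage) can be illustrated with a short Python sketch. The function name and the status labels OK, WARN, and CRIT are chosen for the example, and "greater than or equal to" is an assumed comparison method, since the guide lets you pick the method from a dropdown list.

```python
def classify(value: float, warning: float, critical: float) -> str:
    """Map a metric value to a status using two thresholds.

    Assumes a 'greater than or equal to' comparison; other comparison
    methods would need their own branch.
    """
    if warning > critical:
        raise ValueError("warning threshold must not exceed critical threshold")
    if value >= critical:
        return "CRIT"
    if value >= warning:
        return "WARN"
    return "OK"


# Disk usage percentages checked against the 85% / 95% example thresholds.
for usage in (40, 85, 92, 95):
    print(usage, classify(usage, warning=85, critical=95))
```

The Critical test comes first so that a value crossing both thresholds is reported at the more severe level.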
391. ns in a hubbed environment, or handshake errors between a system and a switch. Page Scanning Statistics: The number of file system pages scanned by the page scanning daemon. This information appears as a graph in the report. Workload Top 10 (CPU): The top 10 processes that are consuming CPU time, grouped by user ID, group ID, and process name. This information appears as a graph in the report. This graph does not appear when you generate a report for a VMware ESX system. Multi-CPU: The percentage of total CPU time that is being used on systems with more than one CPU. CPU Performance Graph: Tracks the performance of a system's CPU over a specified time period. This information appears as a graph in the report. TCP Retransmits: Any network services that may not be completing properly because of undue network or system load. This information appears as a graph in the report. Disk Statistics: The following statistics for each disk on a system: percentage of the disk that is busy; average queue length; number of reads and writes per second; number of blocks being accessed per second; average wait time, in seconds; average service time, in seconds. If the system for which you are creating a report has multiple disks, a graph for each disk on the system is generated.
392. nstalling third party applications on a system. Every Windows service has one of the following states, which control how the services are launched or prevented from launching: Disabled: Services that are installed but not currently running. Set to manual: Services that are installed but will start only when another service or application needs their functions. Set to automatic: Services that are started by the operating system after device drivers are loaded at boot time. Configuring Windows Service Check Monitors: To configure Windows Service Check monitors, do the following: 1. In the Windows Service Check monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: Service Name (Mandatory): You can find the names of all available Windows services, their states, and their status in a service property window by doing the following: On the Windows desktop, right-click My Computer and select Manage. Click Services and Applications, and then click Services. Double-click the name of the service that you want to review. If you enter the name of a service that does not exist, or mistype the name, the monitor changes the status of the service to Critical. Service Status (Mandatory): Select a comparison method fr
393. nt based Elements whose data collection method is to be changed to WMI. 5. Click Convert to WMI. When the conversion is complete, the lists of agent-based and WMI Elements will be refreshed to reflect the changes. Converting Multiple Elements to Agent-Based Data Collection: To change multiple WMI Elements to use the up.time Agent for data collection, do the following: 1. Ensure a global up.time Agent configuration exists (see Configuring Global WMI Credentials on page 536 for more information). 2. On the up.time tool bar, click Config. 3. In the tree panel, click Bulk Element Conversion. 4. In the WMI Elements section, select the check boxes that correspond to the WMI Elements whose data collection method is to be changed to the up.time Agent. 5. Click Convert to Agent. When the conversion is complete, the lists of agent-based and WMI Elements will be refreshed to reflect the changes. For bulk WMI-to-agent conversions, the port used by all of the converted up.time Agents must match the port specified in the global agent configuration. Novell NRM Systems: up.time collects performance metrics and availability information from version 6.5 of the Novell Remote Manager (NRM) using HTTP or HTTPS. up.time extracts performance information from the NRM by reading and parsing XML files. Adding a N
394. nt monitors do the following 1 In the Uptime Agent monitor template complete the monitor information fields To learn how to configure monitor information fields see Monitor Identification on page 141 Complete the following options by clicking the checkbox beside each option then specifying a warning and critical threshold If the thresholds that you set are exceeded then up time generates an alert For more information see Configuring Warning and Critical Thresholds on page 144 e Major The major version number of the agent For more information see Understanding Major and Minor Versions on page 13 e Platform The operating system on which the agent is installed and running e Response Time Enter the Warning and Critical Response Time thresholds for the length of time a service check takes to complete For more information see Configuring Warning and Critical Thresholds on page 144 To save the data from the thresholds for graphing or reporting click the Save for Graphing checkbox beside each of the metrics that you selected in step 3 Complete the following settings e Timing Settings see Adding Monitor Timing Settings Information on page 148 for more information up time 5 User Guide D up time Uptime Agent e Alert Settings see Monitor Alert Settings on page 148 for more information Monitoring Period settings see Monitor Timing Settings on page 146 f
395. ntage of the maximum file size against which the monitor will check data files and log files. Full Critical Threshold (Mandatory): Enter a value that will change the status of the Oracle Tablespace Check from OK to Critical. The critical threshold should be a percentage of the maximum file size against which the monitor will check data files and log files. Response Time: Enter the Warning and Critical Response Time thresholds for the length of time that a service check takes to complete. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. 4. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information); Alert Settings (see Monitor Alert Settings on page 148 for more information); Monitoring Period settings (see Monitor Timing Settings on page 146 for more information); Alert Profile settings (see Alert Profiles on page 381 for more information); Action Profile settings (see Action Profiles on page 389 for more information). 5. Click Finish. Database Monitors: SQL Server Basic Checks. SQL Server Basic Ch
396. n the Please Choose a Folder field, type the name of the directory where you want to install the DataStore, and then click Next. This should be the full path to the DataStore. Because the DataStore can grow very large (in excess of 100 GB), you can install the DataStore in another folder on the file system if you are monitoring a large number of systems and retaining data for extended periods. Click Choose and select a directory from the Browse for Folder window. 9. Do one of the following to specify the basic up.time configuration information: Click Next to accept the defaults. Enter information in the following fields: Email address: The email address from which the Monitoring Station will send alerts and reports to users. DataStore Port: The number of the port on which the DataStore (the up.time database) will listen for requests. The port number is written to the file uptime.conf. Web Server Name: The name of the computer that is hosting the Web server. This name is written to the file httpd.conf, which contains configuration information for the Web server used by up.time. Web Server Port: The number of the port on which the Web server for the Monitoring Station will listen for requests. The port number is written to the file httpd.conf. 10. Select an option for setting up icons in the Windows Start me
397. ntification on page 141. 2. Complete the following NIS/YP monitor settings: YP/NIS Domain: The domain of the NIS service, for example uptimesoftware.com. NIS administration databases that contain name service information are called maps. A domain is a collection of systems that share a common set of NIS maps. YP/NIS Table: The name of the NIS/YP table that contains the values for which you want to search. Key: Enter a value you want to search for in the NIS table. For example, the key is jsmith in the following string returned from a NIS table: jsmith:LLZDusFe5Da3s:20080:100:Jim Smith:/export/home/jsmith:/bin/sh Lookup: The Lookup value associated with the value in the Key field. For example, the following is returned from the passwd table of a NIS database based on the key jsmith: jsmith:LLZDusFe5Da3s:20080:100:Jim Smith:/export/home/jsmith:/bin/sh Response Time: Enter the Warning and Critical Response Time thresholds for the length of time that a service check takes to complete. For more information, see Configuring Warning and Critical Thresholds on page 144. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. Complete the following settings: Timing Settings: see Adding
398. nu, and then click Next. On the Install Summary screen, review the installation options that you selected, and then do one of the following: Click Previous to change the settings. Click Install to begin the installation process. The installation process will take several minutes. When the software is installed, click Next. The following occurs: The Web server, DataStore, and Data Collector are installed. The Web server and DataStore are started. The DataStore is populated with default data. The Data Collector is started. On the Install Complete screen, click Next. Click Finish. Installing the Monitoring Station on Solaris or Linux: Installation on Solaris or Linux is done at the command line. In addition to installing the up.time application, the installation process attempts to create the uptime user ID, which runs applications in non-privileged mode. If the user ID already exists, the installer will use that account. Installing the Monitoring Station: To install the up.time Monitoring Station on Solaris or Linux, do the following: 1. If you are upgrading, ensure you have logged out of the up.time Web application by clicking the Logout button. Ensure you have logged in to the Monitoring Station system as root. up.time may not function properly if the Monitoring Station is installed when you are logged in as a domain or non loca
399. [Screenshot: Email Delivery monitor status message] Speculation based on the status message can be confirmed using a Service Metrics graph for the Email Delivery monitor's system. This graph indicates whether the delivery and retrieval times are within acceptable limits (below left) or if one or both are unusually long (below right). [Figure: two Service Metrics graphs for Mail Server mailhost, plotting Response Time, Retrieve Time, and Delivery Time in ms (thousands)] To generate a Service Metrics graph, either select the system to which the Email Delivery monitors are associated in My Infrastructure, or the monitor itself in the main Services panel. Click the Graphics tab, then click Service Metrics. Even if the Serv
400. o a custom Web page and indicating which User Group will be able to view it. You can enable and configure the first dashboard through the following parameters: myportal.custom.tab1.enabled=true myportal.custom.tab1.name=<DashboardNameOnTab> myportal.custom.tab1.URL=<URLtoCustomPage> myportal.custom.tab1.usergroups=<UserGroupName> Values for the first three parameters are required. If no name is specified for the User Group parameter, or if no User Groups have been defined, the custom dashboard will be visible to all up.time users. Thus, a User Group parameter is only required if you want to restrict or refine user access to a particular custom dashboard. To create additional tabs, add the same set of parameters but increment the tab count: myportal.custom.tab2.enabled=true myportal.custom.tab2.name=<DashboardNameOnTab> myportal.custom.tab2.URL=<URLtoCustomPage> License Information: If your up.time package did not come with a license key, either contact your sales representative to request a key or send an email to support@uptimesoftware.com. You will need the host ID for the system so that a permanent license key can be generated. The host ID is displayed in the License Information subpanel and is similar to the following: 001110bf101d. You do not need the host ID if you are evaluating up.time. The demo licenses expire af
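The custom dashboard parameters described in this excerpt follow a regular pattern, so a small Python helper can generate the lines for any tab number. This is a sketch only: the function name is hypothetical, the dotted key=value spelling is an assumption about how the parameters are written in the configuration file, and the dashboard name and URL in the usage lines are placeholders.

```python
def custom_tab_lines(index, name, url, usergroups=None):
    """Build the parameter lines for one custom dashboard tab.

    The usergroups line is emitted only when a group is given, because the
    guide says the dashboard is visible to all users when it is omitted.
    """
    if index < 1:
        raise ValueError("tab index starts at 1")
    prefix = f"myportal.custom.tab{index}"
    lines = [
        f"{prefix}.enabled=true",
        f"{prefix}.name={name}",
        f"{prefix}.URL={url}",
    ]
    if usergroups:
        lines.append(f"{prefix}.usergroups={usergroups}")
    return lines


# Placeholder values, not taken from the guide.
for line in custom_tab_lines(2, "Capacity", "http://example.invalid/capacity.html"):
    print(line)
```

Incrementing the index argument mirrors the guide's instruction to increment the tab count for each additional dashboard.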
401. o configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: WebSphere Port: The number of the port on which WebSphere is listening. The default is 9080. Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Optionally, click the Save for Graphing checkbox beside the Response Time option to save the data for a metric to the DataStore, which can be used to generate a report or graph. 4. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information); Alert Settings (see Monitor Alert Settings on page 148 for more information); Monitoring Period settings (see Monitor Timing Settings on page 146 for more information); Alert Profile settings (see Alert Profiles on page 381 for more information); Action Profile settings (see Action Profiles on page 389 for more information). 5. Click Finish. ESX Workload: The ESX Workload monitor collects a set of metrics from all of the instances that are running on an ESX v3 or v4 server over a specified time period. The monitor then compares the
402. o match information e Attribute EL The attribute or information for which you want to search in your LDAP directory An LDAP entry consists of a set of attributes Each attribute has a type which describes the kind of information contained in the attribute and one or more values which contain the actual data For example the entry jsmith inter net has the Attribute value jsmith inter net The Attribute type is e mail Response Time Enter the Warning and Critical Response Time thresholds For more information see Configuring Warning and Critical Thresholds on page 144 2 pr Oo Ww 92 Oo Oo Oo 2 3 Click the Save for Graphing checkbox to save the data for a metric to the DataStore which can be used to generate a report or graph up time software 293 Network Service Monitors LDAP 4 Complete the following settings Timing Settings see Adding Monitor Timing Settings Information on page 148 for more information e Alert Settings see Monitor Alert Settings on page 148 for more information e Monitoring Period settings see Monitor Timing Settings on page 146 for more information e Alert Profile settings see Alert Profiles on page 381 for more information e Action Profile settings see Action Profiles on page 389 for more information 5 Click Finish 294 up time 5 User Guide D up time NFS NFS NFS Net
403. o the maximum extent permitted by law you hereby consent to the jurisdiction and venue of such courts and waive any objections to the jurisdiction or venue of such courts To the extent any terms and conditions on a purchase order or other ordering document submitted to Uptime by you conflicts with the terms of this Agreement the terms of this Agreement shall control and notwithstanding any term of your order which states to the contrary up time 5 User Guide d up time NOTICE TO USER 8 2 Severability If any term or provision of this Agreement is declared void or unenforceable in a particular situation by any judicial or administrative authority this declaration shall not affect the validity or the enforceability of the remaining terms and provisions hereof or the validity or enforceability of the offending term or provision in any other situation 8 3 Survival Sections 2 5 6 7 and 8 of this Agreement and all subsections thereof shall survive the termination of this Agreement regardless of the cause for termination and shall remain valid and binding indefinitely 8 4 Headings The Article and Section headings contained in this Agreement are incorporated for reference purposes only and shall not affect the meaning or interpretation of this Agreement 8 5 No Waiver The failure of either party to enforce any rights granted hereunder or to take action against the other party in the event of any breach hereunder shall not be deemed
404. odify whether an Element Group's nested groups are included by selecting or clearing the Include subgroups check box. Remove systems by clicking one or more entries in the Selected Element Groups list, and then clicking Remove. Click Save. 7. To edit the Elements in the group, do the following: Add systems by clicking one or more systems in the Available Elements list, and then clicking Add. Remove systems by clicking one or more systems in the Selected Elements list, and then clicking Remove. Click Save. Changing Host Checks: Host checks determine whether or not a system that is being monitored is available and functioning properly. If a host check determines that a host is unavailable, all service checks are temporarily disabled. The available host checks are: Ping check: This host check uses the ping utility to determine whether or not the server is accessible. This is the default host check. up.time agent check: This host check communicates with the up.time agent installed on a system to determine whether or not the system is functioning. Any service monitors that you have configured for a system. Change a Host Check: To change a host check, do the following: 1. On the up.time tool bar, click Services. 2. In the Tree panel, click Host Check. A list of the servers and their assigned host checks
405. oes not attend to the problem within a specified amount of time, the alert will be sent to the administrator's manager. up.time can send alerts via: email messages to a cell phone or a pager, or to one or more email addresses; a Windows popup. The following is a sample email alert: Notification type: Problem 1/12/2008 10:52 Host: filter Host State: N/A Service: FS Capacity Filter Service State: WARN Output: /var is 92% full. The following is a sample pager alert: subject: CRIT Alert content: 5/7/2005 13:22 Type: Problem Service: FTP CRIT Host: filter CRIT. For more information on alerts, see Monitor Alert Settings on page 148. Understanding the Alert Flow: Alerts in up.time follow a specific flow. When up.time detects a problem with a host, it issues an alert. up.time then continues to check the host at specific intervals and reports on the status of the host. Consider the following example: up.time checks the host system every 15 minutes; alerts are sent continually, every check interval, until up.time detects a change in the state of the host system; whenever an error is encountered, up.time rechecks the system every minute; if all rechecks, up to the maximum number of rechecks, fail, up.time issues an alert. up.time encounters a critical error on a host. up.time performs three rechecks at one minute intervals
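The recheck rule in the alert flow example above can be sketched as a small Python function. This is an illustration of the described behavior under stated assumptions, not the Monitoring Station's implementation: the function name is hypothetical, and the default of three rechecks is taken from the example in the excerpt.

```python
def should_alert(check_failed, recheck_failures, max_rechecks=3):
    """Decide whether an alert is issued after a failed check.

    check_failed:      True if the regular (15-minute) check found an error.
    recheck_failures:  booleans for the one-minute rechecks, True = failed.
    An alert is issued only if the maximum number of rechecks was performed
    and every one of them failed.
    """
    if not check_failed:
        return False
    rechecks = list(recheck_failures)[:max_rechecks]
    return len(rechecks) == max_rechecks and all(rechecks)


print(should_alert(True, [True, True, True]))   # all three rechecks failed
print(should_alert(True, [True, False, True]))  # host recovered on recheck 2
```

A recovery on any recheck suppresses the alert in this sketch, which matches the condition that all rechecks must fail before an alert is issued.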
406. oftware Ping 303 POP 305 SMTP 309 SNMP 311 SSH 307 TCP 318 NFS monitor 295 NIS YP monitor 297 NNTP Network News monitor 299 command implementation 299 response category 300 response codes 300 Notification Groups 347 notification groups adding 347 editing 348 overview 347 viewing 348 Novell NRM 86 Abended Thread Count 89 abended thread count 89 adding 86 Allocated Server Processes 88 allocated server processes 88 Available Disk Space 91 available disk space 91 Available Event Control Blocks ECBs 91 Available Memory 90 available memory 90 Available Server Processes 89 available server processes 89 Connection Usage 90 connection usage 90 CPU Utilization 89 CPU utilization 89 Disk Throughput 91 disk throughput 91 DS Thread Usage 90 DS thread usage 90 ECBs 91 LAN Traffic 91 LAN traffic 91 Packet Receive Buffers 90 packet received buffers 90 Statistics Available 87 statistics captured 87 Work To Do Response Time 88 work to do response time 88 589 Index O OIDs deleting 314 manually adding 314 Oracle Advanced Metrics monitor 253 Basic Checks monitor 256 Tablespace Check monitor 259 using as the up time database 37 Orchestrator 389 391 P Performance Check monitor 170 Ping monitor 303 POP Email Retrieval monitor 305 Problem Reporting 552 Process Count Check 174 Configuring 174 Settings 174 Process Count Check monitor 174 pSeries adding LPARs 81 R Report Log 410 completed reports 41
407. ollowing sample checkpoints could be created for an e-commerce transaction: Browse Catalog, Add to Shopping Cart, Checkout, Credit Card Validation. The following sample checkpoints could be created for an internal office transaction: Login, Browse Orders, View Order Details. Configuring Web Application Transaction Monitors: You can define Web application transactions by manually stepping through one and declaring checkpoints at key stages. 1. Open a Web browser and configure its proxy settings so that you can record a transaction: Open the dialog where connection settings are made (e.g., the Connection Settings dialog in Firefox, or the Local Area Network (LAN) Settings dialog in Internet Explorer). Configure the browser's proxy to localhost on port 8001. Ensure these settings have also been applied to SSL or secure communications. Set the proxy to bypass the Monitoring Station. For example, in Firefox v2 you will need to manually enter the Monitoring Station URL or IP address in the No Proxy for box; in Internet Explorer v6, select the Bypass proxy server for local addresses check box. Using the monitor as a proxy will allow it to intercept Web traffic as you generate it. In the browser, navigate to the starting point of the Web application whose performance you will be monitoring. In the up.time Add Service window, select the Web A
408. ollowing states:
• Pending: The report is in the queue and is waiting to run.
• Running: The report is being generated.
• Completed: The report has been generated and has been sent via email to the users configured to receive that report.
For information on how to schedule reports, see "Scheduling Reports" on page 407. If you do not receive a scheduled report, check the Report Log (see "The Report Log" on page 410) or contact your system administrator.

Report Generation Options
up.time can generate reports in four ways:
• Print to Screen: Displays the report in a new window. This is the default option.
• PDF to Screen: Converts the report to a PDF document and displays it in a new window. You can save the PDF document to a local or network drive, or print it.
• XML to Screen: Displays the report as an unformatted XML document in a new window.
• Email Address: Enables you to email the report, as a PDF document attached to an email message, to:
  • A specific up.time user (for example, a system administrator). Click User, and then select the name of an up.time user to whom you want to send the report from the dropdown list.
  • The members of one or more up.time user groups. Click Group, and then select the name of an up.time user group to which you want to send the report from the dropdown list.
  • One or
409. om the Comparison Method dropdown list, and then select one of the following:
• Stopped: The service is stopped.
• Start Pending: The service is stopped or paused while waiting for another process or condition to be satisfied before starting.
• Stop Pending: The service is running while waiting for another process or condition to be satisfied before stopping.
• Running: The service is running.
• Continue Pending: The service is waiting for another process or condition to be satisfied before continuing to run the service.
• Pause Pending: The service is running while waiting for another process or condition to be satisfied before pausing the service.
• Paused: The service is paused.
• Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see "Configuring Warning and Critical Thresholds" on page 144.
To save the data from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside each of the metrics that you selected in step 3.
Complete the following settings:
• Timing Settings (see "Adding Monitor Timing Settings Information" on page 148 for more information)
• Alert Settings (see "Monitor Alert Settings" on page 148 for more information)
• Monitoring Period settings (see "Monitor Timi
410. omponents can use to connect to a named instance of SQL Server. A computer can concurrently run any number of named instances of SQL Server. A named instance can run at the same time as an existing installation of SQL Server version 6.5 or SQL Server version 7.0. The instance name cannot exceed 16 characters. A new instance name must begin with a letter, an ampersand (&), or an underscore (_), and can contain numbers, letters, or other characters. Do not use SQL Server sysnames and reserved names as instance names. For example, "default" is a reserved name and should not be used as an instance name. You can have multiple instances of SQL Server installed on one computer. Each instance operates independently from the other instances, and applications can connect to any of the instances.
• Full Warning Threshold: Enter a percentage of the maximum file size you want to set as your warning threshold.
• Critical Warning Threshold: Enter a percentage of the maximum file size you want to set as your critical threshold.
• Response Time: Enter the Warning and Critical Response Time thresholds for the length of time a service check takes. For more information, see "Configuring Warning and Critical Thresholds" on page 144.
3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used
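The instance-naming rules quoted in excerpt 410 can be checked mechanically before a name is typed into the monitor's Instance field. The following is only a minimal sketch: the function name `instance_name_problems` and the `RESERVED_NAMES` set are hypothetical, the reserved list contains only the one name the text mentions, and the rules are taken from the excerpt rather than from SQL Server's own documentation.

```python
# Sketch of the naming rules quoted in the excerpt; not an up.time or SQL Server API.
MAX_INSTANCE_NAME_LENGTH = 16      # "cannot exceed 16 characters"
RESERVED_NAMES = {"default"}       # assumption: incomplete, only the name cited in the text
ALLOWED_FIRST_SYMBOLS = "&_"       # ampersand or underscore


def instance_name_problems(name: str) -> list[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    if not name:
        return ["name is empty"]
    problems = []
    if len(name) > MAX_INSTANCE_NAME_LENGTH:
        problems.append(f"longer than {MAX_INSTANCE_NAME_LENGTH} characters")
    first = name[0]
    if not (first.isalpha() or first in ALLOWED_FIRST_SYMBOLS):
        problems.append("must begin with a letter, '&', or '_'")
    if name.lower() in RESERVED_NAMES:
        problems.append("is a reserved name")
    return problems
```

Calling `instance_name_problems("default")` reports the reserved-name violation, while a name such as `PAYROLL` returns an empty list. Returning every violation, instead of stopping at the first, is a design choice that lets a caller show all problems at once.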
411. on. From this page, you can add SLOs, as well as associate Alert Profiles and Action Profiles to the SLA.

Adding Service Level Objectives to an SLA
To add a service level objective to an SLA, do the following:
1. In the My Infrastructure panel, click the name of the Service Level Agreement that you want to edit. The Service Level Agreement General Information subpanel appears.
2. Click Add SLO. The Add Service Level Objective window appears.
[Screenshot: Add Service Level Objective window, with the fields Name of Service Level Objective, Description of Service Level Objective, Monitoring Period, Target Percentage, and Compliance Period Type]
3. Enter a descriptive name for the SLO in the Name of Service Level Objective field. This name will appear anywhere in My Infrastructure and Global Scan.
4. Enter a description for the SLO in the Description of Service Level Objective field. Although this step is optional, this description will appear in SLA Detailed reports; therefore, it is recommended that you provide a detailed description of the SLO, including what goal
412. on, see "Configuring Warning and Critical Thresholds" on page 144.
To save the data from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside the Response Time metrics.
Complete the following settings:
• Timing Settings (see "Adding Monitor Timing Settings Information" on page 148 for more information)
• Alert Settings (see "Monitor Alert Settings" on page 148 for more information)
• Monitoring Period settings (see "Monitor Timing Settings" on page 146 for more information)
• Alert Profile settings (see "Alert Profiles" on page 381 for more information)
• Action Profile settings (see "Action Profiles" on page 389 for more information)
4. Click Finish.

Windows Service Check
The Windows Service Check monitor alerts you to changes in the status of Windows services. Windows services are processes that extend the features of Windows by providing support to other programs; they are controlled in the Microsoft Management Console. The default installation of Windows provides a core set of services and configurations that suits most needs. There are approximately 100 services in the Windows Server family of operating systems. You can add services that you develop or by i
413. on pSeries servers. The following graphs visualize the workload information for all LPARs on a server:
• Workload CPU: The amount of CPU time that is being used by the LPAR.
• Workload Memory: The total amount of memory being used by an LPAR.
• Workload Disk: The amount of data that has been transferred to and from the disk.
• Workload Network: The amount of data that has been transferred over the network interface used by the LPAR.
You can also graph the CPU entitlement of individual LPARs using the CPU Utilization graph. See "LPAR CPU Utilization Graphs" for more information.

Generating an LPAR Workload Graph
To generate an LPAR Workload graph, do the following:
1. In the Global Scan or My Infrastructure panel, click the name of the pSeries server that is hosting the LPARs whose information you want to graph.
2. In the Tree panel, click the Graphing tab.
3. Click one of the following options:
• Workload CPU
• Workload Memory
• Workload Disk
• Workload Network
4. Select the start and end dates and times for which the graph will chart data. For more information, see "Understanding Dates and Times" on page 22.
5. Click Generate Graph.

LPAR CPU Utilization Graphs
Using the CPU Utilization graph, you can better determine the CPU entitlements of the LPARs on a system. The entitlements indicate the amount of CPU power that is assigned to an indi
414. onfiguration parameters available for the chosen workflow, consult the appropriate developer's documentation.
If you would like the Action Profile to write to a log, in the Log File field enter the name and path to a log file on the Monitoring Station to which error information will be written.
If you would like the Action Profile to run a recovery script, in the Recovery Script field enter the name and path to a script that will reboot a server or restart an application process or service. The recovery script will also have the following information appended to it:
• the date and time on which the error occurred
• the type of error notification that was sent
• the name of the host on which the error occurred
• the state of the host
• the name of the service that threw the error
• the state of the service
• the output that was generated by the error
For example: /usr/local/uptime/recover.sh 24 12 2007 5 01 05 Problem printserver null WinSrv Print Spooler CRIT threshold error servicestatus Not Running does not match Running Service Print Spooler found status Not Running took 12ms
You can also use the recovery script to file trouble tickets with a system like Remedy, or to interact with third-party software packages. If you are setting up an Action Profile for a Windows server, you can also leave the Windows Service as Agent and
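As an illustration of how a recovery script might consume the information listed in excerpt 414, here is a minimal Python sketch. It assumes, without confirmation from the guide, that up.time passes the seven items as command-line arguments in the order listed, that the date and time arrive as a single argument, and that any additional arguments belong to the error output. The names `RecoveryEvent`, `parse_recovery_args`, and `format_log_line` are hypothetical.

```python
from typing import NamedTuple, Sequence


class RecoveryEvent(NamedTuple):
    """The items the guide says are appended to the recovery script."""
    timestamp: str
    notification_type: str
    host_name: str
    host_state: str
    service_name: str
    service_state: str
    output: str


def parse_recovery_args(args: Sequence[str]) -> RecoveryEvent:
    """Map positional arguments onto named fields (argument order is an assumption)."""
    if len(args) < 7:
        raise ValueError(f"expected at least 7 arguments, got {len(args)}")
    fixed, rest = list(args[:6]), args[6:]
    # Anything past the sixth argument is treated as the free-form error output.
    return RecoveryEvent(*fixed, output=" ".join(rest))


def format_log_line(event: RecoveryEvent) -> str:
    """Build a one-line summary suitable for appending to a log file."""
    return (
        f"{event.timestamp} [{event.notification_type}] "
        f"host={event.host_name} ({event.host_state}) "
        f"service={event.service_name} ({event.service_state}): {event.output}"
    )
```

In a real script, the entry point would call `parse_recovery_args(sys.argv[1:])` and then decide which restart action to take based on `service_name` and `service_state`. Keeping the parsing separate from the action makes it easy to verify the argument layout against a live alert before trusting it.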
415. [Screenshot, continued: list of retained service metric variables, including SMTP Local Queue Length, SMTP Messages Per Second, SMTP Outbound Connections, Web Mail Auths Per Second, and Web Mail Sends Per Second; data type integer]
• Click Generate Graph.

Searching and Filtering
If you have a large number of hosts on your system, you can use the search and filtering functions in the up.time Web interface to quickly display and view information about specific hosts.

Using the Search Box
You can use the search box at the top of the up.time Web interface to display the basic information about a particular host. To use the search box, do the following:
1. From anywhere in the up.time Web interface, enter any of the following information in the Search box:
• The name of the system for which you want to search. You can enter a partial name in the Search box. For example, if you want to display all systems whose names start with Web, enter Web in the Search box.
• Details about the architecture of the servers. For example, to use an operating system as the search criteria, enter Linux in this field.
• Any information that may appear in the Custom fields in the profile for the system.
2. Click Go. The following information is displayed:
• name of the host
• description of the host, if any
416. ons in a real-time, centralized view. At the datacenter level, up.time continuously monitors your servers, applications, databases, and IT resources, and alerts you to problems. Using the information that up.time gathers, you can solve problems before they impact your business. For example, a service monitor detects that a large volume of email messages is going back and forth between a particular email address in your organization and an external domain. This could indicate that a high number of legitimate emails are being sent, or it could indicate that a virus or a trojan is active on a system in your environment. You can also generate reports and graphs to visualize the information that up.time gathers. By analyzing the information, reports, and graphs, you can do the following:
• identify and isolate performance bottlenecks
• monitor and report on the availability of services
• determine the specific causes of a problem in your network
• perform capacity planning
• consolidate servers where necessary
• develop more precise management reports

Who Should Read This Guide
The up.time User Guide is intended for various types of users:
• system administrators who want to use up.time to monitor a single system, or multiple systems in a distributed environment at a single datacenter
• users who gather information about their systems to perform analysis and make key business decisions
• IT managers who will determine the availability
417. or evaluates the size of data files within SQL Server databases. up.time gathers information from all the databases across all instances on a system, and aggregates this information in the metrics that it returns. This monitor also reports whether or not any of the data files in a filegroup, or any log file in any database in the instance, exceeds warning and critical thresholds. If warning or critical thresholds are exceeded, up.time generates an alert.

Structure of a SQL Server Database
Each SQL Server database consists of at least two files:
• a primary data file with the extension .mdf
• a log file with the extension .ldf
There are also secondary data files with the extension .ndf. A database can have only one primary data file, zero or more secondary data files, and one or more log files. Each database file can only be used by one database. In a database, data files store persistent data. For ease of management, you can group one or more data files into logical tablespaces. The SQL Server equivalent of an Oracle tablespace is the filegroup. SQL Server filegroups come under, and are associated with, the individual databases. The SQL Server data hierarchy is: Instance > Database > FileGroup > Data file. Each data file can be a member of only one filegroup, but the log files are managed separately from one another. There are three types of filegroups:
• primary
• user-defined
• default
When you configure your SQL Server database
418. or information on installing agents, see "Installing Agents" on page 40.

File System Capacity
The File System Capacity monitor checks the amount of total and used space, in kilobytes, on a disk. This monitor then compares the capacity to the specified warning and critical thresholds. On Windows servers, up.time looks at the capacity of all local drives; on UNIX and Linux servers, up.time looks at all local file systems (e.g., /var, /export, /usr). On UNIX and Linux systems, you can configure the monitor to check all of the mount points on a system, or just specific mount points. Windows Volume Mount Points can be monitored when the host Element is monitored through WMI, not the up.time agent (see "Working with Systems" on page 67 for more information). Note that the level of detail for mounted volumes on Windows XP and 2000, when reported through WMI, is limited: the mounted volume name and exact location are not always accurate, but other pertinent information, such as volume capacity and usage, is correct. This monitor does not check floppy drives, tape drives, or CD-ROM drives.

Configuring File System Capacity Monitors
To configure File System Capacity monitors, do the following:
1. Complete the monitor information fields. To learn how to configure monitor information fields, see "Monitor Identification" on page 141.
2. Complete the following fields
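The comparison that excerpt 418 describes (total and used space in kilobytes, checked against warning and critical thresholds) can be sketched in a few lines of Python. This is an illustration only, not up.time's implementation: expressing the thresholds as percent-used values, the default values of 80 and 90, and the status labels `OK`, `WARN`, and `CRIT` are all assumptions, and the function names are hypothetical.

```python
import shutil


def classify_capacity(used_kb: int, total_kb: int,
                      warning_pct: float, critical_pct: float) -> str:
    """Compare the percentage of space used against the two thresholds."""
    if total_kb <= 0:
        raise ValueError("total_kb must be positive")
    used_pct = 100.0 * used_kb / total_kb
    if used_pct >= critical_pct:
        return "CRIT"
    if used_pct >= warning_pct:
        return "WARN"
    return "OK"


def check_mount_point(path: str, warning_pct: float = 80.0,
                      critical_pct: float = 90.0) -> str:
    """Read usage for one mount point and classify it.

    shutil.disk_usage reports bytes, so the values are converted to
    kilobytes to match the units named in the text.
    """
    usage = shutil.disk_usage(path)
    return classify_capacity(usage.used // 1024, usage.total // 1024,
                             warning_pct, critical_pct)
```

For example, `check_mount_point("/var")` classifies a single file system, and looping over a list of mount points mirrors the option to check only specific mount points. The critical threshold is tested first so that a value above both thresholds is reported at the higher severity.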
419. or memory usage of the JVM, you can tune the JVM to ensure that it is working at optimal levels.

WebLogic Report
The WebLogic report charts a set of metrics (see "WebLogic" on page 203 for details) that provide insight into the health and performance of a WebLogic server. Using the WebLogic report, you can pinpoint problem areas on your WebLogic server and quickly determine how to fix those problems. Depending on the number of options that you select, the report can become quite long and can take considerable time to generate. For most options, the report contains charts for two or more metrics.

Creating a WebLogic Report
To create a WebLogic report, do the following:
1. In the Reports Tree panel, click WebLogic.
2. In the Date and Time Range area, select the dates and times on which to report. For more information, see "Understanding Dates and Times" on page 22.
3. In the Report Options area, select one or more of the following options:
• Thread pool: The report charts the number of pending requests in the thread pool, as well as the free size of the pool.
• Server Stats: The report charts the number of connection requests that WebLogic accepts before refusing additional requests, as well as the number of open sockets to the server.
• JDBC Connection Pool: The report charts the number of active and leaked connections to the server, as well as the size o
420. or more information
• Alert Profile settings (see "Alert Profiles" on page 381 for more information)
• Action Profile settings (see "Action Profiles" on page 389 for more information)
5. Click Finish.

Exchange
The Exchange 2003 and Exchange monitors identify when certain performance counters for Microsoft Exchange servers have exceeded user-defined thresholds. These thresholds can flag, for example, an inordinately high number of inbound connections or a rapidly growing message queue. Whenever a counter exceeds a warning or critical threshold, up.time generates an alert.
Use up.time's Exchange 2003 monitor if you are using and monitoring Microsoft Exchange 2000 or 2003; use the Exchange monitor for later versions (e.g., Microsoft Exchange 2007 and 2010).

Configuring Exchange 2003 Monitors
To configure an Exchange 2003 monitor for your Microsoft Exchange 2000 or 2003 server, do the following:
1. Complete the monitor information fields. To learn how to configure monitor information fields, see "Monitor Identification" on page 141.
2. Complete the following settings by clicking the checkbox beside each option, and then specifying a warning and critical threshold. If the thresholds that you set are exceeded, then up.time generates an alert. For more information, see Configuring Warning a
421. ord: The password for the account with access to WMI on the Windows domain.

Adding Multiple Systems to up.time
To add multiple systems to up.time, do the following:
1. Copy the hosts file to the directory in which you installed the up.time Monitoring Station.
2. At the command line, navigate to the scripts folder. For example, if you installed the Monitoring Station in the default location on a Windows system, navigate to the following folder: C:\Program Files\uptime software\uptime\scripts
3. Enter the following command: addsystem <path and filename>
where <path and filename> is the name of the text file that contains the list of systems that you want to add to up.time, along with its full path.
The systems listed in the file are added to up.time unless:
• up.time cannot connect to the system
• the system does not exist in your environment
• the system has already been added to up.time

Examples of Hosts File Entries
The following table contains sample host file entries for each type of system that you can add to up.time.
Host Type: Agent
Sample Hosts File Entry: Host Name prod mainSystem Display Name prodl Description Main production server Type Agent Service Group Production Systems Port 9998 Group Windows 2003 Servers
422. ore information e Alert Profile settings see Alert Profiles on page 381 for more information e Action Profile settings see Action Profiles on page 389 for more information 5 Click Finish OL Oo Oo ie Er a ie Oo Oo up time software 189 Microsoft Windows Monitors Active Directory 190 up time 5 User Guide f up time CHAPTER 11 Application Monitors The application monitors track the performance and health of following Uptime Agent iz ses coce ca ice cdde cath arsen dekken nakke kake TINERET orden 192 Ex hang Orap inn e Enr EFE AAEE ArI de dd eee 194 EE T E E E P E hae oaes 200 WebLogic certs tats rest Eria ENOTEN TETEPANA KIA ETAT KEEA EINKI RO 203 WebSphere vassadmergenr nedgang vasre aE 211 ESX Workload icccccieciaares ENE aR EEEE EET sb eanisa ekke 217 ESX Advanced MetricSJ aaarrunuvurananvrenvnnvrenenerereennenenner 220 Web Application Transactions rrrvrnrnrnrrrrrnnnnnsnnnnnnnnnnr 223 Email Delivery Monitor rannnnnmnnrennnan na annnnn renn ete e eee ennnaee 230 Splunk QUery lt uuaasatdnke annen 236 Live Splunk Listener rrrrrrrrrrnunnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnner 238 191 Application Monitors Uptime Agent Uptime Agent The Uptime Agent monitor determines whether or not an agent is running on a system that you are monitoring Configuring Uptime Agent Monitors 192 To configure Uptime Age
423. ork devices, and any other resources that you might find useful to maintain. The Active Directory monitor can check for any settings or information in your Active Directory. The monitor can start the check from any location within your Active Directory structure. The Active Directory monitor attempts to match information that you have specified with information available in your Active Directory. If the monitor finds the information, the service monitor returns a status of OK. Otherwise, the monitor returns a Critical error and up.time generates an alert.

Configuring Active Directory Monitors
To configure Active Directory monitors, do the following:
1. In the Active Directory monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see "Monitor Identification" on page 141.
2. Complete the following fields:
• Port: The number of the port on which the Active Directory server is listening.
• Password: The password that is required to log in to the Active Directory server.
• Base: The location in the Active Directory from which you want the monitor to begin searching for information.
• Bind: The Bind string, which associates user account properties and Active Directory account attributes. This string gives you access
424. ormance of the Java Virtual Machine (JVM) that is running on the WebSphere server.
• Transaction Manager: A set of counters that report on the status of global, local, and concurrent transactions.
• Servlet Session Manager: A set of counters that report on usage information from the HTTP servlets that are running on the server.
Optionally, click Select All to generate a report on all of the options listed above.
If you selected more than one report option and plan to report on more than one system, you can optionally click the Group report options by system checkbox. Selecting this option combines the metrics for each system for which you are generating the report.
To generate reports for systems in specific groups, select the groups from the List of Groups area.
To generate reports for one or more views, select the views from the List of Views area. See "Working with Views" on page 108 for more information about views.
If you are generating reports for specific systems, select the systems from the List of Systems.
Select a report generation option. See "Report Generation Options" on page 402 for details.
If you want to save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See "Saving Reports" on page 404 and "Scheduling Reports" on page 407 for more information.
425. ormation about adding service instances, see "Using Service Monitors" on page 135.
• Host Check: Lists the basic checks (for example, a ping) for a system.
• Maintenance: Lists whether or not there are any maintenance periods scheduled for the system. For more information on maintenance periods, see "Scheduling Maintenance" on page 161.
4. Optionally, click Service Metrics to generate a graph that visualizes retained data over a given period of time. For more information about retained data, see "Understanding Retained Data" on page 24. To generate a graph, do the following:
• Select the date range for the graph from the Date Range area. For more information, see "Understanding Dates and Times" on page 22.
• In the Current Retained Service Metrics area, select the retained data variables that you want to graph, as shown below.
[Screenshot: Service Monitor Metrics panel, showing the Date Range options and the Current Retained Service Metrics list (Instance Name, Instance Description, Variable, Units) for an Exchange instance]
426. ort Check (Optional): Select this option to open a socket connection that determines whether or not the database is listening on the defined port. You should perform a port check because SQL Server can communicate statically on a defined or default port, or communicate dynamically on a port assigned by the operating system.
• Username: The user name that is required to log into the SQL Server database.
• Password: The password that is required to log into the SQL Server database.
• Instance: The name of the SQL Server instance to which you want to connect. You can install multiple versions of Microsoft SQL Server on one computer. When installing a new version of SQL Server 2000 or maintaining an existing installation, you can specify it as:
  • A default instance of SQL Server. This instance is identified by the network name of the computer on which it is running. SQL Server version 6.5 or SQL Server version 7.0 servers can operate as default instances. However, a computer can have only one version functioning as the default instance at one time.
  • A named instance of SQL Server. This instance is identified by the network name of the computer plus an instance name, in the format <computername>\<instancename>. Most applications must use SQL Server 2000 client components to connect to a named instance. However, you can use the SQL Server version 7.0 Client Network Utility to configure a serv
427. ory where up.time is installed.
Run the following command: fulldatabasedump
Depending on the size of your DataStore, this process can take anywhere from several minutes to several hours. The utility creates the file uptimedump_YYYY-MM-DD.xml.gz (for example, uptimedump_2007-01-02.xml.gz). This file is saved in up.time's root installation directory.
Windows Vista users can find the DataStore archive in the Virtual Store instead of the default location (i.e., C:\Users\uptime\AppData\Local\VirtualStore\Program Files\<uptime install directory>).

Restoring the DataStore
To restore your DataStore, do the following:
1. Ensure that the DataStore service is running.
2. Use the resetdb utility with the really option to delete, then recreate, the database structure that is used by up.time, by running one of the following commands:
• Linux: /usr/local/uptime/resetdb really
• Solaris: /opt/uptime/resetdb really
• Windows: C:\Program Files\uptime software\uptime\resetdb really
Run the following command: fulldatabaseimport path/<filetoimport>.xml.gz
where path/<filetoimport>.xml.gz is the path to, and file name of, the archived contents of your DataStore. For example, to import an archive that is located in up.time's root installation directory, enter the following: fulldatabaseimport uptimedump_2007-01-02.xml.gz
Windows Vista users can f
428. osoft com en us library aa822854 28v V5 85 29 aspx

Adding a WMI System to up.time
To add an agentless WMI system to up.time, do the following:
1. On the up.time tool bar, click My Infrastructure, then click Add System/Network Device.
2. Complete the Display name in up.time and Description fields. See "Adding Systems or Network Devices" on page 69 for more information.
3. Select WMI Agentless from the Type of System/Device dropdown list.
4. In the Host Name field, enter the actual name or IP address of the machine that up.time will be monitoring.
5. Select the Use WMI Global Credentials check box if they have been configured and you would like to use them (see "Configuring Global WMI Credentials" on page 536 for more information); otherwise, complete the following fields:
• Windows Domain: The Windows domain in which WMI has been implemented.
• Username: The name of the account with access to WMI on the Windows domain.
• Password: The password for the account with access to WMI on the Windows domain.
6. If you want to associate this system with a group, select its name from the Group dropdown list.
7. If you want to associate this system with a Service Group, select its name from the Service Group dropdown list.
8. Click Save.

Switching an Element to WMI Data Collection
To change the data collection source for an individual Windows Element from the up.time Agent
429. ot respond within the specified thresholds, up.time generates an alert.

Configuring SMTP Email Delivery Monitors
To configure SMTP Email Delivery monitors, do the following:
1. In the SMTP Mail Delivery monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see "Monitor Identification" on page 141.
2. Complete the following fields:
• Port: The number of the port on which the SMTP server is listening. The default is 25.
• Expected Server Response: Enter the Warning and Critical thresholds for the amount of time that is required to send and receive a ready response from the SMTP server. For example, the following response reveals the ready status of the SMTP server: 220 mail.yourdomain.com ESMTP Sendmail 8 12 10 SUN 8 12 8 Tue 14 Dec 2005 13 25 15 0400 <EDT>
For more information, see "Configuring Warning and Critical Thresholds" on page 144.
• Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see "Configuring Warning and Critical Thresholds" on page 144.
3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
4. Complete the following settings:
• Timing S
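The "ready response" described in excerpt 429 is the SMTP greeting, whose reply code 220 means the service is ready. The sketch below shows one way to time that greeting and compare it against warning and critical thresholds. It is an illustration under stated assumptions, not the monitor's actual logic: the function names, the threshold units (seconds), and the status labels are all hypothetical.

```python
import socket
import time


def is_ready_banner(banner: str) -> bool:
    """An SMTP greeting signals readiness when it starts with reply code 220."""
    return banner.startswith("220")


def classify_response_time(elapsed_s: float, warning_s: float,
                           critical_s: float) -> str:
    """Compare the measured time against the two thresholds (labels are illustrative)."""
    if elapsed_s >= critical_s:
        return "CRIT"
    if elapsed_s >= warning_s:
        return "WARN"
    return "OK"


def check_smtp_ready(host: str, port: int = 25,
                     timeout_s: float = 10.0) -> tuple[bool, float]:
    """Connect, read the greeting, and return (ready, seconds elapsed)."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout_s) as conn:
        banner = conn.recv(1024).decode("ascii", errors="replace")
    elapsed = time.monotonic() - start
    return is_ready_banner(banner), elapsed
```

A caller could run `ready, elapsed = check_smtp_ready("mail.yourdomain.com")` and then pass `elapsed` to `classify_response_time` with its own thresholds. `time.monotonic` is used for the measurement because it is unaffected by system clock adjustments.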
430. ou will no longer be able to manually add users from within up.time (the Add New User option on the Users panel will not be available).
Regardless of which authentication and synchronization method is selected, the up.time admin user profile will always be stored and authenticated against the password found in the DataStore.

Active Directory Authentication
To use Active Directory for user management, you need to provide up.time with your organization's AD information. You can also define whether, and how much, user information is synchronized between AD and up.time's user list.

Enabling Active Directory for Authentication
To configure up.time to check an Active Directory listing for user passwords, do the following:
1. On the up.time tool bar, click Config.
2. In the Tree panel, click User Authentication.
3. Click Edit Configuration.
4. Select Active Directory as the authentication method. You will next need to provide access details for the Active Directory server.
5. In the Primary Domain Controller field, enter the host name of the server acting as the domain controller (most likely enabled as the global catalog).
6. If applicable, in the Backup Domain Controller field, enter the name of the server acting as an additional domain controller on the same domain.
7. Enter the Port thro
431. ovell NRM System to up.time
To add a Novell NRM (version 6.5) system to up.time, do the following:
1. On the up.time tool bar, click My Infrastructure, and then click the Add System/Network Device tab.
2. Complete the Display name in up.time and Description fields. See "Adding Systems or Network Devices" on page 69 for more information.
3. Select Novell NRM from the Type of System/Device dropdown list.
4. Complete the following fields:
• Host name: The actual name of the machine that up.time will be monitoring, or the IP address of the machine.
• Port: The port on which the NRM is listening. The default is 8008 for a port that is not using SSL. The default for a port that is using SSL is 8009.
• Username: The NRM administrator account name. This field is mandatory.
• Password: The NRM administrator password. This field is mandatory. The password is encrypted and stored in the up.time DataStore.
5. If you want to associate this system with a group, select its name from the Group dropdown list.
6. If you want to associate this system with a Service Group, select its name from the Service Group dropdown list.
7. Click Save.

NRM Statistics Captured by up.time
up.time captures the following Novell NRM system (version 6.5) statistics:
• Work To Do Response Time
• Allocated Service Processes
• Available Server Processes
• Abended Thread Count
432. ox appears while the your Web browser to close the dialog box
4. In the SNMP MIB Browser, click one of the following options:
- Load MIB from File
- Load MIB from Server
5. In the window that appears, do one of the following:
- If you are loading a MIB from your computer, navigate to the directory containing the MIB or OID. Select the MIB and then click Open.
- If you are loading a MIB from a server, select the MIB from the list that appears and then click Load Selected MIB.
The MIB appears in the MIB selection tree. You can select any OID within the MIB to monitor with the SNMP service monitor.

Adding OIDs

Once a MIB is loaded into the MIB selection tree, you can add the OIDs in the MIB to the SNMP monitor. To add OIDs, do the following:
1. Navigate the MIB directory tree to find the OID that you want to add.
2. Double-click the OID. The OID appears in the Selected OIDs panel.
3. Click Next. The Add SNMP Service Monitor window appears. See Configuring SNMP Monitors on page 315 for information on setting up the SNMP monitor.

Manually Adding OIDs

If you know the OID that you want to add, you can add it without navigating the MIB tree. To add OIDs manually, do the following:
1. Type the name of the OID in the Add OID Manually field.
2. Click Add OID Manually.
3. C
433. ox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
4. Complete the following settings:
- Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
- Alert Settings (see Monitor Alert Settings on page 148 for more information)
- Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
- Alert Profile settings (see Alert Profiles on page 381 for more information)
- Action Profile settings (see Action Profiles on page 389 for more information)
5. Click Finish.

SQL Server Advanced Metrics

The SQL Server Advanced Metrics monitor collects information on the availability and performance of individual SQL Server databases. You only need to configure one SQL Server Advanced Metrics monitor for each system. You can, however, create multiple SQL Server Advanced Metrics monitors for a system if you need to separately capture different SQL Server performance metrics. See the section Using Multiple SQL Server Advanced Metrics Monitors for more information.

For example, consider a host configured to have the following:
- an up.time agent installed
- two database instances
- four databases
The SQL Server Advanced Metrics monitor can capture performanc
434. p time Using a Distribution List is an easy way to broadcast to a large group of users without having to create and manage individual up.time user profiles for each member. Distribution Lists, like individual user profiles, are associated with Notification Groups and can be configured to broadcast specific types of status alerts (e.g., only Critical-level and Recovery alerts).

Adding Distribution Lists

To add Distribution Lists, do the following:
1. Click Users on the up.time tool bar.
2. In the Tree panel, click Add New Distribution List.
3. Type a descriptive name in the Display Name field. You will select this name when defining a Notification Group.
4. Select a Monitoring Period from the Time Period for Emailing list:
- 24x7
- 9am to 5pm weekdays
- another Monitoring Period that you have previously created
5. Select the Should the Distribution List receive alerts check box.
6. Configure the type of alerts those on the Distribution List will receive by selecting one or more of the following check boxes:
- Alert on Critical: The user receives an alert when up.time detects a critical problem with one or more monitored services.
- Alert on Warning: The user receives an alert when up.time detects a potential problem with one or more monitored services.
- Alert on Unknown: The user receives an alert when up.time detects an error in the configuration of the
435. p time
- Description
- Group
- Service Group
5. Click Save to add the instance to up.time.

Adding Individual LPARs to up.time

After you have added pSeries servers (whether managed by an HMC or not) to up.time, you can add individual LPARs from those systems to up.time. While up.time collects workload data from all LPARs on a pSeries server, whether they have been added to up.time or not, adding LPARs can help you keep track of any specific LPAR.

To add an LPAR to up.time, do the following:
1. In the My Infrastructure panel, click the name of the pSeries server that contains the LPAR that you want to monitor. A new window containing information about the system appears.
2. Click the Info tab, and then click Logical Partitions. A list of LPARs appears in the sub panel.
3. Click the Add to up.time button beside the LPAR that you want to add to up.time. The Add System window appears.
4. If necessary, you can change any of the following options:
- Display name in up.time
- Description
- Group
- Service Group
5. Click Save to add the LPAR to up.time.

Agentless WMI Systems

If the Windows-based component of your infrastructure already makes use of WMI (Windows Management Instrumentation), Windows Elements can be configured to use it for data collection as an alternative to the up.time Agent. Using WMI allows you to avoid the overhead associated with up time softwar
436. p time Reports for Performance and Analysis
6. To generate reports for one or more views, select the groups from the List of Views area. See Working with Views on page 108 for more information about views.
7. If you are generating reports for specific systems in your environment, select them from the List of Systems.
8. Select a report generation option. See Report Generation Options on page 402 for details.
9. Do one of the following:
- Click the Generate Report button.
- Enter a name for the report in the Save to My Portal As field and, optionally, enter text in the Report Description field. Then click Save Report. The report parameters are saved to the My Portal panel. Doing this does not generate the report.
10. To schedule the saved report to run at a specific time or interval, click the Scheduled checkbox. See Scheduling Reports on page 407 for more information on configuring a scheduled report.

Service Monitor Metrics Report

You can configure the up.time service monitors to retain data, which is saved to the up.time DataStore for later use. The Service Monitor Metrics report visualizes the retained data in a line chart. For example, if you have configured a service monitor to retain response time data, then this report charts any changes in the response time (in milliseconds) that have occurred over the time period that you specified for the report. Creating a Service Monitor Metrics report is a two step
437. peated
POP (Email Retrieval)
SSH (Secure Shell)
SMTP (Email Delivery)
SNMP
TCP

DNS

DNS (Domain Name System) is a distributed database that links various host names to specific Internet addresses. The DNS monitor determines the IP addresses of external and internal host names by matching a virtual host name to an expected IP address. If a match is made, the status of the service monitor is OK.

You can, for example, use the DNS monitor to:
- ensure that your audience can access your Web site or portal by making sure that a selected address can be resolved
- identify instances in your network environment where resources have had their IP addresses changed and are now no longer available

To collect performance information, the DNS monitor:
- opens a UDP socket to a DNS server
- creates a query packet
- sends the query packet
- waits for a response
- parses the answers

The DNS monitor does not check for the NS or MX records, which return names and not IP addresses. Non-authoritative answers, as well as authoritative responses, are used.

Before You Begin

Before configuring the DNS monitor, determine the IP address for the
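The query sequence listed for the DNS monitor (open a UDP socket, create a query packet, send it, wait for a response, parse the answers) can be illustrated with a minimal sketch. This is not up.time's implementation; the function names, the fixed query ID, and the five-second timeout are assumptions made for illustration only. Like the monitor described above, the sketch looks only at address (A) records.

```python
import socket
import struct


def build_query(hostname, query_id=0x1234):
    """Create a DNS query packet asking for the A record of hostname."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # The name is encoded as length-prefixed labels ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def skip_name(packet, offset):
    """Return the offset just past a (possibly compressed) domain name."""
    while True:
        length = packet[offset]
        if length == 0:
            return offset + 1
        if length & 0xC0 == 0xC0:  # a compression pointer occupies two bytes
            return offset + 2
        offset += 1 + length


def parse_a_records(packet):
    """Parse the answer section and return the IPv4 addresses it contains."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    offset = 12
    for _ in range(qdcount):
        offset = skip_name(packet, offset) + 4  # skip QTYPE and QCLASS
    addresses = []
    for _ in range(ancount):
        offset = skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!HHIH", packet[offset:offset + 10]
        )
        offset += 10
        if rtype == 1 and rdlength == 4:  # A record holding an IPv4 address
            addresses.append(socket.inet_ntoa(packet[offset:offset + 4]))
        offset += rdlength
    return addresses


def check_dns(server, hostname, expected_ip, timeout=5.0):
    """Return True when server resolves hostname to expected_ip."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:  # open a UDP socket
        sock.settimeout(timeout)
        sock.sendto(build_query(hostname), (server, 53))  # send the query packet
        response, _ = sock.recvfrom(512)  # wait for a response
    return expected_ip in parse_a_records(response)  # parse the answers
```

A call such as `check_dns("10.1.1.1", "portal.example.com", "10.1.1.50")` (hypothetical addresses) would return True only if the expected address appears among the answers, which mirrors the OK status described above.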
438. pecific Applications in your environment, select them from the List of Entities.
14. Select a report generation option. See Report Generation Options on page 402 for details.
15. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information.

Using the File System Service Time Summary Report

The following is an example of a File System Service Time Summary report:

[Example report: File System Service Time Summary. Date Range: 2008-04-17 00:00:00 to 2008-04-17 12:48:45, between the time range 00:00 to 23:59. Sorted by: Descending High Service Time. Showing Percentile: 95.0. Threshold: 20 ms. Columns: System Name, Disk, File System, Service Time in milliseconds (High, Low, Average). The High values shown range from 382.00 down to 23.00.]

In this example, the disks on each system have high levels of service time, and they are in the highest percentile that exceeds the service time threshold.
439. performing for the time period. In the following example, the email server performance SLO is achieving 90.03% of its 99.0% target. Although the email server availability SLO is achieving its target (99.43% vs. 99%), both SLOs' downtime affects SLA downtime. In this case, combined SLO downtime results in the SLA only achieving 89.47% of its target, resulting in a critical status.

[Screenshot: Current SLA Summary, showing the SLA name, description, monitoring period, current time, achieving percentage, trend analysis, and compliance period (63% of compliance period, 100% of allowable downtime used), followed by the Service Level Objectives: Mail Server Availability, achieving 99.43%; Mail Server Performance, achieving 90.03% against a 99.0% target.]

See Viewing SLA Details on page 360 for information on how to find information, such as the Achieving statistic, in an SLA summary.

SLA Creation Strategies

The key to an effective SLA is defining a service level that satisfies end users, yet is also attainable by IT staff and their systems configurations. This section covers the suggested steps to pinpointing this target service level:
- ensure service monitors exist for all SLA-related Elements; if you are a new
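The arithmetic behind the SLA example above can be sketched as follows. This is only an illustration under a stated assumption: the downtime of each SLO is treated as non-overlapping, so the shortfalls simply add up. The guide does not say exactly how up.time combines SLO downtime, and `sla_achieved` is a hypothetical helper name.

```python
def sla_achieved(slo_percentages):
    """Estimate the SLA 'Achieving' value from its SLOs' achieved percentages.

    Assumption for illustration: SLO downtime periods never overlap, so the
    SLA downtime is the sum of each SLO's downtime share.
    """
    total_downtime = sum(100.0 - achieved for achieved in slo_percentages)
    return max(0.0, 100.0 - total_downtime)


# Mail Server Availability (99.43%) and Mail Server Performance (90.03%)
combined = sla_achieved([99.43, 90.03])
print(round(combined, 2))
```

With the two values from the example, the sketch yields roughly 89.46, which is close to the 89.47 shown in the summary; the small difference is presumably due to rounding of the displayed SLO values.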
440. pically used as a preliminary step toward root cause analysis. When you first acknowledge an issue by clicking an Element name on either Global Scan or the My Alerts section of My Portal, you are shown the Quick Snapshot for that Element. From here, you can scan the information provided in the charts and tables and begin further investigation. For example, if you notice a problem while viewing the Quick Snapshot, you can generate a report to obtain more information about the problem.

The Quick Snapshot contains the following information:
- System Status Charts: CPU Usage, Memory Usage, Disk I/O (transfers/sec), Network I/O rates, Outages, Disk usage
- Top 10 Processes: Process name, Process ID, CPU usage, memory usage
- File System Statistics: Device, Mount, Size, Used space, Available space, % used

Viewing a Quick Snapshot

In the Global Scan panel, click the name of the system whose information you want to graph. The Quick Snapshot is displayed by default.

[Screenshot: Quick Snapshot, showing the Current CPU/Memory Performance Summary (Usr, Sys, Wio) and charts covering the last 24 hours for CPU Usage, Memory Used (Total Memory MB, Memory Used MB), Disk I/O (Transfers/sec), Network I/O (Receive Rate, Send Rate), and Disk Usage.]
441. played in the up.time Monitoring Station. As well, you can specify that the monitor writes the data that the script returns to the up.time DataStore. You can use the retained data to later generate a Service Metrics report (see Service Monitor Metrics Report on page 425) or a Service Metrics graph (see Viewing System and Service Information on page 50).

Configuring Custom Monitors

To configure Custom monitors, do the following:
1. In the Custom monitor template, complete the monitor information fields. To learn about monitor information fields, see Monitor Identification on page 141.
2. Complete the following fields:
- Script Name: The name of and path to the script or program on the Monitoring Station that will collect metrics.
Note: The uptime user account on the up.time Monitoring Station must be able to execute the script or program that you use. Ensure that the permissions for the uptime user account are set correctly.
- Arguments (Optional): Specify any arguments that are required by the script or program.
- Output (Optional): Specify a comparison method to override the settings of an Alert Profile or to return only the most severe errors. Do this by selecting an option from the Comparison Method dropdown lists beside the Warning and Critical fields. Then, enter a value in the field. For example, to return only unknown errors, you can select Exa
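As an illustration of the kind of script a Custom monitor could run, here is a minimal sketch. Everything in it is hypothetical: the script, the `percent_used` helper, and the choice of metric are not part of up.time. It also assumes that the monitor compares the text the script prints to standard output against the Warning and Critical values, which is how the Output field above reads but is not spelled out in this excerpt.

```python
import shutil
import sys


def percent_used(path):
    """Return the percentage of disk space used on the volume holding path."""
    usage = shutil.disk_usage(path)
    return round(100.0 * usage.used / usage.total, 1)


def main(argv):
    # The first value entered in the Arguments field (assumed to be passed on
    # the command line) selects the path to check; default to the root volume.
    path = argv[1] if len(argv) > 1 else "/"
    # Print a single value for the monitor to compare with its thresholds.
    print(percent_used(path))


if __name__ == "__main__":
    main(sys.argv)
```

With a script like this, a "greater than" comparison of 80 for Warning and 90 for Critical would be one plausible configuration. The account that runs the script needs execute permission on it, as noted above.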
442. plication by clicking the Logout button.
2. Ensure you are logged in to the Monitoring Station system as the local administrator. up.time may not function properly if the Monitoring Station is installed when you are logged in as a domain or non-local administrator.
3. Double-click the following file: up time 5 0 <build> win32 x86 ex
Where <build> is the number of the up.time build that you are installing. For example: up time 5 0 455 win32 x86 ex
4. On the Introduction screen, click Next.
5. On the License Agreement screen, carefully read the up.time end user license agreement, and then click the I accept the terms of the license agreement option.
6. Click Next.
7. Do one of the following to set the location where up.time will be installed:
- Click Next to accept the default location (C:\Program Files\uptime software\uptime).
- In the Please Choose a Folder field, type the name of the directory where you want to install the application, and then click Next.
- Click Choose and select a directory from the Browse for Folder window.
- To recover the default directory, click Restore Default Folder.
8. Do one of the following to set the location where the up.time DataStore will be installed:
- Click Next to accept the default location (C:\Program Files\uptime software\uptime\DataStore).
- I
443. porting and graphing tools that enable you to visualize performance data. You can use the reports and graphs as the starting point when analyzing problems in your environment.

Understanding Reports

Reports enable you to visually analyze how individual critical resources, such as memory, CPU, and disk resources, are being consumed over a specific period of time. For detailed information about reports, see Using Reports on page 413. If you need to regularly run certain reports, you can save them to the My Portal panel. See Scheduling Reports on page 407 for more information.

Understanding Graphs

You can graph performance information when you need to view the most common or pertinent performance information for servers in your environment. For example, you can use a graph to determine CPU usage or the available capacity on a file system. Graphs give you a fine level of performance detail. You can view graphs in two ways:
- With Internet Explorer in Microsoft Windows. Graphs are rendered using an ActiveX graphing control. You can edit and manipulate a graph once it has been displayed, and you can create trend lines.
- Using the Java graphing tool on any platform (e.g., in Firefox running on Linux).
For more information on graphing, see Understanding Graphing on page 479 and Using Graphs on page 487.

Understanding Agents

Agents are small applic
444. pplication Transaction monitor, then click Continue. The Web Application Transaction Recorder is displayed, and the monitor is now listening on port 8001 for traffic.
4. Begin stepping through the Web transaction as an end user, providing the required data or actions. Every URL visited during the transaction is logged and displayed in the recorder.
Note: The Web Application Transaction monitor records all data inputted during recording; this includes any login information. It is recommended that you use a test account for the Web application; otherwise, any user data will be visible in the recorded script.
5. At each major step in the Web transaction that signals a new analysis point, enter a checkpoint name in the text box at the top of the window, then click Mark Checkpoint. For example, create a checkpoint at a transaction step where the application takes user-inputted data and makes database calls. You will later set Warning and Critical thresholds that apply to every segment declared in your recording. It is recommended that the divisions between your checkpoint intervals are reasonably consistent.
6. Continue to repeat steps 4 and 5 until you have completed enough of the Web transaction to test it, then click Next.
7. Complete the monitor information fields. To learn how to con
445. pplication in Description of Application field.
4. Optionally, select the group of systems in your up.time environment with which this system will be associated from the Parent Group dropdown list. By default, the Application is added to the My Infrastructure group. For more information on groups, see Working with Groups on page 105.
5. Select one of the following options from the dropdown list above the Available Master Service Monitors list:
- the name of a specific system, which displays all its service monitors
- All, which displays all service monitors for every system in your environment
6. Select one or more of the service monitors from the Available Master Service Monitors list, and then click Add.
7. Select one of the following options from the dropdown list above the Available Regular Service Monitors list:
- the name of a specific system, which displays all its service monitors
- All, which displays all service monitors for every system in your environment
8. Select one or more of the service monitors from the Available Regular Service Monitors list, and then click Add.
9. Click Save.
After closing the Add Application window, the name of the newly created Application appears in the My Infrastructure panel as a link that can be clicked to view the
446. pply to the group from the Available Alert Profiles list, then click Add.
6. Select one or more users to add to the group from the Available Users list, then click Add.
7. Select one or more Distribution Lists to add to the group from the Available Distribution Lists, then click Add.
8. Click Save.

Viewing Notification Groups

You can view the details of a Notification Group to ensure that the group is properly configured. The details of a Notification Group include:
- the Alert Profiles assigned to the group
- the users in the group
- whether or not the users are configured to receive alerts
- the conditions on which alerts are sent to the users

To view Notification Groups, do the following:
1. Click Users on the up.time tool bar.
2. In the Tree panel, click View Notification Groups. A list of Notification Groups appears in the Notification Groups subpanel.
3. Click the name of the Notification Group that you want to view. The details of the group appear in the Notification Groups subpanel.
4. To view the details of an Alert Profile, click the name of the profile.

Editing Notification Groups

If you find that a Notification Group is not properly configured, you can edit that group. To edit Notification Groups, do the following:
1. Do one of the following:
- Click the Edit icon beside the Notification Group.
- Click the name of t
447. r no values respectively through the following uptime.conf parameter: auditEnabled=yes

Problem Reporting

When you encounter a problem with up.time, Client Care needs specific information to diagnose and fix the problem. up.time can automatically collect this information and compress it in an archive, which you can send to Client Care.

The archive contains the following: up.time configuration files, system information, log files, database information and error files, and a listing of the DataStore directory. Optionally, the archive will also contain a copy of the configuration data from your DataStore.

The archive is saved to the GUI/problemreports directory on the Monitoring Station and has a file name with the following format: prYYYYMMDD HHMMSS zip
- YYYYMMDD is the date on which the report was generated (e.g., 20061212)
- HHMMSS is the time at which the report was generated (e.g., 142306)

Generating a Problem Report

To generate a problem report, do the following:
1. On the up.time tool bar, click Config.
2. In the Tree panel, click Problem Reporting. If you have generated problem reports in the past, they appear in the subpanel.
3. If you do not want to include a copy of the configuration data from your DataStore, click the Include config database dump option.
4. Click the Generate Report button. A message such as the following appears in the s
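The problem report file name encodes when the report was generated, so the timestamp can be recovered from the name. The sketch below is hedged in two ways: the punctuation in the name format was lost in this copy of the guide, so the separator between the date and time parts and the .zip extension are assumptions, and `report_timestamp` is a hypothetical helper. The pattern therefore accepts any single non-digit separator, or none.

```python
import re
from datetime import datetime

# "pr", eight date digits, an optional one-character separator (assumed,
# since the original punctuation is not visible), six time digits, ".zip".
REPORT_NAME = re.compile(r"^pr(\d{8})\D?(\d{6})\.zip$")


def report_timestamp(file_name):
    """Return the generation time encoded in a problem report file name.

    Returns None when the name does not follow the expected pattern.
    """
    match = REPORT_NAME.match(file_name)
    if match is None:
        return None
    return datetime.strptime(match.group(1) + match.group(2), "%Y%m%d%H%M%S")


# Built from the guide's own example values, 20061212 and 142306.
print(report_timestamp("pr20061212-142306.zip"))
```

A helper like this could be used to sort the archives in the problem reports directory by age before sending the most recent one to Client Care.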
448. r 312 supported versions 312 SNMP monitor 311 adding OIDs 314 deleting OIDs 314 loading MIBs 313 manually adding OIDs 314 MIB browser 312 591 Index Solaris Mutex Exception report 436 Splunk Action Profile 394 Live Splunk Listener monitor 238 query monitor 236 SQL Server Advanced Metrics monitor 266 SQL Server Advanced 266 SQL Server Advanced monitor multiple 266 SQL Server Basic Checks monitor 262 SQL Server Tablespace Check monitor 270 SSH monitor 307 starting up time 49 supported Web browsers 28 Sybase monitor 275 System List Syslist 9 systems adding 69 deleting 111 T testing monitors 152 time period definitions 566 567 Top 10 Disks graph 516 topological dependencies 159 adding 160 viewing 160 U UNIX vs Windows 488 up time administrator account 48 exiting 49 installing 25 interface 6 monitoring concepts 4 overview 2 service information 52 services starting 531 stopping 530 stopping and starting 530 starting 49 starting and exiting 48 system information 50 tool bar 6 592 Config 9 Global Scan 7 My Infrastructure 7 My Portal 7 Reports 9 Services 8 Users 8 viewing information 50 up time Agent Monitor 536 up time Agent monitor 192 upgrading to up time 4 39 uptime conf database 532 NetFlow 542 remote reporting 541 RSS feed 537 Splunk 543 Ul instance 542 Web monitor proxy 540 user groups adding 342 deleting 343 editing 342 overview 341 viewing 342 user roles addi
449. r By. All service instances that you have permissions to view and that match the filtering criteria appear in the subpanel. If, for example, only 12 of the 21 service instances match your criteria, a message like the following one appears in the subpanel: Search found 12 out of 21 services
5. To view all matches, click the Show All button.
6. To remove the filter criteria and restore the complete list of services, click Clear.

Audit Logging

up.time can record changes to the application's configuration in an audit log. The details of the configuration changes are saved in the file audit.log, found in the logs directory. Windows Vista users can find the audit log in the Virtual Store instead of the default location (i.e., C:\Users\uptime\AppData\Local\VirtualStore\Program Files\<uptime install directory>).

There are many uses for the audit log. For example, you can use the audit log to track changes to your up.time environment for compliance with your security or local policies. You can also use the audit log to debug problems that may have been introduced into your up.time installation by a specific configuration change; the audit log enables you to determine who made the change and when it took effect.

The following is an example of an audit log entry: 2006 02 23 12 28 20 082 dchiang
450. r information, files, or devices in a network. The LDAP monitor can check for any settings or information in your LDAP directory. The monitor can start the check from any location within your LDAP directory structure. The LDAP monitor attempts to match information that you have specified with information available in your LDAP directory. If the monitor finds the information, the service monitor returns a status of OK. Otherwise, the monitor returns a Critical error and up.time generates an alert.

Note: If you do not specify any parameters, then this monitor only validates that an LDAP server is listening on the specified port.

Before You Begin

To configure the LDAP monitor, you should understand how an LDAP directory works and know how LDAP is configured in your environment. You can use the following tools to determine the Base, Bind, and Attribute values of the LDAP directory for which you want to search:
- at the Windows command line, use ntdsutil.exe to retrieve information
- one of the many freely available LDAP browsing and editing tools
- your own network documentation, and determine whether or not the proper configurations have been maintained

Configuring LDAP Monitors

To configure LDAP monitors, do the following:
1. In the LDAP monitor template, complete the monitor information fields
451. rademarks are registered trademarks of Microsoft Corporation. Sybase, PowerBuilder, and other such trademarks are the registered trademarks of Sybase Incorporated. All other trademarks belong to their respective companies, property owners, and organizations.

Contacting uptime software

By mail: uptime software inc., 555 Richmond Street West, PO Box 110, Toronto, Ontario, Canada M5V 3B1
Telephone: 416 868 0152
Fax: 416 868 4867

Contacting Sales

To contact sales, use the main telephone line (1 416 868 0152) and follow the prompts. Please have the following information available so we may serve you better:
- Operating systems
- Key applications and databases
- Deployment Timeframe
- Project to deploy
- Key problems
- Present tools

Contacting Support

uptime software delivers responsive customer support. Customer support is available to licensed and demonstration users. uptime software offers user support through the following:
- Documentation
- Application
- Telephone
- E-mail
- Internet site

Before contacting support, consult the up.time User Guide, the up.time Release Notes, or the help system from the Help button in the application. To contact support, use the main telephone line (1 416 868 0152) and select option 2.

TABLE OF CONTENTS

Welcome to up.time
Introducing up.time
Who Should Read This Guide
452. raph
LPAR Workload Graphs
  Generating an LPAR Workload Graph
LPAR CPU Utilization Graphs
Network Graphs
  NetFlow
  Generating a Network Graph
Disk Performance Statistics Graph
  Generating a Disk Performance Statistics Graph
Top 10 Disks Graph
  Generating a Top 10 Disks Graph
File System Capacity Graph
  Generating a File System Capacity Graph
VXVM Stats Graph
  Generating a VXVM Stats Graph
Novell NRM Graphs
  Generating a Novell NRM Graph
Instance Motion Graphs
  Generating an Instance Motion Graph
Displaying Detailed Process Information
  Generating Detail
453. rate the report. You can also schedule reports to be generated and sent by email at particular intervals. See Scheduling Reports on page 407 for more information.

To save reports, do the following:
1. In the Save Report area of the Report subpanel, select one of the following options:
- HTML
- PDF
- XML
- Email
2. If you selected Email in step 1, specify one of the email options.
3. Type a name for the report in the Save to My Portal As field.
4. Optionally, type a description for the report in the Report Description field.
5. Click Save Report.

Saving Reports to the File System

You can save reports to the file system of a server in your environment so others in your organization can view the reports. You can, for example, save a report to a Web server for viewing on your Intranet. The reports are saved as either PDF or HTML files.

The system administrator can specify the directory on the server in which reports will be saved by adding the following entry to the file uptime.conf: publishedReportRoot=<directory name>
Where <directory name> is the directory into which up.time will write reports, for example C:\Program Files\uptime software\uptime. The report files are saved to a subdirectory named GUI/published. You need permissions to write to the published directory.

up.time automatically names each report file. The file name contains the fo
454. rd that is required to login to the database.
- SID: The Oracle System Identifier (SID) that identifies this Oracle instance. The SID defaults to the database name. The SID is included in the CONNECT_DATA paths of the connect descriptors in the tnsnames.ora file, and in the definition of the TNS listener in the listener.ora file.
Note: up.time will attempt to connect to the database. If connection fails, the database returns a SQL exception error. If you do not complete the Username and Password fields
- Buffer Cache Hits Ratio: Enter the Warning and Critical thresholds for buffer cache hits that are completed without accessing disk I/O. To gather as much application data as possible, you should enter a high buffer cache hits ratio. An Oracle database maintains its own buffer cache inside the system global area for each instance. A properly sized buffer cache can yield a cache hit ratio over 90%. If a buffer cache is too small, the cache hit ratio will be small and the database uses more physical disk I/O. If a buffer cache is too large, then parts of the buffer cache will waste memory resources.
- Data Dictionary Cache Hits Ratio: Enter the Warning and Critical thresholds for data dictionary cache hits that are completed without accessing disk I/O. The data dictionary cache tables provide information about all of the objects stored in your dictionary, for example
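To make the Buffer Cache Hits Ratio thresholds above concrete, the following sketch applies the formula commonly used by Oracle administrators: the share of logical reads (db block gets plus consistent gets) that did not need a physical read. The guide does not state how up.time derives the value, so treat this as an assumption; `buffer_cache_hit_ratio` is a hypothetical helper and the counts are made-up sample numbers.

```python
def buffer_cache_hit_ratio(physical_reads, db_block_gets, consistent_gets):
    """Return the buffer cache hit ratio as a percentage.

    Uses the commonly cited formula:
        100 * (logical reads - physical reads) / logical reads
    where logical reads = db block gets + consistent gets.
    Returns None when there have been no logical reads yet.
    """
    logical_reads = db_block_gets + consistent_gets
    if logical_reads == 0:
        return None
    return 100.0 * (logical_reads - physical_reads) / logical_reads


# Sample counters: 100,000 logical reads, of which 5,000 went to disk.
ratio = buffer_cache_hit_ratio(5_000, 20_000, 80_000)
print(ratio)  # 95.0
```

In this sample the ratio sits above the 90% mark that the text associates with a properly sized buffer cache, so a Warning threshold set slightly below that level would not be triggered.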
455. re information
3. Click Finish.

Plug-In Monitors

up.time can be integrated with plug-in monitors that are not part of the standard distribution. Plug-in monitors are custom service monitors that have been created by uptime software or other up.time users. The benefit of sharing plug-in monitors is that uptime customers with relatively unique, but not exclusive, monitoring needs can share the results of their efforts with each other. Additionally, if uptime software creates a custom plug-in monitor for a customer's environment, this monitor would then be available to all customers.

The uptime Support Portal is the host to all plug-in monitors. There, you can find and download a plug-in monitor archive before installing it on your Monitoring Station. All plug-in monitors that have been installed will always appear in the Add Service Monitor window, ready to be configured as would any pre-packaged system monitor.

[Screenshot: the Advanced Monitors section of the Add Service Monitor window, listing monitors such as Custom, Custom with Retained Data, and External Check, with the link "Can't find the monitor you're looking for? Click here to find more plug-in monitors."]

Installing Plug-In Monitors

To use a plug-in monitor with up.time, do the following:
1. Download the plug-in monitor from the uptime Support Portal.
2. Locat
456. read. INSERT DELAYED also bundles inserts from multiple clients and writes them in one block. The DELAYED option has the following constraints:
• it only works with MEMORY tables
• INSERT DELAYED can only be used for INSERT statements that specify value lists, as the server ignores DELAYED for INSERT DELAYED ... SELECT statements
• the server ignores DELAYED for INSERT DELAYED ... ON DUPLICATE KEY UPDATE statements
• you cannot use LAST_INSERT_ID() to get the AUTO_INCREMENT value the statement might generate, because the statement returns immediately, before the rows are inserted
• DELAYED rows are not visible to SELECT statements until they actually have been inserted
• Delayed Errors: The number of delayed insert threads that had an error.
• Max Used Connections: The maximum number of connections that have been in simultaneous use since the server was started.
• Open Files: The number of open files that must be exceeded before up.time generates an alert.
• Open Streams: The number of open data streams that must be exceeded before up.time generates an alert.
Database Monitors: MySQL Advanced Metrics
• Table Locks Im
457. reads are waiting for the same resource. During processing, the Solaris kernel maintains locks on various resources. The kernel allocates enough mutex locks to allow multiple CPUs to complete their work simultaneously. However, if two or more CPUs try to get the same lock at the same time, all but one CPU will stall.
The Solaris Mutex Exception report pinpoints multi-processor Solaris systems that have a high number of mutex stalls. The report contains the following information:
• the display name in up.time of the system
• the number of CPUs on the system
• the average number of mutex stalls for all the CPUs on the system over the time period that you specified; if this value exceeds the threshold that you set, it is highlighted in red
Creating a Solaris Mutex Exception Report
To create a Solaris Mutex Exception report, do the following:
1. In the Reports Tree panel, click Solaris Mutex Exception.
2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. If no data is available for the date range, the report displays a message indicating that there is no data for the time period.
3. If you want the report to only include data from certain hours during the day, select those hours from the dropdown lists in the Daily Hours section, as shown below:
Daily Hours: Include da
458. red, enter an appropriate administrative LDAP Username and LDAP Password required to access the directory.
6. In the User Name field, provide the attribute used to retrieve the user name. For LDAP synchronization, a user name is the minimum amount of directory information up.time needs to map to a user profile.
7. For the remaining Field Mappings, provide attributes for other user details you would like to synchronize with the up.time user profile:
i. First Name
ii. Last Name
iii. Location
iv. Email Address
v. Pager Cellphone
vi. User's Windows Desktop Host Name
vii. User's Windows Desktop Workgroup
Any user attributes chosen to be synchronized with the directory will not be editable in up.time.
8. Select a User Role to which any newly detected users will be assigned.
9. Select a User Group to which any newly detected users will be assigned.
10. Click Save.
Once saved, up.time will synchronize its list of users with the up.time group in the LDAP listing at the specified interval.
up.time DataStore Authentication
By default, up.time uses its own database for password storage and look-up. If you are switching back to using the DataStore from a central AD or LDAP directory, all up.time users created while either was used as the authentication method will no longer have passwords. You will need to modify all existing user accounts to include passwords
459. ripts directory on the Splunk server.
4. Configure a Live Splunk. For information on configuring Live Splunks, see the Splunk user manual. When setting up your Live Splunk, select the Run the shell script option on the configuration page. Then, enter the path to alertUptimeStatusHandler.sh in the field.
Application Monitors: Live Splunk Listener
Configuring the Live Splunk Listener Monitor
To configure a Live Splunk Listener monitor, do the following:
1. Complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141.
2. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
• Alert Profile settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
3. Click Finish.
CHAPTER 12: Database Monitors
The database monitors track the performance and health of the following:
MySQL Advanced Metrics 244
MySQL Basic Checks
460. rmance history shows that a 95% service level is attainable if the IT department is able to isolate and improve key underperforming systems.
[Chart: SLA History for Application Server Performance, daily service level plotted on a scale of 80.0 to 100.0]
Working with SLA Reports
up.time provides two types of SLA reports. The SLA Summary report provides high-level SLA compliance information, and the SLA Detailed report provides SLO and service-level compliance information for system
461. rmance of Elements in your environment
• perform common database administration tasks
For more information, see Advanced Monitors on page 321. Contact uptime software Client Care for assistance with configuring advanced monitors.
Types of Advanced Monitors
There are three advanced monitors:
• Custom: Monitors that return the status of a monitor and an automated message to clarify the returned status.
• Custom with Retained Data: Monitors that return the following:
  - up to 10 values that you can capture and evaluate
  - a return status
  - a message
  You can also configure these monitors to save data to the database, which you can use to generate a Service Metrics report (see Service Monitor Metrics Report on page 425) or a Service Metrics graph (see Viewing System and Service Information on page 50).
• External Check: Monitors that rely on an external event to trigger the capture of service information. External check monitors enable you to determine when to collect service data based on an external application event that you specify.
For more information on configuring and using advanced monitors, see Advanced Monitors on page 321.
Using Service Monitors
Selecting a Monitor
To select a monitor, do the following:
1. Click Services on the up.time tool bar.
2. Click Add Service Instance in the Tree panel. The Add Service Monitor window appears
462. rom the Choose CPUs to graph list.
7. Click Generate Graph.
Graphing Memory Usage
up.time uses the following graphs to chart memory usage on a system:
• Used
• Cache Hit Rate
• Paging Statistics
• Free Swap
These graphs use the same input criteria, but they return different data. For information on how to generate these graphs, see Generating a Memory Usage Graph on page 500.
Used
This graph charts the amount of memory being used on a system. Used memory is the amount of physical memory occupied by the operating system, system library files, and applications.
Cache Hit Rate
This graph indicates how effectively buffers are controlling the flow of data between disks and the system. CPU cache is a small store of free memory that is used by frequently performed tasks for repeated fast disk access. The cache hit rate measures how often the system accesses the CPU cache. The cache hit rate calculations are taken from the following metrics:
• The number of transfers between the system buffers and various disks.
• The number of times the system buffer was accessed.
Cache read efficiency should be close to 100%. Cache write efficiency should be approximately 66%. However, low percentages do not always indicate performance problems.
Paging Statistics
This graph indicates whether or
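The Cache Hit Rate description above names two input metrics but not the formula that combines them. As a rough sketch only, assuming the common convention that the hit rate is the share of buffer accesses that did not need a disk transfer, the calculation might look like the following. The function and parameter names are hypothetical and are not part of up.time.

```python
def cache_hit_rate(disk_transfers: int, buffer_accesses: int) -> float:
    """Return the percentage of buffer accesses served without a disk transfer.

    Assumed formula (not taken from the guide):
        hit rate = (buffer_accesses - disk_transfers) / buffer_accesses * 100
    """
    if buffer_accesses <= 0:
        # No buffer activity in the sample: report 0.0 instead of dividing by zero.
        return 0.0
    rate = (buffer_accesses - disk_transfers) * 100.0 / buffer_accesses
    # Clamp the result, because counters sampled at slightly different
    # moments can produce values just outside the 0 to 100 range.
    return max(0.0, min(100.0, rate))
```

With this convention, 34 disk transfers for every 100 buffer accesses corresponds to the 66% write efficiency mentioned above.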
463. roperly, to ensure that the configuration is correct. If the configuration is not correct, then you can immediately fix any configuration errors before they become a problem.
To test a service monitor, do the following:
1. On the up.time toolbar, click the Services tab.
2. In the navigation menu, click View Service Instances. A list of available service monitors appears in the sub panel.
3. Click the name of the service monitor that you want to test.
4. Click the Test Service Instance button. A pop-up window appears containing the status of the monitor and a message related to the status. The following image illustrates such a message:
[Screenshot: Test Service Instance window reporting an OK status for the up.time agent running on the host hpinteg]
5. When finished, click the Close Window button.
Service Groups
Service groups are monitor templates that enable you to simultaneously apply a common service check to one or more hosts that you are monitoring. Defining and using service groups can simplify the setup and maintenance of common service checks that you want to perform across multiple hosts. When adding a host to up.time, you assign a service group to it instead of manually adding service checks. For more information, see Understanding Service Groups on page 20.
Creating Service Groups
To create service groups, do the following:
1. O
464. … 2
up.time Architecture 3
up.time Service Monitoring Concepts 4
Understanding up.time
Understanding the up.time Interface 6
Tool Bar 6
System List 9
… 10
System … 11
Understanding Reports and Graphs 12
Understanding Reports 12
Understanding Graphs 12
Understanding Agents 13
Understanding Major and Minor Versions 13
Understanding the up.time DataStore 15
Connecting to the DataStore Using ODBC 15
Understanding Service Monitors 17
Understanding Database Monitors 17
Understanding Agentless Monitors Using Net-SNMP 17
Understanding Services 20
Understanding Service Groups 20
Understanding the Status of Services 21
Understanding Dates and Times 22
Understanding Retained Data 24
Installing up.t
465. rs Are Authenticated on page 349.
If you have set a user password, re-enter it in the Confirm Password field.
Enter the full name of the user in the First Name and Last Name fields.
Optionally, enter the user's geographical location or department in the Location field.
If the user will be receiving alerts via email, enter the user's email address in the Email Address field.
8. Select one of the following options from the Time Period for Emailing dropdown list:
• 24x7
• 9am to 5pm weekdays
• another Monitoring Period that you have previously created
9. If the user will receive alerts on their cell phone or pager, enter the email address of the user's cell phone or pager in the Pager Cellphone Address field. The email address takes the following format:
<number>@mobile_provider_domain
Where <number> is the user's cell phone number and mobile_provider_domain is the Internet domain of the user's mobile phone service. For example:
1234567890@mymobile.com
10. Select an option from the Time Period for Pager Cellphone Messages dropdown list. The options are the same as the ones listed in Step 8.
11. If the user will receive alerts via the Windows messaging service, enter the name of the user's computer in the User's Windows Desktop Hostname field.
To receive popup alerts, you must enable the Windows messaging service on the
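The pager or cell phone address described above is an ordinary email address built from the phone number and the provider's domain. The following minimal Python sketch shows one way to assemble and sanity-check such an address before typing it into the field. The helper name pager_address is hypothetical; up.time itself only expects the finished address.

```python
def pager_address(number: str, mobile_provider_domain: str) -> str:
    """Build an address of the form <number>@mobile_provider_domain."""
    # Keep digits only, so "123-456-7890" and "(123) 456 7890" both work.
    digits = "".join(ch for ch in number if ch.isdigit())
    if not digits:
        raise ValueError("phone number must contain at least one digit")
    # Tolerate a domain pasted with surrounding spaces or a leading "@".
    domain = mobile_provider_domain.strip().lstrip("@")
    if not domain:
        raise ValueError("mobile provider domain must not be empty")
    return f"{digits}@{domain}"
```

For the example values used above, pager_address("123-456-7890", "mymobile.com") produces 1234567890@mymobile.com.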
466. ry
  - the amount of free memory
  - the amount of free swap space
• Processes
  - the name of a process
  - the ID of a process (PID)
  - the amount of memory used by a process
  - process run time, in centiseconds, on the CPU
  - the number of running processes
• Network
  - the name of the network interface
  - the number of kilobytes flowing into the interface per second
  - the number of kilobytes flowing out of the interface per second
  - the number of inbound errors
  - the number of outbound errors
• File System
  - the name of the file system
  - the size of the file system
  - the amount of the file system that is being used
• User
  - the number of users who are logged into the system
For more information on SNMP and Net-SNMP, see SNMP on page 311.
Understanding Services
Services are specific tasks, or sets of tasks, performed by an application in your environment. For example, network services such as FTP or TCP transmit data in a network. Database services such as Oracle, SQL Server, MySQL, or Sybase store and retrieve data in a database. up.time service monitors continually check the condition of services to ensure that they are providing the functions required to support your business. up.time service monitors use a co
467. s
Using the WebSphere Report
Since WebSphere is large and complex, it can be difficult to pinpoint the source of a problem with the server or an application running on the server. This is especially true when that problem is intermittent. Watching for problems in real time only gives you a snapshot of the problem. The up.time WebSphere report, on the other hand, gives you a detailed historical perspective of the problem. Using the information in the report, you can find the source of the problem.
For example, users have trouble working with an application that intensively uses a database. Checking the Connection Pool charts section of a WebSphere report could indicate the source of the problem: the database has reached its maximum number of connections.
[Chart: WebSphere Server Connection Pool, pool size over time for DefaultDatasource, jdbc_DefaultEJBTimerDataSource, and jdbc_PlantsByWebSphereDataSource]
You can then adjust the size of the database connection pool to allow more connections. Or, if a WebSphere application is using a large amount of memory, you could check the JVM charts section of the report. If there are spikes in the heap size
468. s on page 529.
Changes to Global Scan thresholds are not retroactively applied to all Elements; only Elements added after threshold changes will reflect those changes.
up.time Measurement Tuning
Changing Global Scan Threshold Settings
You can modify the Global Scan threshold settings through the following parameters (default values are shown):
• globalscan.cpu.warn=70: A Warning-level status is reported when CPU usage is at 70% or greater.
• globalscan.cpu.crit=90: A Critical-level status is reported when CPU usage is at 90% or greater.
• globalscan.diskbusy.warn=70: A Warning-level status is reported when a disk on the host is busy for 70% or more of a five-minute time frame.
• globalscan.diskbusy.crit=90: A Critical-level status is reported when a disk on the host is busy for 90% or more of a five-minute time frame.
• globalscan.diskfull.warn=70: A Warning-level status is reported when 70% or more of the disk space on the host is used.
• globalscan.diskfull.crit=90: A Critical-level status is reported when 90% or more of the disk space on the host is used.
• globalscan.swap.warn=70: A Warning-level status is reported when 70% or more of the swap space on a disk is in use.
• globalscan.swap.crit=90: A Critical-level status is reported when 90% or more of the swap space on a disk is in use.
Configuring and Managing up.time: up.time Measur
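The Global Scan threshold parameters listed above follow a simple rule: a value at or above the warn setting is a Warning, and a value at or above the crit setting is Critical. The Python sketch below illustrates that rule with the default values. It assumes the parameters are written as dotted key=value entries (the punctuation is an assumption), and the function names and the OK, WARN, and CRIT labels are hypothetical; this is not how up.time itself is implemented.

```python
DEFAULT_SETTINGS = """\
globalscan.cpu.warn=70
globalscan.cpu.crit=90
globalscan.diskbusy.warn=70
globalscan.diskbusy.crit=90
globalscan.diskfull.warn=70
globalscan.diskfull.crit=90
globalscan.swap.warn=70
globalscan.swap.crit=90
"""


def parse_thresholds(text: str) -> dict:
    """Parse globalscan.<metric>.<level>=<value> lines into a nested dict."""
    thresholds = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        parts = key.strip().split(".")
        if len(parts) != 3 or parts[0] != "globalscan":
            continue  # ignore unrelated settings
        _, metric, level = parts
        thresholds.setdefault(metric, {})[level] = float(value)
    return thresholds


def classify(metric: str, percent: float, thresholds: dict) -> str:
    """Return OK, WARN, or CRIT for a measured percentage."""
    levels = thresholds[metric]
    if percent >= levels["crit"]:
        return "CRIT"
    if percent >= levels["warn"]:
        return "WARN"
    return "OK"
```

Checking the Critical level first matters: a value of 95 also exceeds the Warning threshold, but it should be reported at the more severe level.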
469. s you can specify the maximum size of data files to prevent disk drives from running out of space. If you do not specify the size of data files, the database assumes that the size is unlimited. up.time measures the size of data files and log files as a percentage of their maximum size. If a data file has an infinite maximum size, the percent of maximum datafile size must be near zero. You should always specify the maximum size of each data file.
The following diagram illustrates six data files in three file groups in three databases across two instances of a system:
[Diagram: SQL Server Instance_A and SQL Server Instance_B, each containing databases, filegroups, and data files]
If you set SQL Server Instance_B with a Critical threshold of 90% and a Warning threshold of 70%, the SQL Server Tablespace Check monitor watches the size of all data files in that instance. The monitor sends an alert if any of the files reaches or exceeds the defined thresholds.
Configuring SQL Server Tablespace Check Monitors
To configure SQL Server Tablespace Check monitors, do the following:
1. In the SQL Server Tablespace Check monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on pag
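The percent-of-maximum measurement described above can be illustrated with a short Python sketch. It assumes that an unlimited file is represented by a maximum size of None and is treated as 0% used, which matches the "near zero" behavior the text describes; the function names and status labels are hypothetical and are not taken from up.time.

```python
from typing import Iterable, Optional, Tuple


def percent_of_max(size_mb: float, max_size_mb: Optional[float]) -> float:
    """Return the file size as a percentage of its maximum size."""
    if max_size_mb is None:
        # Unlimited maximum size: any finite size is effectively 0% of it.
        return 0.0
    if max_size_mb <= 0:
        raise ValueError("maximum size must be positive")
    return size_mb * 100.0 / max_size_mb


def worst_status(files: Iterable[Tuple[float, Optional[float]]],
                 warn: float = 70.0, crit: float = 90.0) -> str:
    """Return the most severe status across all (size, max size) pairs."""
    status = "OK"
    for size_mb, max_size_mb in files:
        pct = percent_of_max(size_mb, max_size_mb)
        if pct >= crit:
            return "CRIT"  # any single file at or over Critical decides it
        if pct >= warn:
            status = "WARN"
    return status
```

The sketch also shows why an unlimited file can never trigger an alert under this kind of check, which is the reason the text recommends always specifying a maximum size.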
470. s from the Monitoring Period dropdown list to specify when alerts can be sent:
• 24x7
• 9 am to 5 pm weekdays
• 5 pm to 7:30 am weekdays, and all weekend until Monday morning
• 12 am to 12:30 am Monday
Getting Additional Help
If you need more information about certain fields on the monitor template, hold your mouse over the inverted chevron beside the name of the field. A tool tip that describes the field will be displayed.
Cloning Service Monitors
Cloning a service monitor makes a copy of the service monitor and all of its parameters. Cloning a service monitor is useful if, for example, you want to use similar monitors for several servers in your environment.
To clone service monitors, do the following:
1. On the up.time tool bar, click Services.
2. In the Service Instances subpanel, click the Clone icon beside the name of the service monitor. A copy of the monitor template for the service monitor appears.
3. Enter information in the fields of the monitor template. As a minimum, you must:
• enter a new name for the monitor in the Service Name field
• select a system to which you want to apply the monitor from the Host dropdown list
4. Click Save.
Testing Service Monitors
You can test that a service monitor is functioning and collecting data p
471. s from the dropdown lists in the Daily Hours section, as shown below:
[Screenshot: Daily Hours section, "Include data samples between these hours only", with Start and End dropdown lists]
For example, if you want the report to cover the hours from 8:00 a.m. to 6:00 p.m., select 8:00 from the Start dropdown list and 18:00 from the End dropdown list.
If you are monitoring systems that store network traffic data in packets rather than bytes, enter a conversion ratio in the Bytes per Packet field. For example, you can specify a conversion ratio of 1,000 bytes per packet. The default is 750 bytes per packet.
To generate reports for groups of systems, select the groups from the List of Groups area.
To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views.
If you are generating reports for specific Applications in your environment, select them from the List of Entities.
Select a report generation option. See Report Generation Options on page 402 for details.
To save the report, or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information.
Using the Network Bandwidth Report
The following is an example of a Network Bandwidth repor
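The Bytes per Packet setting described above is a plain multiplication. As a small illustration in Python, using the 750-byte default stated in the text, the conversion could be sketched as follows. The function name is hypothetical, and the sketch only shows the arithmetic, not how up.time applies it internally.

```python
DEFAULT_BYTES_PER_PACKET = 750  # default conversion ratio stated in the guide


def packets_to_bytes(packets: int,
                     bytes_per_packet: int = DEFAULT_BYTES_PER_PACKET) -> int:
    """Estimate traffic in bytes for a system that reports packet counts."""
    if packets < 0 or bytes_per_packet <= 0:
        raise ValueError("packets must be >= 0 and bytes_per_packet must be > 0")
    return packets * bytes_per_packet
```

Because the ratio is an average, the result is an estimate: the same 2,000 packets become 1,500,000 bytes at the default ratio and 2,000,000 bytes at a ratio of 1,000.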
472. same time. Packets that are involved in a collision are broken into fragments and must be retransmitted.
NetFlow
The NetFlow graphing function transfers you to your Scrutinizer instance. For node-type Elements that are exporting data to Scrutinizer, a graph that covers a specified time frame is generated. It shows the monitored node's bi-directional throughput rates through known ports, which are determined based on use by all known applications. For other Elements, the generated graph shows network traffic from the host, allowing you to pinpoint heavy users. See Generating a Network Graph for information on how to generate this graph.
Generating a Network Graph
To generate network graphs, do the following:
1. In the Global Scan or My Infrastructure panel, click the name of the system whose information you want to graph.
2. In the Tree panel, click the Graphing tab.
3. Click one of the following options:
• I/O
• Errors
• NetFlow (available if up.time has been integrated with Scrutinizer)
4. For I/O and Errors graphs, select the start and end dates and times for which the graph will chart data. For NetFlow, select one of the set time frames. For more information, see Understanding Dates and Times on page 22.
5. For I/O and Errors graphs, select one or more network interfaces from the Available Interfaces list, and then click Add.
6. Click Generate Graph.
473. se Time: Enter the Warning and Critical Response Time thresholds for the length of time a service check needs to complete. For more information, see Configuring Warning and Critical Thresholds on page 144.
3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
4. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
• Alert Profile settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
5. Click Finish.
Oracle Basic Checks
The Oracle Basic Checks monitor does the following:
• determines whether or not a host running an Oracle database is available
• determines whether or not an Oracle service is running on a system
• determines whether or not you can log into an Oracle database
• evaluates a response based on a script that you have executed against a database or database instance
Use the Oracle Tablespace Check monitor (see Oracle Tablespace C
474. … 117
Groups and Views in the Global Scan Panel 118
Viewing All SLAs 119
SLA Status … 120
Generating an SLA Detailed Report 121
SLA View Types 121
Viewing All Applications 124
Condensed View 125
Detailed View 126
Viewing All Elements 127
Viewing All Services 129
Viewing the Resource Scan Report 130
Performance Gauges 130
24 Hour Performance Graphs 131
… 131
Viewing Scrutinizer Status 133
Changing Reporting Thresholds 134
Using Service Monitors
Overview 136
Using Service Monitors 137
Using Agent Monitors 137
Using Agentless Monitors 138
Using Ad
475. select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22.
3. Click the Display entity custom fields option to insert the content of the custom fields in the system profile into the report. The custom fields contain additional information about the system, for example, the types of reports that should be run on this system or when maintenance is scheduled. For more information, see page 100.
4. In the Target Machine area, do the following to specify the hardware of the server on which the other servers will be consolidated:
• Select the type of processor used on the target server from the Architecture dropdown list:
  - Alpha: A 64-bit processor from HP.
  - Itanium: A 64-bit processor from Intel.
  - x86: A standard 32-bit processor.
  - Sparc: The range of SPARC processors used on systems that run the Solaris operating system.
  - POWER: The POWER5 processor used with IBM p-series and i-series servers.
• Select the number of CPUs on the target system from the Num CPUs dropdown list. Then, enter the processor speed of the CPUs in the MHz field. For example, if the target system has four CPUs and each has a processor speed of 1,000 MHz, select 4 from the dropdown list and enter 1000 in the field.
• Select the type of disk interface that is used on the target server from the Disk I/O dropdown list:
  - A
476. settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
11. Click Finish.
... and Diagnosing Web Transaction Performance
To view Web transaction performance via playback, create a Service Metrics graph for the Web Application Transaction monitor's system. To generate a Service Metrics graph, select either the system with which the Web Application Transaction monitor is associated in My Infrastructure, or the monitor itself in the main Services panel. Click the Graphing tab, then click Service Metrics.
The Service Metrics graph shows how long each transaction segment took to complete during playback and, in doing so, provides an end-to-end performance snapshot of the components of your infrastructure that deliver applications to users. For example, the following metrics graph shows that the execution of the commands found in checkpoint 3 took excessively long to complete:
[Screenshot: Service Metrics graph of checkpoint times for a Web Application Transaction monitor]
Since other checkpoints performed well, the poor performance of a single checkpoint indicates possible issues with
477. specific EJB or set of EJBs.
• Servlet Name Regex Filter: A regular expression used to limit metrics collection to a specific servlet.
• JDBC Resource Name Regex Filter: A regular expression used to limit metrics collection to a specific JDBC resource.
4. Specify a warning and critical threshold for the following:
• the appropriate WebLogic metrics. For more information about each metric, see page 204.
• Response Time: This is the length of time a service check takes to complete.
For more information on using thresholds to set alerts, see Configuring Warning and Critical Thresholds on page 144.
5. To save the data from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside each of the metrics that you selected in the previous step.
6. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
• Alert Profile settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
7. Click Finish.
WebSphere
WebSphere is a software pla
478. sses and there are 20 running on the system, up.time retrieves the 10 busiest processes.
• Is monitored: Click this checkbox to turn monitoring off for this system. If monitoring is turned off, the system will not appear in the Global Scan panel.
Click Save.
Working with Applications
An Application provides the overall status for one or more services. You can, for example, add an Application that checks the status of a system's Web services, database, and file system capacity.
When creating an Application, you must specify the following:
• master service monitor(s): One or more monitors can be used to determine the status of the Application as a whole.
• regular service monitors: Other service monitors that are associated with a master service monitor, but are not used to determine the status of the Application as a whole.
For more information on services, see Using Service Monitors on page 135. For information on viewing information about Applications, see Viewing Details About Applications on page 103.
Adding Applications
To add an Application, do the following:
1. In the My Infrastructure panel, click Add Application.
2. In the Add Application window, enter a descriptive name for the Application in the Name of Application field. This name will appear in both the My Infrastructure and Global Scan panels.
3. Optionally, enter a description for the A
479. st to identify recurring problems that affect business outcomes.
SLAs, Service Monitors, and SLOs
Like other up.time Elements (i.e., systems, network devices, and Applications), an SLA definition consists of service monitors that you have previously created. Depending on its use, an SLA can consist of a single service level objective (SLO) that in turn consists of a single service monitor. In other cases, an SLA's coverage can be broad enough to include an ungainly list of service monitors; in this case, the SLA can be refined to consist of multiple SLOs that focus on different aspects of the SLA. Creating multiple objectives helps you further refine your performance targeting and reporting.
For example, consider an SLA called Web Application that focuses on IT performance for end users. The SLA's objectives could be broken down by performance:
• SLO 1: application availability. The application is available 99% of the time (e.g., using an HTTP monitor).
• SLO 2: application speed. The application's Web transactions always complete in fewer than 10 seconds (e.g., using the Web Application Transaction monitor).
Consider another example: an SLA called Customer Service Group that focuses on the operational readiness of a support team. The SLA's objectives could be broken down by application:
• SLO 1: helpdesk application
• SLO 2: bug tracking
480. structure Overview: The Global Scan panel enables you to view the current status of all of the Elements (servers and devices), Applications, and SLAs in your environment. When initially viewed, the Global Scan panel typically contains a list of all the Elements that are being monitored by up.time, as shown below. [Screenshot: Global Scan panel, showing the Elements table (Service Status, Outages, CPU, Disk, and Memory columns), Service Status by Group, and Recent Outages] The Elements table displays the following information: • the status and number of services that are associated with the Element • the number of recent service outages • CPU usage • hard disk usage • memory usage. Service status indicators range from normal (green) to Warning (yellow) to Critical (red), and also include an Unknown state (gray). An Unknown state i
481. subpanel lists the following information: • the names of the Elements in your environment, including the source Local Datacenters prefix names • the status of the services that are assigned to each Element • the number of outages over the last hour, 12 hours, and 24 hours • the percentage of CPU resources being consumed by users, the system, and by disk I/O • the percentage of the system disk that is being used, and the percentage that is busy • the amount of memory swap space that is being used. If up.time cannot contact an Element, then the following message is displayed: The availability check has failed. The values in each column are hyperlinks. Click one of the links to display the following information in the system information or graphing subpanels: • Click any value in the OK, WARN, CRIT, MAINT, or UNKNOWN columns to open the Status subpanel. See Status on page 52 for more information. • Click any value in the Outages column to open the Outages subpanel. See Outages on page 53 for more information. • Click any value in the USR, SYS, WIO, or TOT columns to open the Usage Busy report subpanel. See Usage busy on page 491 for more information. • Click any value in the Used column to open the File System Capacity report subpanel. See File System Capacity Graph on page 518 for more information. • Clic
482. [Sample Network Bandwidth Report: date range 2008-04-17 00:00:00 to 2008-04-17 10:59:19, between the time range 00:00 to 23:59; bytes per packet: 750; columns: Hostname, Interface, Total MB In, Total MB Out] In this example, the system Filter has high levels of network traffic flowing in and out of a particular network interface. Based on this information, you can generate a Network graph (see page 511 for more information) to get a better idea of why network I/O is so high on the system. Disk I/O Bandwidth Report: The Disk I/O Bandwidth report keeps track of the amount of data being read from and written to a disk on a system. The report can display the amount of data either as blocks or megabytes. The report contains the following information: • the display name of the system in up.time • the names of each disk on the system
483. t Any blocking of data required for compliance under this Agreement is considered to be a violation of this Agreement and will result in immediate termination of this Agreement pursuant to Section 4. 2.2 License Automatic Update and Expiration: Your license may include an expiration date that can result in the termination of the license. For permanent license keys, the license updates will be available to you upon payment of the appropriate then-current Uptime license fees. You must contact Uptime to take the appropriate steps to obtain the permanent key. If your license key is stolen, or if you suspect any improper or illegal usage of your license outside of your control, you should promptly notify Uptime of such occurrence. A replacement license will be issued to you and the suspect license will be allowed to expire. For your convenience, Uptime provides license expiration warnings in the product interface should there be any issues that would cause the license to expire. It is your responsibility to contact Uptime regarding any potential expiration. Uptime is not liable for any damages or costs incurred in connection with an expiring license. 2.3 Proprietary Rights to Software and Trade Marks: You acknowledge that the Software and the Documentation are proprietary to Uptime, and the Software and Documentation are protected under Canadian copyright law and international treaties. You further a
484. t SE 38 of allowable downtime used 99 39 of target 99 0 OK: The SLA is performing within its target. For more information about what kind of SLA information you can view in the Global Scan panel, see Viewing All SLAs on page 119. Viewing SLA Details: The details of an SLA definition can be viewed in the Service Level Agreement General Information subpanel. This can be accessed from the My Infrastructure panel by clicking the SLA name listed among the Elements, or from the Global Scan panel by clicking the Info tab in the Tree panel, then clicking Info. [Screenshot: Service Level Agreement General Information subpanel] The General Information subpanel displays a summary for the SLA that includes the following: • Target Percentage: the targeted percentage of uptime of the SLA's component services over the Monitoring Period • Monitoring Period: the days and time frames during which uptime is measured • Compliance Period Type: the compliance period intervals over which SLA compliance is me
485. t, then click Add. 7. Optionally, select one of the views from the Available Entity Views list, then click Add. 8. Click Save. Viewing User Groups: To view user groups, do the following: 1. In the Tree panel, click View User Groups. A list of user groups appears in the User Groups subpanel. Editing User Groups: To edit user groups, do the following: 1. In the Tree panel, click View User Groups. 2. Do one of the following: • Click the Edit icon beside the name of the user group. • Click the name of the user group whose information you want to edit, and then click Edit User Group in the User Group subpanel. The Edit User Group window appears. 3. Edit the information as described in the section Adding User Groups on page 342. Deleting User Groups: To delete user groups, do the following: 1. In the Tree panel, click View User Groups. 2. Click the Delete icon beside the name of the user group that you want to delete. You cannot delete the SysAdmin user group. 3. On the warning dialog box that appears, click OK. Managing Distribution Lists: A Distribution List allows you to use an email alias to send alerts to end users who, aside from wanting to be informed of status alerts, have no other reason to use u
486. [Screenshot: Quick Snapshot, showing outages over the last 24 hours and the current top 10 processes] Generally speaking, you can access a Quick Snapshot for an Element by clicking the Graphing tab, then clicking Quick Snapshot in the Tree panel. Monitoring CPU Performance: up.time uses the following graphs to chart the performance of one or more CPUs on a system: • Usage busy • Run Queue Length • Run Queue Occupancy. These graphs use the same input criteria, but they return different data. For information on how to generate these graphs, see Generating a CPU Performance Graph on page 494. Usage busy: The Usage Busy graph charts the percentage of a system's CPU resources that are being used over a period that you specify. This graph displays three components of CPU time: user, system, and wait I/O. Taken together, these components display the total amount of CPU usage. On a system with multiple CPUs, the numbers are averages across all CPUs. CPU Usage in Windows: The key CPU usage metric in Windows is Usr Time, which monitors the amount of time the CPU spends processing a thread that is not idle. If usage is consistently at 80% to 90%, you may need to upgrade the CPU or add more processors. You should monitor a separate instance of this count
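The relationship described above (user, system, and wait I/O adding up to total CPU usage, averaged across CPUs) can be worked through in a few lines. This is a hedged sketch: the function names and sample values are hypothetical, and reading "consistently at 80% to 90%" as "every sample at or above 80%" is an assumption made for illustration.

```python
def total_cpu_usage(user, system, wait_io):
    """Total CPU usage is the sum of the user, system, and wait I/O shares."""
    return user + system + wait_io


def average_across_cpus(per_cpu_samples):
    """Average the total usage over all CPUs, given (user, system, wait_io) tuples."""
    totals = [total_cpu_usage(*sample) for sample in per_cpu_samples]
    return sum(totals) / len(totals)


def consistently_high(usage_history, threshold=80):
    """Assumption: 'consistently' means every sample is at or above the threshold."""
    return bool(usage_history) and all(u >= threshold for u in usage_history)


# Hypothetical two-CPU system: percentages of user, system, and wait I/O time.
samples = [(50, 20, 10), (30, 10, 0)]
print(average_across_cpus(samples))
print(consistently_high([85, 88, 91]))
```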
487. t how and where you would like the report delivered. For more information about using the Reports panel, see Using Reports on page 413. Config: The Config panel enables you to configure the following: • up.time license information and the license key • archive policies • mail servers • Monitoring Periods • remote reporting instances • user authentication. You can also generate problem reports and edit the uptime.conf file from the Config panel. For more information about using the Config panel, see Configuring and Managing up.time on page 527. Syslist: The system list (Syslist) is a popup window that contains the following information: • the display names in up.time and the host names of systems in your environment, arranged in alphabetical order • the name of the group, if any, to which the system belongs. You access the system list by clicking the Syslist icon in the top right corner of the up.time Web interface. A window like the following one appears: [Screenshot: System List window, with Display Name, Host Name, and Entity columns]
488. t metric. The average amount of memory, in KB, that was zeroed out. • Memory Swap Used Avg: Host metric. The average amount of memory, in KB, that was used by the swap file. • Memory Swap Target: Guest metric. The total amount of memory, in KB, that can be swapped. • Disk Total Latency: Host metric. The average time, in milliseconds, taken for disk commands by a guest OS. This is the sum of kernelCommandLatency and physical deviceCommandLatency. • Disk Kernel Latency: Host metric. The average time, in milliseconds, spent in the ESX Server VMkernel per command. • Disk Device Latency: Host metric. The average time, in milliseconds, taken to complete a command from the physical device. • Disk Queue Latency: Host metric. The average time, in milliseconds, spent in the ESX Server VMkernel queue per write. • Disk Commands Aborted: Host metric. The number of disk commands aborted during the defined interval. • Disk Commands Issued: Host metric. The number of disk commands issued during the defined interval. • Disk Bus Resets: Host metric. The number of bus resets during the defined interval. For more information about setting thresholds, see Configuring Warning and Critical Thresholds on page 144. Complete the following settings: • Timing Settings (see Adding Monitor Timing Settings Inform
489. t of time requires to read and write data to and from the volume. Select only one option if you are comparing more than one volume. Click Generate Graph. Novell NRM Graphs: up.time can collect data from systems that are running version 6.5 of the Novell Remote Manager (NRM). up.time retrieves NRM service metrics and then stores this information in the DataStore. Using the data that is collected from NRM, you can generate graphs for the following metrics: • Available Memory: The amount of memory that is not allocated to any service. • DS Thread Usage: The number of server threads that Novell eDirectory uses. The server thread limit ensures that server threads are available for other functions as needed. • Work To Do Response Time: The amount of time that a Work To Do process requires to run, from the time a process is scheduled. • Allocated Server Processes: How the service processes are allocated on the NRM system. • Available Server Processes: The number of available processes on the NRM system. • Abended Thread Count: The number of threads that have abended (ended abnormally) and that are suspended because of abended recovery. • Packet Receive Buffers: The status of Packet Receive Buffers, which transmit and receive packets for the NRM system. • Available ECBs: The status of available Event Control Blocks (ECBs), which are Packet Receive Buffers that have b
490. ta samples between these hours only. For example, if you want the report to cover the hours from 8:00 a.m. to 6:00 p.m., select 8:00 from the Start dropdown list and 18:00 from the End dropdown list. Optionally, enter a value in the Highlight average SMTX over threshold field. If the number of mutex stalls for a system (averaged for all of its CPUs over the defined reporting time period) exceeds the value in this field, the number will be highlighted in the report. For example, if you enter 75 and a server returns 93, that value is highlighted. If you want to generate reports for groups of systems, select the groups from the List of Groups area. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. If you are generating reports for specific Applications in your environment, select them from the List of Entities. Only Solaris systems with two or more CPUs are shown in the List of Entities. Select a report generation option. See Report Generation Options on page 402 for details. 9. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more i
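The highlighting rule above (average mutex stalls across a system's CPUs, compared against the value entered in the field) can be sketched as follows. The function names and sample data are hypothetical; the only behavior taken from the text is that a value is highlighted when it exceeds the threshold, as in the 75 versus 93 example.

```python
def average_smtx(per_cpu_smtx):
    """Average the mutex stall counts over all of a system's CPUs."""
    return sum(per_cpu_smtx) / len(per_cpu_smtx)


def highlighted_systems(smtx_by_system, threshold):
    """Return the systems whose average SMTX exceeds (is strictly above) the threshold."""
    return [
        name
        for name, per_cpu in smtx_by_system.items()
        if average_smtx(per_cpu) > threshold
    ]


# Hypothetical per-CPU mutex stall averages for three systems.
report_data = {
    "server-a": [90, 96],   # average 93: highlighted at threshold 75
    "server-b": [70, 80],   # average 75: not highlighted, it does not exceed 75
    "server-c": [10, 20],
}
print(highlighted_systems(report_data, 75))
```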
491. tabase. If the connection attempt fails, the database returns a SQL exception error. SQL Server can use one of the following authentication modes: • Windows Authentication Mode: Enables users to connect to a SQL Server instance through a Windows user account. • Mixed Mode: Enables users to connect to a SQL Server instance using either Windows authentication or SQL Server authentication. Instance: The SQL Server instance name. This is usually the default instance. You can install multiple instances of SQL Server on one computer. An instance can be: • The default instance: This instance is identified by the network name of the computer on which it is running. Applications using client software from earlier versions of SQL Server can connect to a default instance. SQL Server version 6.5 or 7.0 servers can operate as default instances. A computer can have only one version functioning as the default instance at a time. • A named instance of SQL Server: This instance is identified by the network name of the computer plus an instance name, in the format <computername>\<instancename>. Most applications must use SQL Server client components to connect to a named instance. However, you can use the SQL Server version 7.0 Client Network Utility to configure a server alias name that the version 7.0 client c
492. tances that you want to monitor. A new window containing information about the system appears. 2. Click the Info tab, and then click VMware Instances. A list of VMware instances appears in the subpanel, as illustrated below. [Screenshot: VMware Instances subpanel, listing each instance's VMware display name, IP address, and guest OS] 3. Click the Add to up.time button. The Add System window appears. Note: The Add to up.time button is not visible if a VMware instance is not on. 4. If necessary, you can change any of the following options: • Display name in u
493. tem and Service Information 3. Click the Rescan Configuration button to refresh the configuration information for an agent or a Net-SNMP host. You would do this, for example, if a disk was added to the system. A progress window appears. When the message Configuration Rescanning Completed appears, click Close Window. Information about the configuration changes, if any, appears in the Configuration Changes section of the subpanel. If the system that you selected in step 1 is a node, then only the following information appears: the display name and host name of the node, its parent group, and whether or not the node is monitored. • CPU Information: Lists the speed, in MHz, of all of the CPUs on the system. • Network: Lists the network interfaces on the system, as well as the IP addresses of those interfaces. • Disks / File System: Lists the disks that are on Solaris and Linux systems, and the names of the file systems that up.time is monitoring. • Poll Agent: Displays the output from an up.time agent that you suspect may have a problem. You can forward the output to uptime software Client Care when you encounter problems with up.time. • Services: Lists the services assigned to the system, as well as the interval (in minutes) at which the services are checked. • User Groups: Lists the user groups that are associated with the system. Viewing Servic
494. tening on a port, leave the remaining TCP service monitor settings blank. • String to Send: The string that contains the command to which the service or application can respond. • Use SSL: Select this option if your connection uses SSL (Secure Sockets Layer) for security. • String to Receive: The string that is returned by the specified port and host. The string is the response to the command that was specified in the String to Send field. • Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. 4. Complete the following settings: • Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information) • Alert Settings (see Monitor Alert Settings on page 148 for more information) • Monitoring Period settings (see Monitor Timing Settings on page 146 for more information) • Alert Profile settings (see Alert Profiles on page 381 for more information) • Action Profile settings (see Action Profiles on page 389 for more information). 5. Click Finish.
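The TCP settings above (String to Send, String to Receive, and Warning and Critical Response Time thresholds) map naturally onto a small socket check. The sketch below is not how up.time implements its TCP monitor; it is a plain-Python illustration with hypothetical function names, thresholds assumed to be in seconds, and no SSL handling.

```python
import socket
import time


def tcp_check(host, port, send=None, expect=None, timeout=5.0):
    """Connect to host:port, optionally send a string, and look for an expected reply.

    Returns (matched, elapsed_seconds). With no `expect`, a successful
    connection alone counts as a match.
    """
    start = time.monotonic()
    matched = True
    with socket.create_connection((host, port), timeout=timeout) as sock:
        if send is not None:
            sock.sendall(send.encode("utf-8"))
        if expect is not None:
            reply = sock.recv(4096).decode("utf-8", errors="replace")
            matched = expect in reply
    return matched, time.monotonic() - start


def classify_response_time(elapsed, warning, critical):
    """Map a response time onto OK / WARN / CRIT using the two thresholds."""
    if elapsed >= critical:
        return "CRIT"
    if elapsed >= warning:
        return "WARN"
    return "OK"
```

Leaving `send` and `expect` unset corresponds to the case in the text where the remaining settings are left blank and only the port is checked.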
495. ter predetermined amounts of time, and can run on any system. In addition to your up.time license, the License Info subpanel displays the number of individual licenses that are currently being used in your environment. This number is broken down by systems, nodes, and, if applicable, VMware ESX processors. To install or update a license, do the following: 1. In the Tree panel, click License Info. If you currently have an up.time license, it is displayed in the License Information subpanel. 2. Paste the new or updated license into the License Key text box. 3. Click Update. APPENDIX A: Reference. This appendix contains the following sections: • Frequency Definitions • Time Period Definitions. Frequency Definitions: To define synchronization frequencies in up.time, you input a string that represents the amount of time between actions. These units of time can be days, hours, minutes, seconds, or a combination. Frequency definitions are used when configuring user detail synchronization, when configuring up.time to use an Active Directory or LDAP listing for user authentication and management. See Changing How Users Are Authenticated on page 349 for more information. All time uni
496. tform that provides firms with an environment for developing and deploying Web services and E-Commerce applications. Since WebSphere is large and complex, it can be difficult to pinpoint the source of a problem, especially when that problem is intermittent. The up.time WebSphere monitor collects data that you can use to generate a report, which will give you a historical view of problems that occur on a WebSphere server. See WebSphere Report on page 463 for more information. The WebSphere monitor enables you to collect data so that you can: • determine whether or not the server can cope with its load • determine the cause of problems with the server • collect and retain data for later graphing and reporting. The following table lists the counters the WebSphere monitor collects from a WebSphere Application Server. Variable: Connection pools. Counters: • PoolSize: The size of the connection pool to the data source. • FreePoolSize: The number of free connections in the pool. • PercentUsed: The percentage of the connection pool that is currently in use. • WaitTime: The average time, in milliseconds, that a connection is used. The average time is the difference between the time at which the connection is allocated and the time at which it is returned. • CreateCount: The total number of connections
497. th option collects information from the Understanding Retained Data: up.time enables you to save some or all of the metrics that its monitors collect to the DataStore. You can use the retained data to generate a Service Metrics report (see Service Monitor Metrics Report on page 425) or a Service Metrics graph (see Viewing System and Service Information on page 50). The data that you can retain varies from monitor to monitor. For example, with the Windows Service Check monitor you can save the Service Status and Response Time metrics. With the Exchange monitor, you can save all Web Mail and SMTP metrics. You can save data to the DataStore by clicking the Save for Graphing checkbox on a monitor template, as shown below. [Screenshot: Exchange Settings section of a monitor template, with Warning and Critical thresholds and Save for Graphing checkboxes] CHAPTER 3: Installing up.time. This chapter explains how to install up.time in the following sections: • Installation Plan
498. that is available for swapping. If the amount of swap space drops to zero, then the system cannot create new processes or store information in the /tmp file system. Linux swaps data to a dedicated swap partition. Generating a Memory Usage Graph: To generate a memory usage graph, do the following: 1. In the Global Scan or My Infrastructure panel, click the name of the system whose information you want to graph. 2. In the Tree panel, click the Graphing tab. 3. Click one of the following options: • Used • Cache Hit Rate • Paging Statistics • Free Swap. 4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22. 5. Click Generate Graph. Graphing Processes: up.time uses the following graphs to chart the activity of processes on a system: • Number of Processes • Process Running / Blocked / Waiting • Process Creation Rate. These graphs use the same input criteria, but they return different data. For information on how to generate these graphs, see Generating a Process Graph on page 502. up.time also has other process graphs, which collect more detailed information. For information on the other process graphs, see: • Displaying Detailed Process Information on page 524 • Workload Graphs on page
499. that were created. • CloseCount: The total number of connections that were closed. • WaitingThreadCount: The number of threads that are currently waiting for a connection. • UseTime: The average time, in milliseconds, that a connection is used. The average use time is the difference between the time at which the connection is allocated and the time at which it is returned. Variable: Per EJB. Counters: • CreateCount: The number of times that the Enterprise JavaBeans that are running on the server were created. • RemoveCount: The number of times that the EJBs were removed. • PassivateCount: The number of times that EJBs were removed from the cache. Note that passivation preserves the state of the EJBs on the disk. • MethodCallCount: The total number of method calls that were made to the EJBs. • MethodResponseTime: The average response time, in milliseconds, on the bean methods. Variables: Java Virtual Machine; Other. Counters: • cpuUsage: The percent of CPU resources that were used since the last query. • HeapSize: The total amount of memory that is available for the JVM. • UsedMemory: The amount of memory that is being used by the JVM. • ActiveCount: The number of global transactions which are concurrently active. • CommittedCount: The total number of global transactions that have been committed. • RolledBackCount: The total number of global transactions that have be
500. the Global Scan or My Infrastructure panel, click the name of the system whose information you want to graph. 2. In the Tree panel, click the Graphing tab. 3. Click Top 10 Disks. 4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22. 5. Select one of the following options: • Percent Busy: The percentage of the disk capacity that is being used. Note: For NFS systems, 100% busy does not indicate that the server itself is saturated, but that the client always has outstanding requests to that server. • Average Queue: The average number of processes that are waiting to access the disk. The length of the queue is affected by the amount of time that each transaction requires to perform a disk operation. For both sequential and random disk transactions, a complete transaction must occur before the next transaction can begin. Longer disk operations per transaction increase the average length of the queue. • Read/Writes: The number of read/write requests per second from or to a disk. • Throughput (blks/s): The amount of traffic, in 512-byte blocks, that is flowing to and from a disk. • Average Wait Time: The average time, in milliseconds, that a transaction is waiting in a queue. The wait time is directly proportional to the length of the queue. • Average Serve Time: The average
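Because the Throughput option above reports traffic in 512-byte blocks per second, converting a reading to megabytes per second is simple arithmetic. The sketch below assumes 1 MB = 1,048,576 bytes (the guide does not say which definition of megabyte it uses), and the function name is hypothetical.

```python
BLOCK_SIZE_BYTES = 512            # block size stated for the Throughput (blks/s) option
BYTES_PER_MB = 1024 * 1024        # assumption: binary megabytes


def blocks_per_sec_to_mb_per_sec(blocks_per_sec):
    """Convert a throughput reading in 512-byte blocks/s to MB/s."""
    return blocks_per_sec * BLOCK_SIZE_BYTES / BYTES_PER_MB


# 2048 blocks/s * 512 bytes = 1,048,576 bytes/s, i.e. 1 MB/s.
print(blocks_per_sec_to_mb_per_sec(2048))
```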
501. the Group report options by system checkbox. Selecting this option combines the metrics for each system for which you are generating the report. 5. To generate reports for systems in specific groups, select the groups from the List of Groups area. 6. To generate reports for one or more views, select the views from the List of Views area. See Working with Views on page 108 for more information about views. 7. If you are generating reports for specific systems, select the systems from the List of Systems. 8. Select a report generation option. See Report Generation Options on page 402 for details. 9. If you want to save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information. Using the LPAR Workload Report: The LPAR Workload report takes the guesswork out of determining CPU entitlements for the LPARs on a pSeries server. The entitlements indicate the amount of CPU power that is assigned to an LPAR. For example, you have an LPAR with hard entitlement (one that cannot use spare processing power from another CPU on the server) and its CPU usage is constantly at or near the maximum. In this case, you can either increase the CPU entitlement of the LPAR or change it to a soft entitlem
502. the Scheduled Reports checkbox, and then select the time at which to run the report from the dropdown lists. For example, to run the report at 3:30 p.m., select 15 from the first dropdown list and 30 from the second dropdown list, as shown below. [Screenshot: Scheduled Report checkbox with Run at dropdown lists] 5. Select one of the following options: • Daily: Do one of the following: click the Every option and select the number of days from the dropdown list, or click the Every Weekday option. • Weekly: Do the following: select a number of weeks from the Every week(s) on dropdown list (if, for example, you select 2 from the list, the report will be run every two weeks), then select one or more days of the week on which the report will be run. • Monthly: Do one of the following: Select the Day option. From the first dropdown list, select the day (from 1 to 31) on which to run the report. Then select the month (from 1 to 12) during which to run the report. For example, if you select 3 and 7 from the dropdown lists, the report will be run on the th
503. the operating system and type of hardware on which the host is running • any information in the four custom fields in the system profile (e.g., the job being done by the system and its physical location). For more information, see Editing a System Profile on page 99. Filtering Service Instances: If you have a large number of hosts and want to view information about a particular service instance associated with those hosts, you can filter out the services that you do not want to see in the Service Instances subpanel. To filter service instances, do the following: 1. On the up.time tool bar, click Services. 2. In the Tree panel, click View Service Instances. 3. Enter text in one of the following fields in the subpanel: • Name: The name of a particular service instance, for example PING Server1. You can enter partial names of service instances in this field. For example, if you want to filter on instances that contain the text Mailbox, enter Mailbox in the field. • Host: The name of a host with which the service is associated. This can be the actual name of the host or the display name in the up.time Web interface. • Monitor: The name of a particular monitor on which you want to filter, for example Ping or LDAP. You can enter partial names of monitors in this field. For example, if you want to filter on File System Capacity, enter Capacity in the field. 4. Click Filte
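The partial-name filtering described above (entering Mailbox or Capacity to match longer names) behaves like a substring match on each field. The following sketch illustrates that idea only; the data layout and function name are hypothetical, and the case-insensitive comparison is an assumption, since the guide does not say whether the filter is case sensitive.

```python
def filter_service_instances(instances, name="", host="", monitor=""):
    """Keep instances whose fields contain the given fragments (empty fragment matches all)."""
    def contains(value, fragment):
        # Assumption: matching ignores case.
        return fragment.lower() in value.lower()

    return [
        inst
        for inst in instances
        if contains(inst["name"], name)
        and contains(inst["host"], host)
        and contains(inst["monitor"], monitor)
    ]


# Hypothetical service instances.
instances = [
    {"name": "PING Server1", "host": "server1", "monitor": "Ping"},
    {"name": "Mailbox Store Check", "host": "exchange", "monitor": "Exchange"},
    {"name": "Root FS", "host": "server1", "monitor": "File System Capacity"},
]
print(filter_service_instances(instances, monitor="Capacity"))
```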
504. time in milliseconds required to perform a task
6. Click Generate Graph.
File System Capacity Graph
A File System Capacity graph charts the amount of total and used space, in kilobytes, on a server's disk. On Windows servers, up.time looks at the capacity of the main partition (usually the C: drive). On UNIX and Linux servers, up.time looks at the individual file systems (for example, /var, /export, /usr) on all the disks on the server. If a single-disk system has no partitions, then the file system capacity is the same as the disk capacity. The File System Capacity graph visualizes the following statistics:
• Total Size: The total amount of space available on the system.
• Space Used: The amount of space on the file system that has been used.
Generating a File System Capacity Graph
To generate a File System Capacity graph, do the following:
1. In the Global Scan or My Infrastructure panel, click the name of the system whose information you want to graph.
2. In the Tree panel, click the Graphing tab.
3. Click File System Capacity.
4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22.
5. Select one or more file systems from the list. If you are generating a graph for a Windows system, you will only be able to generate a graph for the C
505. time is integrated with Scrutinizer. For information on how to generate these graphs, see Generating a Network Graph on page 512.
The I/O graph charts the average amount of data that is moving in and out of a network interface over a specified time period. up.time also identifies bursts of network traffic. The I/O graph captures the following statistics:
• In bytes: The number of bytes received over the network interface each second.
• Out bytes: The number of bytes sent by the network interface each second.
The Errors graph charts the number of network interface errors that occur each second. The most common types of errors include collisions in a hubbed environment, or the presence of full-duplex handshake errors between a system and a switch. As well, the following communication line problems can cause network errors:
• Excessive noise
• Cabling problems
• Problems with backbone connections
The Errors graph captures the following statistics:
• In Errors: A data packet was received but could not be decoded because either the header or trailer of the packet was not available.
• Out Errors: A data packet could not be sent due to problems transmitting the packet or formatting the packet for transmission.
• Collisions: The simultaneous presence of signals from two nodes on the network. A collision can occur when two nodes start transmitting over a network at the
506. time total
• Downtime Top 20: The 20 systems with the highest downtime totals for the given time period.
• Incident Priority Quadrant: A graph in which all selected Elements are placed on quadrants based on the total downtime and number of incidents caused by their associated service monitors.
Note that, to provide clear results in the report, only service monitors that were manually assigned to, and are directly associated with, an Element are taken into account when downtime and incident counts are tallied. This means service monitors that may be automatically installed (such as the Platform Performance Gatherer) are not included. Additionally, only an Application's status as a whole affects downtime and incident counts; its component service monitors (both master and regular service monitors) do not.
Using downtime and efficiency counts, the Incident Priority report includes the following key elements:
• Mean Time Between Failure: The average amount of time that an Element's associated service monitors were all running (i.e., in non-critical states) over a given time period. Elements whose associated service monitors experience no downtime are still included in the report, but will not include an MTBF count since they did not experience an incident during the time period.
• Mean Time to Repair: The average number of minutes any of an Element's associated service monitors were in a critical state over a given time period.
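The two averages above can be illustrated with a short calculation. This is a minimal sketch that assumes the conventional definitions (uptime divided by incident count for MTBF, downtime divided by incident count for MTTR); the manual does not state the exact formula the report uses, and the `Incident` type and `mtbf_mttr` function are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """One critical-state interval, in minutes from the start of the period (hypothetical type)."""
    start_min: float
    end_min: float


def mtbf_mttr(period_min, incidents):
    """Return (MTBF, MTTR) in minutes for a reporting period.

    Assumes the conventional formulas, not necessarily the report's own:
    MTBF = total uptime / number of incidents
    MTTR = total downtime / number of incidents
    MTBF is None when there were no incidents, mirroring the note above
    that such Elements get no MTBF count.
    """
    if not incidents:
        return None, 0.0
    downtime = sum(i.end_min - i.start_min for i in incidents)
    uptime = period_min - downtime
    count = len(incidents)
    return uptime / count, downtime / count


# One day (1440 minutes) with a 30-minute and a 90-minute outage.
day = [Incident(100, 130), Incident(500, 590)]
print(mtbf_mttr(1440, day))
```

With two outages totalling 120 minutes in a 1440-minute day, the sketch reports the remaining 1320 minutes of uptime and the 120 minutes of downtime, each averaged over the two incidents.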
507. tion Click Next On the Confirm Installation dialog screen click Next Installing Agents on Solaris You install up time agents for Solaris at the command line To install an agent on Solaris do the following 1 2 Log into the system as user root Using telnet or FTP transfer the archive containing the agent to the system on which you want to install the agent You should copy the archive to a temporary directory on the system Extract the archive using the following command tar xvf uptmagnt lt version gt tar Where lt version gt is the version of the agent for example solaris 4 0 Run the following command pkgadd d Follow the prompts from the pkgadd utility to select the agent package and install it up time software PI 5 a Si Ko xe 3 D Installing up time nstalling Agents Installing Agents on UNIX You install up time agents for various UNIX platforms at the command line using a shell script To install an agent on a UNIX system do the following 1 2 Log into the system as user root Using telnet or FTP transfer the archive containing the agent to the system on which you want to install the agent You should copy the archive to a temporary directory on the system Extract the archive Depending on the version of UNIX you will need to extract the archive using either the tar command or a combination of the gzip and tar commands For example to extract the ag
508. tion Exchange 194 IIS 200 Live Splunk Listener 238 Splunk query 236 up time Agent 192 WebLogic 203 WebSphere 211 cloning 151 comparisons 143 database MySQL Advanced Metrics 244 MySQL Basic Checks 251 Oracle Advanced 253 Oracle Basic Checks 256 Oracle Tablespace 259 SQL Server Advanced Metrics 266 SQL server Basic Checks 262 SQL Server Tablespace 270 Sybase 275 editing performance 157 getting help 150 identification 141 monitor settings 142 Monitoring Period 150 Monitoring Periods 397 network DNS 280 FTP 283 HTTP 285 IMAP 289 LDAP 291 NFS 295 NIS YP 297 NNTP 299 Ping 303 up time 5 User Guide l up time POP 305 SMTP 309 SNMP 311 SSH 307 TCP 318 overview 17 136 template 141 testing 152 timing setting options 147 timing settings 146 types 137 Windows Active Directory 187 Event Log Scanner 178 Service Check 182 SMB 185 Multi System CPU report 418 My Infrastructure 65 acknowledge alerts 112 adding Applications 101 adding groups 105 adding nested groups 106 adding nested views 109 adding systems 67 Application details 103 editing Applications 103 editing system profile 99 overview 66 views 108 My Infrastructure panel 7 My Portal panel 7 61 62 MySQL Advanced Metrics monitor 244 MySQL Basic Checks monitor 251 N Net SNMP 311 Network Bandwidth Report 438 Network graphs 511 network monitors DNS 280 FTP 283 HTTP 285 IMAP 289 LDAP 291 NFS 295 NIS YP 297 NNTP 299 up time s
509. to WMI, do the following:
1. In the Global Scan or My Infrastructure panels, click the name of the Windows server.
2. Click the Info tab, then click Info & Rescan.
6. Click the Edit Collection Method link found beside the Collection Method setting.
[Image: Collection Method setting with the Edit Collection Method link]
The Edit Data Collection Method window appears.
Select the WMI Agentless data collection option.
Select the Use WMI Global Credentials check box if they have been configured and you would like to use them (see Configuring Global WMI Credentials on page 536 for more information); otherwise, complete the following fields:
• Windows Domain: The Windows domain in which WMI has been implemented.
• Username: The name of the account with access to WMI on the Windows domain.
• Password: The password for the account with access to WMI on the Windows domain.
Click Save to retain your changes and close the pop-up window.
Switching an Element to Agent-Based Data Collection
To change the data collection source for an individual Windows Element from WMI to the up.time Agent, do the following:
1. In the Global Scan or My Infrastructure panels, click the name of the Windows server.
Click the
510. to generate a report or graph.
4. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
• Alert Profile settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
5. Click Finish.
Sybase
The Sybase monitor does the following:
• determines if the database is responding on the standard port
• sends Sybase Transact-SQL scripts to the database for processing
The Transact-SQL scripts can be very basic SQL statements, such as sphelp db sampledbl exit select 1. The scripts can also be more complex statements that involve functions and other data processing.
Configuring Sybase Monitors
To configure Sybase monitors, do the following:
1. In the Sybase monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141.
2. Complete the following fields:
• Port: The number of the port on which the database is listening. The default is 5000.
• Port C
511. to re-send the notification in the End alerting on notification number field. Optionally, click the Never Stop Notifying option to have up.time continually send notifications.
Select one of the following notification options:
• Email Alert: Sends the alert to the email addresses of the members of a Notification Group.
• Pager Alert: Sends the alert to the pagers of the members of a Notification Group.
• Script Alert: Uses a script to send the alert via SMS to the mobile phones of the members of a Notification Group. Since this alert option relies on a script or batch file, you must enter its name and path in the Script Path field, for example /usr/local/uptime/scripts/scriptAlert.sh. When the alert is triggered, up.time runs the script and passes the script or batch file a set of parameters. The script is run for each up.time user who will receive the SMS message. For details on how to create the script, see the Client Care Web site Knowledge Base article "Creating Custom Alert Scripts in up.time Alert Profiles".
• Windows Popup Alert: Sends the alert via the Windows messaging service to the desktops of the members of a Notification Group.
Select one or more groups that will receive the notifications from the Available Notification Groups list, and then click Add.
Click Save.
Viewing Alert Profiles
To view Alert Profiles, do the following:
1 2 On the
512. to the Base location of your Active Directory structure. The format of the Bind string must match the Base location of your Active Directory structure. Depending on your network security model, you will need domain controller administration privileges to bind to the locations on which you want to match information.
• Attribute: The attribute or information for which you want to search in your Active Directory. An Active Directory entry consists of a set of attributes. Each attribute has a type, which describes the kind of information contained in the attribute, and one or more values, which contain the actual data. For example, the entry jsmith@inter.net has the Attribute value jsmith@inter.net. The Attribute type is e-mail.
• Response Time: Enter the Warning and Critical Response Time thresholds. For more information, see Configuring Warning and Critical Thresholds on page 144.
3. Optionally, click the Save for Graphing checkbox beside the Response Time option to save the data for a metric to the DataStore, which can be used to generate a report or graph.
4. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for m
513. tors DNS
• IP Address: The IP address for which you want to check. If this address is not returned, the status of the service monitor becomes Critical.
• Response Time: Enter the Warning and Critical Response Time thresholds for the amount of time required to complete a service check. For more information, see Configuring Warning and Critical Thresholds on page 144.
3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph.
4. Complete the following settings:
• Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information)
• Alert Settings (see Monitor Alert Settings on page 148 for more information)
• Monitoring Period settings (see Monitor Timing Settings on page 146 for more information)
• Alert Profile settings (see Alert Profiles on page 381 for more information)
• Action Profile settings (see Action Profiles on page 389 for more information)
5. Click Finish.
FTP
The FTP monitor can determine:
• whether or not an FTP server is listening or is available on a specified port
• the response time of an FTP server
The FTP monitor tries to open an FTP connection to the server. If the response takes longer than the defined thresholds, up.time generates an alert.
Configuring FTP Monitors
To configure FTP monitors, do the following:
1
514. ts are represented by a one-letter abbreviation:
• days: d
• hours: h
• minutes: m
• seconds: s
Frequency definitions can be a combination of any of these time units and their values, in descending order, without spaces:
• 1d
• 1d12h
• 1h30m
• 30s
Time Period Definitions
When defining new or editing existing Maintenance Profiles and Monitoring Periods, you need to use precise definitions that up.time can correctly interpret. Time period definitions use a controlled vocabulary that allows you to precisely define, combine, and exclude time periods. Although all examples listed in the following sections are written in mixed case (e.g., Every Oct 28), none of the terms used in time period definitions is case sensitive.
Building Blocks
The following tables outline the basic components of all time period definitions.
Time Units
• Units of time that act as building blocks in definitions include times of day, days of the week, months, years, and exact dates.
Times
  Required:      hour of day; 12-hour clock suffix (inputted as AM or PM)
  Optional:      minutes of the hour; spaces
  Correct:       8 PM, 8:00PM, 8PM
  Not accepted:  missing 12-hour clock suffix (8:00); 24-hour clock convention (20:00, 20:00 PM)
Days
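The frequency notation described above (one-letter units, descending order, no spaces) can be checked and converted mechanically. The following is a minimal sketch of such a parser; `parse_frequency` is a hypothetical helper written for illustration and is not part of up.time. It assumes lowercase unit letters and that each unit appears at most once.

```python
import re

# Seconds per unit, in the descending order the notation requires.
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}

# Each unit is optional, but units must appear in d, h, m, s order with no spaces.
_PATTERN = re.compile(r"^(?:(\d+)d)?(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?$")


def parse_frequency(text):
    """Convert a definition such as '1d12h' to a number of seconds.

    Raises ValueError for empty input, spaces, unknown units,
    or units that are out of descending order.
    """
    match = _PATTERN.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"not a valid frequency definition: {text!r}")
    total = 0
    for value, unit in zip(match.groups(), "dhms"):
        if value is not None:
            total += int(value) * _UNIT_SECONDS[unit]
    return total


for example in ("1d", "1d12h", "1h30m", "30s"):
    print(example, parse_frequency(example))
```

A definition such as 30m1h is rejected by this sketch because the units are not in descending order, and 1d 12h is rejected because it contains a space.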
515. u to pinpoint which SLO is causing SLA non-compliance and, in turn, which monitors are causing the SLO to experience downtime. For more information about viewing SLA details and defining SLOs that help you accurately gauge the performance of your IT infrastructure, see Working with Service Level Agreements on page 357.
Viewing All Applications
Applications provide the overall status for one or more services that up.time monitors. Applications group services such as ping checks and checks for the status of the up.time agents that are installed on a system. An Application can contain many services, and enables you to better analyze component outages versus true Application outages. An Application consists of:
• master service monitors: One or more monitors can be used to determine the status of the Application as a whole.
• regular service monitors: Other service monitors that are associated with a master service monitor, but are not used to determine the status of the Application as a whole.
The status of each Application is color coded:
• Applications highlighted in green are functioning normally.
• Applications highlighted in yellow are in a warning state.
• Applications that are in a critical state when one or more master service monitors reac
516. uates this reply, and up.time captures the report that the program displays.
Configuring Ping Monitors
To configure Ping monitors, do the following:
1. In the Ping monitor template, complete the monitor information fields. To learn about monitor information fields, see Monitor Identification on page 141.
2. Complete the following fields:
• Number to send: The number of packets to send to an IP address or domain name. This value determines the number of times the ping command attempts to contact a server.
• Average Round Trip Time: Enter the Warning and Critical thresholds for the average round trip time for the number of packets sent by the ping command. The round trip time is in milliseconds. This value is a good indicator of ping performance because a variety of factors, including different packet paths to and from the server, can affect the round trip time of a packet.
• Percent Loss: Enter the Warning and Critical thresholds for the percentage of packets that did not return a reply. For example, if four packets were sent and only two are returned, the percent loss is 50%.
• Response Time: Enter the Warning and Critical Response thresholds for the length of time the service check takes to complete. For more information, see Configuring Warning and Critical Thresholds on page
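The Percent Loss arithmetic in the example above (four packets sent, two returned) can be written out as a small calculation. This is only an illustrative sketch; `percent_loss` is a hypothetical helper name, not an up.time function.

```python
def percent_loss(sent, received):
    """Return the percentage of sent packets that got no reply."""
    if sent <= 0:
        raise ValueError("at least one packet must be sent")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return 100.0 * (sent - received) / sent


# The example from the text: four packets sent, two returned.
print(percent_loss(4, 2))
```

The resulting value is what would be compared against the Warning and Critical thresholds entered in the Percent Loss field.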
517. ubpanel Problem report created pr20061017 094927 zip
Click the name of the problem report to download it to your local file system, then send the archive to uptime software Client Care.
up.time Measurement Tuning
In some cases, you can make measurement adjustments to up.time's default values. Changes can be made to the following:
• the number of threads allocated to service monitors
• status thresholds in the Resource Scan and Global Scan panels
• how often performance and status are checked for monitored hosts
Service Monitor Thread Counts
By default, the number of Java threads allocated to service and performance monitors is 100. This can be modified with the following uptime.conf parameter:
serviceThreads=100
Status Thresholds
The Global Scan threshold settings determine when a cell in the Global Scan panel changes state to reflect a host's status change: green represents normal status, yellow represents Warning status, and red represents Critical status. The Resource Scan threshold settings determine the size of the gauge ranges on the Resource Scan view: green represents normal status, yellow represents Warning status, and red represents Critical status. You can change the thresholds used to determine status by manually inputting settings in the up.time Configuration panel, as outlined in Modifying up.time Config Panel Setting
518. ucture that is used by up.time by running the appropriate command:
• Linux: /usr/local/uptime/resetdb really
• Solaris: /opt/uptime/resetdb really
• Windows: C:\Program Files\uptime software\uptime\resetdb really
Upgrading to up.time 5
If you are using a previous version of up.time and intend to upgrade to version 5, you can find detailed information about the upgrade process at the Client Care Web site: http://support.uptimesoftware.com
Installing Agents
up.time agents are used to retrieve detailed performance statistics (such as CPU, memory, process, disk, and network usage) from the hosts that you are monitoring. The agents can also securely and remotely execute programs. The Windows agent can start and stop services, and reboot the machine. The installation process for agents varies by operating system. On UNIX, Linux, and IBM pSeries systems, installation is done at the command line using a script. On Windows, installation is done using a graphical utility.
All client systems must be accessible via a name. This name should exist in either the /etc/hosts table on the Monitoring Station, or be accessible via a nameserver (for example, files, NIS, or DNS). If the host IP is changed, then the Monitoring Station may send requests to the incorrect m
519. ugh which communication to the domain controller occurs.
If communication to the domain controller is secure, select the SSL check box.
In the Domain Name field, enter the domain that contains the domain controller.
Continue to the next section to enable and configure synchronization from the Active Directory listing to up.time user profiles. If you do not wish to synchronize users, click Save. Clicking Save switches the authentication source to Active Directory. Administrators still need to create profiles for all up.time users, but will not need to set a password for each one. See Adding Users on page 337 for more information.
Defining Active Directory Synchronization Mapping
Before synchronizing user details, a populated uptime group must already exist in the Active Directory listing; you will also need to know its distinguished group name, as it will be required during configuration.
All DataStore-based user profiles will be deleted when you switch to Active Directory for synchronization; a list of affected users will be displayed during configuration. Before continuing, you should ensure your up.time users are also in the AD listing.
To configure user detail synchronization from the Active Directory list, do the following:
1. Click Edit Configuration to open the User Authentication Configuration pop-up window.
2. Select the Synchroniz
520. ul completion of an audited security event.
• Failure Audit: Found in the Security log, this describes the failure of an audited security event.
• Number of Lines: The number of lines in the log file that up.time will scan using the criteria specified in the monitor template. The default is 1000, and the maximum is 10000.
• Match source with: The application, system component, or application module that triggered the event.
• Match category with: The way in which the application, system component, or application module that triggered the event classifies the event. For example, System Event in the Security Log, or Installation CI Service or wrapper in the Application and System logs.
• Match event ID with: A number that identifies the type of event.
• Match user name with: The name of the user associated with a logged event.
• Match computer name with: The name of the computer on which the event occurred.
• Search description for: Enter the string for which you want to search in the event log, for example: The WMI Performance Adapter service entered the running state. The string is evaluated as a regular expression.
• Response Time: Enter the Warning and Critical Response Time thresholds for the length of time a service check takes to complete. For more informati
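Because the Search description for string is evaluated as a regular expression, it can match more than one literal message. The sketch below illustrates that idea with Python's `re` module; the regular expression dialect used by up.time itself may differ, and `matching_events` is a hypothetical helper. The choice to scan the most recent `max_lines` entries is also an assumption, since the text above only says how many lines are scanned.

```python
import re


def matching_events(descriptions, pattern, max_lines=1000):
    """Return the descriptions that contain a match for the pattern.

    Only the last max_lines entries are scanned (an assumption about
    which lines the Number of Lines setting refers to).
    """
    regex = re.compile(pattern)
    recent = descriptions[-max_lines:]
    return [text for text in recent if regex.search(text)]


events = [
    "The WMI Performance Adapter service entered the running state.",
    "The Print Spooler service entered the stopped state.",
]

# A literal-looking string matches only the first event.
print(matching_events(events, r"WMI Performance Adapter service entered the running state"))

# An alternation matches both events.
print(matching_events(events, r"service entered the (running|stopped) state"))
```

Characters such as `.`, `(` and `|` have special meaning in a regular expression, so a search string containing them may need escaping to be matched literally.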
521. uld occur when a monitor has gone from an OK to a Warning, Critical, or Unknown status. The duration for rechecks should be shorter than the regular check interval. The minimum recheck interval is one minute. Rechecks continue to run as they are needed until the maximum number of rechecks has occurred.
• Max Rechecks: The maximum number of times that up.time rechecks a service. Once the specified number of rechecks is completed, the last state that was checked is reported. If the last status was not OK, up.time generates an alert.
Adding Monitor Timing Settings Information
To add monitor timing settings information, do the following:
1. Select the Monitored check box to activate the service monitor. up.time does not send alerts if the service monitor is not activated.
2. Complete the following settings:
• Timeout: Ensure that the Timeout duration that you define is longer than the defined Response Time.
• Check Interval
• Recheck Interval
• Max Rechecks
Monitor Alert Settings
The monitor alert settings enable you to turn alert notifications on or off based on the status of a service monitor. The following options are available in this area:
• Notification: Determines if notifications, regardless of status or interval, should be issued for this monitor.
• Alert Interval: The frequency, in minutes, at which alerts
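The recheck behaviour described above can be sketched as a small loop. This is an illustration under stated assumptions, not up.time's implementation: `final_status` and the `check` callable are hypothetical, waiting for the Recheck Interval between attempts is omitted, and the sketch assumes rechecks stop early once a check returns OK (the text says only that rechecks run "as they are needed").

```python
def final_status(check, max_rechecks):
    """Run a check, then recheck while the status is not OK.

    check        -- callable returning a status string such as "OK" or "Critical"
    max_rechecks -- upper bound on the number of rechecks

    Returns (last_status, rechecks_used, alert), where alert is True
    when the last status checked was not OK.
    """
    status = check()
    rechecks = 0
    while status != "OK" and rechecks < max_rechecks:
        status = check()  # a real monitor would wait for the recheck interval first
        rechecks += 1
    return status, rechecks, status != "OK"


# Simulated results: the service recovers on the second recheck.
results = iter(["Critical", "Critical", "OK"])
print(final_status(lambda: next(results), max_rechecks=3))
```

In the simulated run, the last state checked is OK, so no alert would be generated even though the first two results were Critical.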
522. umber in the column for that variable to go to its Graphing page, where you will be able to generate a graph.
When you click the file folder icon to the left of a system name, an expanded view of the server information appears.
[Image: expanded view of a system in the Global Scan panel]
up.time displays the following information for the system in the expanded view:
• the first row displays the names of the services, and their corresponding states, associated with the system
• the second row lists the top five CPU-consuming processes for the system
• the third row displays the last five error messages, if any, for the system
Groups and Views in the Global Scan Panel
When you create groups or views (see Working with Groups on page 105 and Working with Views on page 108), they appear in their own sections in the Global Scan panel. The following information is displayed:
• the names and descriptions of the groups
• the number of Elements in each group
• the status of the hosts that make up the group
• the number of alerts per group
When you click a group or view in the Global Scan panel, the systems that make up the group or view, and details about their status, are displayed.
Viewing All SLAs
Service level agreements in the Global Scan panel indicate whether performance targets are
523. ums. There is also a search engine with which you can find information in the Client Care Web site Knowledge Base and support forums.
[Image: the top portion of the My Portal panel]
My Preferences
The My Preferences section enables you to:
• View your user account settings. Click the View icon or your user name to open your account settings in the subpanel. You can also edit your user information by clicking Edit User.
• Change your user account settings. Click the Edit icon. The Edit User window appears. See Editing User Information on page 340 for details on editing your user account settings.
Latest up.time Articles
The Latest up.time Articles section contains a list of recent Knowledge Base articles. This list is fed to the My Portal panel via RSS (Really Simple Syndication), a method for delivering summaries of, and links to, Web content. You simply click the title of the article to open it in your Web browser.
up.time Information
The up.time Information section contains the following information about your Monitoring Station:
• Whether or not updates are available. If an update is available, there will be a link to the uptime software Client Care
524. unt with access to WMI on the Windows domain.
14. If you want to associate this system with a group, select the name of the group from the Group dropdown list. See Overview on page 66 for more information on defining groups.
15. If you want to associate this system with a service group, select the name of the group from the Service Group dropdown list. See Service Groups on page 153 for more information.
16. Click Save. A window listing general information about the system you have added appears.
17. If you want to add another system or network device, click Add Another. Then, repeat steps 2 to 14. Otherwise, click Close.
It can take up to 15 minutes for the Monitoring Station to retrieve enough samples to provide historical graphing data.
18. Click Save.
Auto Discovery
It can be time consuming to add a large number of systems to up.time using the Add System/Network Device window, especially if you do not know the exact names or IP addresses of those systems. With Auto Discovery, up.time can detect the systems on your network that have an IP address within a range that you specify. up.time does the following to search for the systems in your environment:
• Uses the ping utility to determine whether or not systems are available on the network
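To make the idea of "an IP address within a range that you specify" concrete, the sketch below lists every IPv4 address between two endpoints using Python's standard `ipaddress` module. It illustrates only the enumeration step; `addresses_in_range` is a hypothetical helper, the endpoints are made-up examples, and the sketch does not ping anything or reproduce how up.time performs discovery.

```python
import ipaddress


def addresses_in_range(first, last):
    """Return every IPv4 address from first to last, inclusive, as strings."""
    start = ipaddress.IPv4Address(first)
    end = ipaddress.IPv4Address(last)
    if end < start:
        raise ValueError("the last address must not precede the first")
    return [str(ipaddress.IPv4Address(n)) for n in range(int(start), int(end) + 1)]


# A small range that crosses an octet boundary.
for address in addresses_in_range("10.1.1.254", "10.1.2.1"):
    print(address)
```

Each address produced this way is a candidate that a discovery pass would then probe for availability.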
525. up.time can collect workload information from IBM pSeries servers that have logical partitions (LPARs). To have up.time collect this information, you must install the latest AIX or Linux agents on the LPARs whose workloads you want to profile. There are two options for installing agents on IBM pSeries servers with logical partitions (LPARs):
• Installing the agent on a pSeries server with an HMC
• Installing the agent on a pSeries server without an HMC, that uses the Integrated Virtual Manager (IVM)
In both cases, you will need to install the agent on each LPAR; whether you use an HMC determines how the agent is installed on the Virtual I/O (VIO) partition.
Installing the agent on a pSeries server with an HMC
Before you can monitor the logical partitions on an IBM pSeries server, you must install an agent on each LPAR and on the VIO. Use the following instructions to install the agent on an IBM pSeries server that is managed by an HMC.
To install an agent on an LPAR that is on an IBM pSeries server with an HMC, do the following:
1. Ensure you are logged in to the HMC as a super administrator-level user. up.time communicates with the HMC to acquire LPAR information.
2. If Linux is running on the LPAR, do the following:
• Log into the LPAR as root.
• Copy the RPM file containing the Linux agent to the LPAR.
• Run the following command: rpm -i <agent name>
526. up.time tool bar, click Services.
In the Tree panel, click View Alert Profiles. The Alert Profiles subpanel appears. The subpanel displays the settings that you configured when you created the profile, as well as a list of the services that are attached to the profile.
To test whether or not the profile will send alerts, click the Test Alert Profile button. A popup window appears, and the alert is sent using the notification method (email, pager, script, or Windows popup) that is specified in the profile. The following is an example of an email alert:
Notification type: Problem 27 4 2006 09 19
Host: Test Host (OK)
Service: Test Monitor
Service State: OK
Output: This is a test notification, please ignore
When the alert is sent, the message Alert Profile Tested appears in the popup window. If an error message appears in the popup window, edit the profile and test it again.
Editing Alert Profiles
To edit Alert Profiles, do the following:
1. On the up.time tool bar, click Services.
2. In the Tree panel, click View Alert Profiles.
3. Click the Edit Alert Profile icon beside the name of the profile that you want to edit. The Edit Alert Profile window appears.
4. Edit the Alert Profile fields as described in the section Creating Alert Profiles on page 382.
Associating Alert Profiles to Elements
You can associate an Alert Profile to any Service Mon
527. ups area.
To generate reports for one or more views, select the groups from the List of Views area. See Working with Views on page 108 for more information about views.
If you are generating reports for specific systems in your environment, select them from the List of Elements.
8. Select a report generation option. See Report Generation Options on page 402 for details.
9. To save the report or schedule it to run at a specific time or interval, complete the settings in the Save Reports section of the subpanel. See Saving Reports on page 404 and Scheduling Reports on page 407 for more information.
Service Monitor Availability Report
The Service Monitor Availability report tracks the status of the services associated with the hosts in your environment. This report lists the percentage of time each service was in the following states over the time period that you specify: OK, Warning, Critical, Maintenance, or Unknown. For more information on each status, see Understanding the Status of Services on page 21.
Creating Service Monitor Availability Reports
To create Service Monitor Availability reports, do the following:
1. In the Reports Tree panel, click Service Monitor Availability.
2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on pag
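The per-state percentages that the Service Monitor Availability report lists can be illustrated with a simple tally. This is a minimal sketch that assumes the input is a list of (state, minutes) pairs covering the reporting period; `availability` is a hypothetical helper, and the way up.time actually samples and stores status is not described here.

```python
STATES = ("OK", "Warning", "Critical", "Maintenance", "Unknown")


def availability(samples):
    """Return the percentage of the period spent in each state.

    samples -- iterable of (state, minutes) pairs; states must be in STATES.
    """
    totals = dict.fromkeys(STATES, 0.0)
    for state, minutes in samples:
        if state not in totals:
            raise ValueError(f"unknown state: {state!r}")
        totals[state] += minutes
    period = sum(totals.values())
    if period == 0:
        raise ValueError("the samples cover no time")
    return {state: 100.0 * total / period for state, total in totals.items()}


# A 100-minute period: mostly OK, with a short outage and a maintenance window.
report = availability([("OK", 60), ("Critical", 5), ("OK", 30), ("Maintenance", 5)])
for state, percent in report.items():
    print(f"{state}: {percent:.1f}%")
```

States that never occur in the period are still listed, with a percentage of zero, so every report row has the same five columns.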
528. ur Infrastructure: Working with Systems. Host Type and Sample Hosts File Entry. Node: Host Name www myDomain ca Display Name Description Type Node Group Web Si Your Domain A Web site tes. Novell NRM: Host Name novellOl Display Name Type Novell SSL true Port 546 dn3 NRM Group Unix Boxes Group Novell System. Net SNMP v2: Host Name ga Display Name Description Type Net SN Read Communit teway mydomain com gateway SNMP snmp v2 P v2 y myCo pub. Net SNMP v3: Host Name SN Display Name Description Type Net SN Read Communit P 1 SNMP 1 et SNMP system P v3 y public Username myU Password myP sernam assword Privacy Password myOtherPassword Group Linux Systems. pSeries LPAR: Host Name 10 Display Name HMC Hostname Type pSeries Managed Serve SNO1B030K Username hsc Password hsc 1 2 42 HMC Managed Server 10 1 1 255 LPAR Server HMC r Server 7610 31C root root. Virtual Node: Host Name: router Toronto; Display Name: Toronto Router; Description: Router for Toronto branch; Type: Virtual Node; Pingable: True; Group: Routers. WMI Agentless: Host Name: Win7 Production; Display Name: Windows 7 Production; Description: Win7 agentless WMI; Type: WMI Agentless; Group: Windows Boxes; WMI Domain
529. ur WebLogic server for monitoring, do the following: 1. Enable IIOP on your WebLogic server. For example, on WebLogic 10, select the Protocols tab when configuring server settings, then select the Enable IIOP checkbox. 2. Enter an IIOP user name. 3. Enter an IIOP user password. 4. If possible, restart the WebLogic server. The user name and password created here are used when configuring a WebLogic 10 monitor in up.time. Configuring WebLogic Monitors: To configure monitors for WebLogic 9 to 11, do the following: 1. In the WebLogic monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: Username: The IIOP user name you created when you first enabled IIOP on the WebLogic server. Password: The IIOP password you created when you first enabled IIOP on the WebLogic server. WebLogic Port: The number of the port on which the WebLogic server is listening. The default is 7001. 3. Limit the returned results of a specific resource type by completing some of the following fields: Number of Results: A limit on the number of matching application resources whose metrics are collected. EJB Name Regex Filter: A regular expression used to limit metrics collection to a
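The combined effect of a name regex filter and a Number of Results limit can be sketched in Python. The function `filter_resources` and the EJB names are hypothetical, and the sketch assumes the pattern may match anywhere in the name; the guide excerpt does not say whether up.time requires a full match.

```python
import re


def filter_resources(names, pattern, max_results):
    """Keep names that match `pattern`, then cap the list at `max_results`.

    Assumption: a name qualifies if the pattern matches anywhere in it
    (re.search). Anchor the pattern with ^ and $ for stricter matching.
    """
    regex = re.compile(pattern)
    matches = [name for name in names if regex.search(name)]
    return matches[:max_results]


# Hypothetical EJB names, used only to exercise the filter.
ejbs = ["OrderBean", "OrderHistoryBean", "CartBean", "InventoryBean"]
selected = filter_resources(ejbs, r"^Order", 1)
print(selected)
```

Applying the regex first and the limit second means the limit counts only matching resources, which is consistent with the description of Number of Results as a limit on matching application resources.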
530. user's computer. See Enabling the Windows Messaging Service on page 381 for information. Enter the workgroup or domain to which the user's computer belongs in the User's Windows Desktop Workgroup field. Select an option from the Time Period for Windows Popups dropdown list. The options are the same as the ones listed in Step 8. 14. If the user will receive alerts, select the Should the user receive alerts option. If you select this option, you must also enter information in the Email Address or Pager Cellphone Address fields. 15. If you selected the Should the user receive alerts option in step 14, select one of the following options: Alert on Critical: The user receives an alert when up.time detects a critical problem with one or more monitored services. Alert on Warning: The user receives an alert when up.time detects a potential problem with one or more monitored services. Alert on Unknown: The user receives an alert when up.time detects an error in the configuration of the monitor, or if up.time cannot execute the service check. Alert on Recovery: The user receives an alert when the service recovers from an error (for example, an application process or service restarts, or a server reboots). 16. Click the Disable ActiveX Graphs option to display graphs using a Java applet instead of in 3D. Do not sele
531. ustrate the number of minutes that the CPU run queue spent over the threshold; optionally, a list of processes that were in the run queue during the time period that you specify. Generating a CPU Run Queue Threshold Report: To generate a CPU Run Queue Threshold report, do the following: 1. In the Reports Tree panel, click CPU Run Queue Threshold. 2. In the Date and Time Range area, select the dates and times on which to report. For more information, see Understanding Dates and Times on page 22. If no data is available for the date range, the report displays a message indicating that there is no data for the time period. To include only data from certain hours during the day, select those hours from the dropdown lists in the Daily Hours section. For example, if you want the report to cover the hours from 8:00 a.m. to 6:00 p.m., select 8:00 from the Start dropdown list and 18:00 from the End dropdown list. In the Max CPU field, specify the threshold for CPU usage. CPU usage is considered critical when the CPU usage exceeds this value and the length of the run queue exceeds the run queue threshold. In the Threshold field, enter the number of queued-up jobs that, when exceeded, is considered excessive. Multiple CPUs are taken into account so that the defined thresho
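The rule described for this report (count the time during which both CPU usage and run queue length exceed their thresholds, restricted to the selected daily hours) can be sketched as follows. The sample format, the function name `minutes_over_threshold`, and the treatment of the End hour as exclusive are assumptions; how up.time adjusts the threshold for multiple CPUs is not modeled here.

```python
def minutes_over_threshold(samples, max_cpu, queue_threshold,
                           start_hour=0, end_hour=24):
    """Count one-minute samples where CPU and run queue both exceed limits.

    `samples` is a hypothetical list of (hour, cpu_percent, run_queue_length)
    tuples, one per minute. Only samples with start_hour <= hour < end_hour
    are considered (assumption: the End hour is exclusive).
    """
    count = 0
    for hour, cpu_percent, queue_length in samples:
        if not (start_hour <= hour < end_hour):
            continue  # outside the Daily Hours window
        if cpu_percent > max_cpu and queue_length > queue_threshold:
            count += 1
    return count


# Made-up samples: (hour of day, CPU %, run queue length).
samples = [
    (7, 95.0, 9),    # over both limits, but before the 8:00 start
    (9, 95.0, 9),    # counted
    (10, 95.0, 2),   # CPU high, queue short: not counted
    (12, 50.0, 9),   # queue long, CPU low: not counted
    (17, 99.0, 12),  # counted
    (18, 99.0, 12),  # at the End hour: excluded
]
over = minutes_over_threshold(samples, max_cpu=90.0, queue_threshold=5,
                              start_hour=8, end_hour=18)
print(over)
```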
532. vanced Monitors 138; Selecting Monitor 140; The Monitor Template 141; Monitor Identification 141; Adding Monitor Identification Information 142; Monitor Settings Configuration 142; Configuring Warning and Critical Thresholds 144; Monitor Timing Settings 146; Monitor Alert Settings 148; Monitoring Period Settings 150; Getting Additional Help 150; Cloning Service Monitors 151; Testing Service Monitors 152; Service Groups 153; Creating Service Groups 153; Editing Service Groups 154; Changing Host Checks 156; Change a Host Check 156; The Platform Performance Gatherer 157; Editing the Pl
533. ve hosts that are not members of a specific group, select My Infrastructure from the dropdown list to view the ungrouped hosts. If you have not created groups, the dropdown list is not available and a list of hosts appears in the list. See Working with Groups on page 105 for more information about grouping hosts. 10. Select one or more hosts from the list to immediately associate with the service group, then click Add. 11. Click Finish. Editing Service Groups: To edit service groups, do the following: 1. On the up.time tool bar, click Services. 2. In the Tree panel, click View Service Groups. 3. Click the Edit icon beside the name of the service group that you want to edit. 4. To change the name and description of the group, do the following: Enter a new name in the Name field. Enter a new description of the service group in the Description field. Click Save. 5. To edit the services in the service group, do the following: Add services by clicking on one or more services in the Available Master Services list and then clicking Add. Remove services by clicking on one or more services in the Selected Master Services list and then clicking Remove. Click Save. 6. To edit the Element Groups assigned to the group, do the following: Add Element Groups by clicking on one or more entries in the Available Element Groups list and then clicking Add. M
534. ves. Windows Vista users can find the DataStore archive in the Virtual Store instead of the default location (i.e., C:\Users\uptime\AppData\Local\VirtualStore\Program Files\<uptime install directory>). Once backed up, archives can be stored offline. If required, they can be temporarily imported into the DataStore. Archive Categories: The following table lists the statistical categories whose archiving can be configured, along with the corresponding DataStore database table. Archive Policy Category, Database Table: Overall CPU/Memory, performance CPU; Multi-CPU, performance aggregate; Detailed Process, performance psinfo; Disk Performance, performance disk; File System Capacity, performance fscap; Network, performance network; User Information, performance who; Volume Manager, performance vxvol; Retained Data, erdc int data, erdc decimal data, erdc string data. Configuring an Archive Policy: To set an archive policy, do the following: 1. On the up.time tool bar, click Config. 2. In the Tree panel, click Archive Policy. 3. For the following categories, specify the number of months' worth of data that will be retained in the DataStore before being removed and archived: Overall CPU/Memory Statistics; Multi-CPU Statistics; Detailed Process Statistics
535. vidual LPAR. For example, an entitlement of 0.5 indicates that an LPAR is assigned half of the processing power of a CPU. You can use the graphs to give you a clearer view of how much you may need to increase an LPAR's entitlement. Instead of using trial and error to determine optimum entitlements, you can use actual data to determine accurate entitlements. To generate an LPAR CPU Utilization graph, do the following: 1. In the Global Scan or My Infrastructure panel, click the name of the pSeries server that is hosting the LPAR whose information you want to graph. 2. In the Tree panel, click the Graphing tab. 3. Under the LPAR Workload heading, click Workload CPU Utilization. 4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22. 5. Select the name of the LPAR whose information you want to graph. If the message "There are no LPARs for this date range" is displayed, do one of the following: Click the Update List button, or change the date range. 6. Click Generate Graph. Network Graphs: Network graphs track the performance and reliability of your computing network. You can generate the following network graphs: I/O, Errors, and NetFlow. The I/O and Errors graphs use the same input criteria but return different data. NetFlow graphs are available if up
536. want to graph. 2. In the Tree panel, click the Graphing tab. 3. Click one of the following options: Number of Processes; Process Running, Blocked, Waiting; Process Creation Rate. 4. Select the start and end dates and times for which the graph will chart data. For more information, see Understanding Dates and Times on page 22. 5. Click Generate Graph. Graphing TCP Retransmits: The TCP Retransmits graph indicates whether or not data is being transmitted over a network. Using TCP, information is transmitted in pieces called packets. A packet consists of: A header, which contains transmission information such as the IP addresses of the sender and receiver, the protocol that is being used, and the packet number. A payload, which contains the data that is being sent. A trailer, which contains data that denotes the end of the packet, as well as error correction information. TCP retransmits indicate that certain network services may not be completing properly because of a high load on a network or a system. A lost packet can indicate network congestion, and requires the sender to reduce the transmission rate and to retransmit the packet. A slower transmission rate combined with retransmitted packets reduces network performance. Generating a TCP Retransmits Graph: To generate a TCP retransmits graph, do the following: 1. In the Global Scan or My Infrastructure pane
537. Defining and Managing Your Infrastructure: Working with Systems. CPU Utilization; Connection Usage; Available Memory; DS Thread Usage; Packet Receive Buffers; Available Event Control Blocks (ECBs); LAN Traffic; Available Disk Space; Disk Throughput. Each statistic returns one of the following statuses: Good: The statistic is well within the threshold suspect value. Suspect: The statistic is between the threshold good and critical values. Bad: The statistic is greater than the threshold critical value. Work To Do Response Time: This statistic enables you to view how processes share the CPU. The response time is the amount of time that a Work To Do process requires to run. If this statistic returns a value of Suspect, you can check the running threads to determine why there is a delay in the Work To Do threads. If the value is Bad, a thread is probably running more than it should, or it is hung. You should identify the parent NetWare Loadable Module and then unload and reload it, if possible. Allocated Service Processes: This statistic enables you to view, as a graph, how the service processes are allocated on your server. If the service processes are approaching the maximum, increase the value of the Maximum Server Processes Set parameter. If you have only a few available server processes, increase the M
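The Good, Suspect, and Bad statuses described above amount to a comparison against two threshold values. The following Python sketch shows one way to express that; the function `classify_statistic` and the sample thresholds are hypothetical, and the sketch assumes that a higher value is worse (statistics such as Available Memory would need the comparison reversed).

```python
def classify_statistic(value, suspect_threshold, critical_threshold):
    """Map a statistic to Good, Suspect, or Bad.

    Assumptions: higher values are worse, and suspect_threshold is lower
    than critical_threshold. A value exactly at the suspect threshold is
    treated as Suspect.
    """
    if suspect_threshold >= critical_threshold:
        raise ValueError("suspect threshold must be below critical threshold")
    if value > critical_threshold:
        return "Bad"
    if value >= suspect_threshold:
        return "Suspect"
    return "Good"


# Made-up CPU utilization readings with thresholds of 60% and 85%.
for reading in (40, 70, 90):
    print(reading, classify_statistic(reading, 60, 85))
```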
538. word: The password that is required to log in to the WebLogic server. Specify warning and critical thresholds for the appropriate WebLogic metrics. For more information about each metric, see page 204. Response Time: This is the length of time a service check takes to complete. For more information on using thresholds to set alerts, see Configuring Warning and Critical Thresholds on page 144. To save the data from the thresholds for graphing or reporting, click the Save for Graphing checkbox beside each of the metrics that you selected in the previous step. Complete the following settings: Timing Settings (see Adding Monitor Timing Settings Information on page 148 for more information); Alert Settings (see Monitor Alert Settings on page 148 for more information); Monitoring Period settings (see Monitoring Period Settings on page 150 for more information); Alert Profile settings (see Alert Profiles on page 381 for more information); Action Profile settings (see Action Profiles on page 389 for more information). Click Finish. Monitoring WebLogic 9 to 11: In order for up.time to collect information from a WebLogic 9, 10, or 11 server, the Internet Inter-ORB Protocol (IIOP) must be enabled on your WebLogic server. To prepare yo
539. work File System enables UNIX and Linux systems to share directories across a network. The NFS monitor can determine the performance of your NFS (Network File System) server and its ability to communicate with NFS clients by measuring the available NFS mounts. This monitor runs the showmount -e command to extract the number of NFS file systems that are exported. If the showmount command fails, then up.time generates an alert. Configuring NFS Monitors: To configure NFS monitors, do the following: 1. In the NFS monitor template, complete the monitor information fields. To learn how to configure monitor information fields, see Monitor Identification on page 141. 2. Complete the following fields: Mounts: Select a comparison method and then enter the Warning and Critical Mount thresholds for the number of mounts on which NFS is loaded. For more information, see Configuring Warning and Critical Thresholds on page 144. Response Time: Enter the Warning and Critical Response Time thresholds for the length of time a service check takes to complete. For more information, see Configuring Warning and Critical Thresholds on page 144. 3. Click the Save for Graphing checkbox to save the data for a metric to the DataStore, which can be used to generate a report or graph. 4. Complete the fo
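Counting exported file systems from showmount -e output can be sketched in Python. This is an illustration, not the monitor's actual implementation: the function `count_exports` is hypothetical, the sample output is made up, and the sketch assumes the common output layout of one "Export list for ..." header line followed by one line per export.

```python
def count_exports(showmount_output):
    """Count export entries in the text printed by `showmount -e`.

    Assumption: the output consists of an "Export list for <host>:" header
    followed by one line per exported file system. Blank lines are ignored.
    """
    lines = [line.strip() for line in showmount_output.splitlines()]
    return sum(
        1
        for line in lines
        if line and not line.lower().startswith("export list")
    )


# Made-up sample output for a hypothetical server named nfsserver.
SAMPLE_OUTPUT = """Export list for nfsserver:
/export/home 192.168.1.0/24
/export/data *
"""

print(count_exports(SAMPLE_OUTPUT))
```

In practice the text could come from running the command with `subprocess.run(["showmount", "-e", host], capture_output=True, text=True)`; a non-zero return code would correspond to the failure case in which the monitor generates an alert.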
540. y in the second Permissions area, enable one or more of the following options by clicking the Allowed checkbox: Administrator: The user can perform all up.time administration tasks. Acknowledge Alerts: The user can acknowledge an alert. See Understanding Alerts on page 378 for more information. Save Reports: The user can save reports. Links to the saved reports will appear in the My Portal panel, or the user can save reports to a local or network drive. See Saving Reports on page 404 for more information. 7. Click Save. Viewing User Roles: You can view a user role to ensure that the permissions for the role are properly configured. To view user roles, do the following: 1. In the Tree panel, click View User Roles. A list of the user roles appears in the Users subpanel. Clicking a user role displays a table that summarizes the role's configured permissions; those which have been granted are denoted by a green check mark. [Screenshot: permissions table listing Add, Edit, and Delete permissions for Users, Elements, Services, Element Groups, Action Profiles, Alert Profiles, Time Periods, and Service Level Agreements, followed by Allowed settings for Administrator, Acknowledge Alerts, and Save Reports.] Editing User Roles: To edit user roles, do the followi
541. y can also acknowledge an alert. When you acknowledge an alert, up.time: records the acknowledgement, which can be viewed in the Service Monitor Outages report; sends an acknowledgement message to any up.time user who received the last alert; turns off alert escalation, but continues monitoring the problem and only sends an alert when the status of the system or Application returns to OK. To acknowledge alerts, do the following: 1. In the Infrastructure panel, click the name of the Element that generated the alert. The System General Information subpanel appears. 2. In the Tree panel, click the Services tab and then click Status. Status information for the monitors associated with the Element appears in the subpanel. [Screenshot: status table with Status, Monitor, Duration, and Monitor Information columns.] 3. Click the Acknowledge icon in the Ack column. The acknowledgement message window appears, asking you to enter the reason for acknowledging the status. 4. Type a comment relating to the alert or why it has been acknowledged, and then click Submit. An email containing the following information is sent to any up ti
542. ystems in your environment. You can view this report by clicking the View Resource Scan tab. Resource Scan is divided into three sections: a set of performance gauges, 24-hour performance graphs, and an Elements chart. As you click through lists in the Resource Scan report, the status reported in the gauges and charts reflects your current view, whether it is focused on parent groups, nested groups, or individual Elements. Performance Gauges: There are two sets of gauges that are updated every 15 minutes with new data. The top row of gauges displays an average of the most recent 15-minute time frame; the bottom row of gauges displays a minimum, maximum, and average value for the last 24-hour period, up to the most recent 15-minute time frame. The gauges show the following information: CPU Usage: The percentage of the system's CPU resources that are being used. Memory Usage: The amount of memory, expressed as a percentage of total available memory, being consumed by a process. Disk Busy: The percentage of time that the disk is handling transactions in progress. Disk Capacity: The percentage of space on the system disk that is being used. 24 Hour Performance Graphs: The 24-hour gauges display a minimum, maximum, and average value; the full 24-hour performance history is displayed in the graphs below. CPU Usage last
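The two rows of gauges can be pictured as a summary over 15-minute samples: the top row shows the latest sample, and the bottom row shows the minimum, maximum, and average of the last 24 hours (96 samples). The Python sketch below is illustrative only; `gauge_summary` and the sample data are hypothetical.

```python
from statistics import mean

# 24 hours of 15-minute time frames.
SAMPLES_PER_DAY = 24 * 60 // 15


def gauge_summary(samples):
    """Summarize 15-minute averages the way the two gauge rows are described.

    `samples` is a hypothetical list of per-time-frame averages, oldest
    first. Only the most recent 24 hours of samples are used.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    recent = samples[-SAMPLES_PER_DAY:]
    return {
        "latest": recent[-1],   # top row: most recent 15-minute average
        "min": min(recent),     # bottom row: 24-hour minimum
        "max": max(recent),     # bottom row: 24-hour maximum
        "avg": mean(recent),    # bottom row: 24-hour average
    }


# Made-up CPU usage: a quiet day that ends with one busy time frame.
cpu_usage = [20.0] * 95 + [80.0]
summary = gauge_summary(cpu_usage)
print(summary)
```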
543. zation's reaction time with virtual systems management, and map established policies to automated actions. When configuring Action Profiles, up.time communicates with Orchestrator and dynamically produces a list of all available workflows. This includes any third-party workflow packages that have been installed on the Orchestrator server, including the up.time Orchestrator package. When a workflow is selected and the Get Parameters button is clicked, the corresponding input parameter fields are dynamically displayed, allowing you to specify parameter values required to completely configure the workflow for execution, should an up.time alert initiate it. Orchestrator Input Parameter Variables: When configuring a VMware vCenter Orchestrator workflow, you have at your disposal a set of up.time-specific variables that can be entered as parameter variables, and whose ensuing runtime values will be passed to the Orchestrator workflow during execution. The variables available to you are those that are used when creating a custom alert format. See Custom Alert Format Variables on page 386 for information. SNMP Trap Actions: You can also configure an Action Profile to send an SNMP trap to a particular host. An SNMP trap is a notification that is issued by a system that is running SNMP when a problem occurs. The host to which the SNMP trap is sent must be running an SNMP trap listener.
