Overview

This section describes the troubleshooting tools available for RTView. If an RTView application is not performing as expected, investigate the following issues, which are the most critical and the most commonly encountered:

Display Server: Check the Display Server CPU usage. High CPU usage is the most common Display Server issue because the application processes many user requests.

Data Server and Historian: Check both the amount of memory available and the amount of memory used on the Data Server and the Historian. The amount of memory used grows as these applications run, so the amount of memory originally designated might no longer be adequate. If the amount of memory used is close to the amount designated, consider making more memory available.

Note: When troubleshooting, it is helpful to have performance metrics for both before and after a failure.
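
The memory check described above for the Data Server and Historian can be sketched in Java as a comparison of heap used against the maximum heap available to a JVM. This is an illustration only and is not part of RTView: the class name HeapHeadroom and the 85% warning threshold are assumptions chosen for the example, and the code reports on the JVM it runs in. To read the same values from a running RTView server, you would obtain the MemoryMXBean over a JMX connection instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

final class HeapHeadroom {
    // Hypothetical threshold: warn when used heap reaches 85% of the maximum.
    static final double WARN_FRACTION = 0.85;

    /** Returns used/max as a fraction, or -1 when the maximum is undefined. */
    static double usedFraction(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0; // MemoryUsage.getMax() reports -1 when no maximum is defined
        }
        return (double) usedBytes / (double) maxBytes;
    }

    /** True when heap used is close to the amount designated. */
    static boolean nearLimit(long usedBytes, long maxBytes) {
        return usedFraction(usedBytes, maxBytes) >= WARN_FRACTION;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();

        long usedMb = heap.getUsed() / (1024L * 1024L);
        long maxMb = heap.getMax() / (1024L * 1024L);
        System.out.println("Heap used: " + usedMb + " MB, max: " + maxMb + " MB");

        if (nearLimit(heap.getUsed(), heap.getMax())) {
            System.out.println("Heap used is close to the maximum; consider making more memory available.");
        }
    }
}
```

Recording these values shortly after startup and again when a problem appears gives the before and after metrics that the note above recommends.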

Troubleshooting Tools

The following tools are available for troubleshooting your RTView system.

“RTView Monitor” - View status, performance metrics, and memory allocation and utilization for all licensed RTView Data Servers, Display Servers, Historian applications and Tomcat applications in your system. Use the RTView Monitor to determine how an RTView application is utilizing its memory.

“Java Tools” - SL recommends that you install JDK v1.6 on systems that run RTView applications. Use tools such as JConsole, Jmap, and JPS to gather more complete information for troubleshooting your RTView system.

“Log Files” - View RTView application output and error streams while installing an RTView application or while operating in the production environment.

RTViewDs - Query the health of your data adapter and debug it. Use this debugging tool if you encounter problems accessing your data. RTViewDs provides connectivity and response time information to help you determine whether the problem lies in accessing your data source or in a data source that is slow to respond. The following data sources support RTViewDs:

        JMX: Provides database connection information. See “RTViewDs” in the “Attach to JMX Data” section for more information.

        SNMP: Provides metrics for SNMP requests completed and traps received per connection. See “RTViewDs” in the “Attach to SNMP Data” section for more information.

        SQL: Provides two tables containing metrics about SQL database connections and queries. See “RTViewDs” in the “Attach to SQL Data” section for more information.

        XML: Provides information about the listeners currently active in the XML data source. Use this information to troubleshoot Data Server clients, as the XML data source handles listeners for ALL RTView data attachments that are redirected to a Data Server. See “RTViewDs” in the “Attach to XML Data” section for more information.

        TIBCO Rendezvous: Provides RTViewDs subject fields containing TIBCO Rendezvous monitoring information. See “TIBCO Rendezvous RTViewDs Fields” in the “Attach to TIBCO Rendezvous Data” section for more information.

        TIBCO Hawk: Provides agent information, such as agent connection and status information, alert information and IP address. See “RTViewDs” in the “Attach to TIBCO Hawk Data” section for more information.

        Caches: Provides cache configuration and runtime cache table size information for each table maintained by the cache data source. See “RTViewDs” in the “Attach to Cache Data” section for more information.

        RTVAgent: Provides an RTViewDs.agents table containing a list of currently connected agents. See “RTViewDs” in the “Attach to RTVAgent Data” section for more information.

JMX MBeans - The Data Server is instrumented with JMX, including the following MBeans:

        “RTView:name=Troubleshooting” MBean: Use this MBean to troubleshoot an unresponsive server.

        “RTViewDataServer:name=Manager” MBean: Use this MBean to manage and monitor clients and application settings.
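
Because these MBeans are exposed through standard JMX, any JMX client (JConsole, for example) can browse them. As a minimal sketch, the following Java program lists the attributes and operations that an MBean exposes. The two object names are taken from the list above; everything else is an assumption made for illustration: the class name MBeanInspector is hypothetical, the host:port argument must match however JMX remote access was enabled on your Data Server, and authentication is not handled. The attributes and operations that the RTView MBeans actually provide are not listed here, so the program discovers them at run time rather than naming them.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.JMException;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanOperationInfo;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

final class MBeanInspector {
    /** Lists the attribute and operation names exposed by one MBean. */
    static List<String> describe(MBeanServerConnection conn, String objectName) {
        List<String> lines = new ArrayList<String>();
        try {
            MBeanInfo info = conn.getMBeanInfo(new ObjectName(objectName));
            for (MBeanAttributeInfo attribute : info.getAttributes()) {
                lines.add("attribute " + attribute.getName());
            }
            for (MBeanOperationInfo operation : info.getOperations()) {
                lines.add("operation " + operation.getName());
            }
        } catch (JMException e) {
            throw new IllegalStateException("Cannot inspect " + objectName, e);
        } catch (IOException e) {
            throw new IllegalStateException("Connection problem while inspecting " + objectName, e);
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            // No host:port given: demonstrate against this JVM's own Memory MBean.
            for (String line : describe(ManagementFactory.getPlatformMBeanServer(), "java.lang:type=Memory")) {
                System.out.println(line);
            }
            return;
        }
        // Standard JMX-over-RMI URL; args[0] is host:port (an assumption about your setup).
        JMXServiceURL url = new JMXServiceURL("service:jmx:rmi:///jndi/rmi://" + args[0] + "/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            String[] names = {"RTView:name=Troubleshooting", "RTViewDataServer:name=Manager"};
            for (String name : names) {
                System.out.println(name);
                for (String line : describe(conn, name)) {
                    System.out.println("  " + line);
                }
            }
        } finally {
            connector.close();
        }
    }
}
```

Run without arguments, the program inspects the local JVM so that you can confirm it works before pointing it at a Data Server.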

Troubleshooting Steps

If an RTView Server has become unresponsive or slow, this might be due to heavy CPU load or memory usage on the host system. Perform the following steps:

1.     Observe overall system performance on the host machine by checking CPU and memory usage. You can use:

        Windows Task Manager

        a Linux utility, such as top

2.     Observe RTView application performance on the host machine. Identify the RTView application consuming the most resources (the largest CPU load or memory usage), then determine what the application is doing while consuming them. You can use:

        “RTView Monitor”

        “Java Tools”

Note the following:

        Overall CPU usage. Is it consistently high? If so, which applications are the biggest CPU users?

        Available memory. Which applications are the biggest memory users?

        Compare system performance shortly after the RTView Server is started to system performance at the onset of the problem. Has the overall CPU load increased noticeably? Has the available memory decreased significantly?

Note: Heap usage might go up and down between garbage collection intervals but, overall, the heap size should be stable. An exception to this is an RTView server that runs the RTView cache data source. In that case, heap usage grows until each cache history table reaches its maximum row count or the value specified for the timeSpan property. See “Caches” for more information.

3.     Obtain RTView application logs. See “Log Files” for more information.

4.     If the issue involves the RTView Display Server, obtain Tomcat logs.

Note: Tomcat logs can also be useful if the RTView Data Servlet is used to provide HTTP access to the Data Server.

5.     Investigate using the “RTView:name=Troubleshooting” MBean.

6.     At this point you can contact SL Support (support@sl.com) or continue to the next step.

7.     If an RTView server is still running, perform multiple application “Thread Dumps”.

8.     If an RTView server is still running and memory consumption is an issue, generate multiple Jmap log reports. See “Jmap Utility” for more information.

9.     Provide all information to SL Support and, if possible, wait to cycle (restart) the servers until the information has been reviewed.
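
To illustrate why the steps above call for multiple thread dumps rather than one, the following Java sketch takes a short series of dumps at a fixed interval using the standard ThreadMXBean. Threads that appear in the same stack frames across every dump in the series are likely candidates for being blocked or stuck. This is not an RTView utility: the class name ThreadDumpSeries, the count of three dumps, and the five-second interval are assumptions chosen for the example, and the code dumps the threads of the JVM it runs in. For a running RTView server you would normally use the JDK tools mentioned under “Java Tools”, or obtain the ThreadMXBean over a JMX connection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

final class ThreadDumpSeries {
    /** Formats one dump: name, state, and stack frames of every live thread. */
    static String dumpOnce(ThreadMXBean threads) {
        StringBuilder out = new StringBuilder();
        // false, false: omit monitor and synchronizer details, which not every JVM supports.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info == null) {
                continue; // defensive: skip a thread that ended during the dump
            }
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    /** Takes up to count dumps, pausing intervalMillis between consecutive dumps. */
    static List<String> dumpSeries(ThreadMXBean threads, int count, long intervalMillis) {
        List<String> dumps = new ArrayList<String>();
        for (int i = 0; i < count; i++) {
            if (i > 0) {
                try {
                    Thread.sleep(intervalMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt and stop early
                    break;
                }
            }
            dumps.add(dumpOnce(threads));
        }
        return dumps;
    }

    public static void main(String[] args) {
        // Hypothetical settings: three dumps, five seconds apart.
        List<String> dumps = dumpSeries(ManagementFactory.getThreadMXBean(), 3, 5000L);
        for (int i = 0; i < dumps.size(); i++) {
            System.out.println("===== Thread dump " + (i + 1) + " of " + dumps.size() + " =====");
            System.out.println(dumps.get(i));
        }
    }
}
```

Saving each dump with a timestamp makes the series easier to compare and to include with the other information provided to SL Support.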