Tuesday, March 14, 2017

Shooting Troubles Down: Troubleshooting Support for UltraESB-X

The troubleshoot reporting support that was available in UltraESB Legacy via JMX interfaces is available in the new UltraESB-X as well. It is now exposed as a set of HTTP endpoints, as part of the management API. In addition to invocation and status checks, it also supports directly downloading the accumulated troubleshoot information as a Zip archive.

The UltraESB-X troubleshoot reporting framework generates diagnostic reports of an ESB instance, covering both filesystem state (configuration files, logs, etc.) and in-memory state (thread and heap dumps, environment variables, etc.). Reporting options are provided as tasks on the instance, and a reporting cycle involves running a subset of the available tasks to generate a report archive containing the accumulated data.

The troubleshoot reporter is accessible via the following endpoints on the management API:

  1. GET /troubleshoot/tasks lists all troubleshoot reporting tasks made available by the instance, with details of any parameters they can accept
  2. POST /troubleshoot/start starts a troubleshoot reporting session on the server asynchronously, accepting a JSON-formatted task execution configuration that includes the following parameters:
    • credentialMask: regular expression for credential masking
    • credentialPassword: password to be used in credential masking
    • keyFactoryName: name of the secret key factory for credential masking
    • cipherName: name of the cipher suite for credential masking
    • taskParamMap: a JSON map with keys denoting the queue of tasks to be executed, mapped to any user-specified task parameter values
    • filePath: absolute path (optionally including file name) where the report should be saved on the target instance
  3. POST /troubleshoot/run is similar to /troubleshoot/start, but executes the task queue synchronously (the call blocks, and returns the final execution results upon completion).
  4. GET /troubleshoot/status returns the current status of the troubleshoot session (e.g. RUNNING, SUCCESS, FAILED)
  5. GET /troubleshoot/current returns the status of the currently executing troubleshoot reporting task
  6. GET /troubleshoot/summary returns a report summarizing the last executed task session, including the session start timestamp, and the duration and final status of each task
  7. GET /troubleshoot/report downloads the generated reporting archive as an application/octet-stream
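
As a rough illustration of how these endpoints fit together, the following Python sketch starts a session asynchronously, polls the status, and downloads the archive once the session succeeds. It is only a sketch under stated assumptions: the base URL (localhost:8085 with a /management prefix) and the {"msg": ...} status format are taken from the example trace in this post, the Authorization value is supplied by the caller because the authentication scheme is not described here, and the function names call, wait_for_completion and generate_report are hypothetical.

```python
import json
import time
import urllib.request

# Assumption: host, port and the /management prefix as seen in the example trace.
BASE_URL = "http://localhost:8085/management"


def call(method, path, payload=None, auth=None):
    """Send one request to the management API and return the raw response body."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    headers = {"Accept": "application/json"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    if auth is not None:
        # Assumption: the caller passes a ready-made Authorization header value.
        headers["Authorization"] = auth
    request = urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def wait_for_completion(get_status, interval=1.0, timeout=300.0, sleep=time.sleep):
    """Poll get_status() until it reports something other than RUNNING."""
    waited = 0.0
    while True:
        status = get_status()
        if status != "RUNNING":
            return status
        if waited >= timeout:
            raise TimeoutError("session still RUNNING after %.0f s" % timeout)
        sleep(interval)
        waited += interval


def generate_report(config, local_path, auth=None):
    """Start a session, wait for it to finish, then download the archive."""
    call("POST", "/troubleshoot/start", config, auth)

    def get_status():
        return json.loads(call("GET", "/troubleshoot/status", auth=auth))["msg"]

    status = wait_for_completion(get_status)
    if status == "SUCCESS":
        with open(local_path, "wb") as archive:
            archive.write(call("GET", "/troubleshoot/report", auth=auth))
    return status
```

Against a running instance, generate_report(config, "report.zip") would be called with a configuration dictionary holding the parameters listed under /troubleshoot/start. The polling loop takes the status getter and the sleep function as arguments so that it can be exercised without a server.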

Currently the following troubleshoot tasks are available on UltraESB-X:

  • conf-dir archives the configuration directory ($X_HOME/conf)
  • logs archives the logs directory ($X_HOME/logs) along with currently generated logs
  • projects-dir archives the projects directory ($X_HOME/projects)
  • lib-dir-info lists the files residing in the library directory ($X_HOME/lib) and its main subdirectories
  • lib-custom archives the $X_HOME/lib/custom directory, useful when some custom libraries added to the ESB seem to be causing issues
  • lib-patches archives the $X_HOME/lib/patches directory, useful to determine the patch versions installed on a particular ESB instance
  • detailed-logs is similar to logs, but changes the log level of a particular ESB logger over a given time interval before collecting the logs. This can be useful to diagnose a particular component via logging. Accepts parameters:
    • logger: fully qualified name of the logger whose level should be adjusted
    • level: level to which the logger should be updated temporarily
    • period: the duration (seconds) over which the modified log level should be maintained
  • thread-dump generates a series of thread dumps of the JVM running the UltraESB-X instance. This can be useful for analyzing possible threading-related issues such as deadlocks. Accepts parameters:
    • count: number of dumps to be generated
    • period: time (in seconds) between two dumps
  • heap-dump generates a heap memory dump of the JVM running the UltraESB-X instance
  • sys-info saves information about the system and the JVM running the UltraESB-X instance, as a plain-text file
  • esb-info saves information about the UltraESB-X instance as a plain-text file
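
To make the relationship between task parameters and the taskParamMap more concrete, here is a small Python sketch that builds a /troubleshoot/start configuration from a task listing. It assumes the listing has the shape returned by GET /troubleshoot/tasks in the example trace in this post (an id plus a parameters map with a defaultValue per parameter). Whether the server applies defaults on its own when a parameter is omitted is not stated here, so the sketch fills them in explicitly. The function name build_start_config and the trimmed TASKS sample are illustrative only.

```python
# Trimmed sample of a task listing, keeping only the fields used below.
TASKS = [
    {
        "id": "thread-dump",
        "parameters": {
            "period": {"configurable": True, "defaultValue": "5"},
            "count": {"configurable": True, "defaultValue": "3"},
        },
    },
    {"id": "logs", "parameters": {}},
]


def build_start_config(task_listing, requested, file_path):
    """Build a start configuration, filling unspecified parameters with defaults."""
    known = {task["id"]: task for task in task_listing}
    task_param_map = {}
    for task_id, overrides in requested.items():
        if task_id not in known:
            raise ValueError("unknown troubleshoot task: " + task_id)
        declared = known[task_id]["parameters"]
        unexpected = sorted(set(overrides) - set(declared))
        if unexpected:
            raise ValueError("task %s does not accept %s" % (task_id, unexpected))
        # Start from the advertised defaults, then apply the caller's overrides.
        params = {name: spec["defaultValue"] for name, spec in declared.items()}
        # Parameter values are sent as strings, as in the example trace.
        params.update({name: str(value) for name, value in overrides.items()})
        task_param_map[task_id] = params
    return {
        "cipherName": None,
        "credentialMask": None,
        "credentialPassword": None,
        "filePath": file_path,
        "keyFactoryName": None,
        "taskParamMap": task_param_map,
    }


config = build_start_config(
    TASKS, {"thread-dump": {"count": 2}, "logs": {}}, "/tmp/thread-log.zip"
)
```

Python dictionaries keep insertion order, so the tasks are serialized in the order they were requested. Rejecting unknown task ids and parameter names on the client side is a design choice of this sketch, not something the API is known to require.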

An example HTTP trace of invoking the reporter with the tasks thread-dump and logs would look like this:

GET /management/troubleshoot/tasks HTTP/1.1
Accept: application/json
Authorization: 
User-Agent: Jersey/2.7 (HttpUrlConnection 1.8.0_65)
Host: localhost:8085
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 2761
Server: Jetty(9.2.3.v20140905)

[{"description":"archives the UltraESB projects directory, optionally with credential hiding","id":"projects-dir","intensive":false,"maskable":true,"name":"Projects Directory Archival","parameters":{}},{"description":"archives the UltraESB configuration directory, optionally with credential hiding","id":"conf-dir","intensive":false,"maskable":true,"name":"Configuration Directory Archival","parameters":{}},{"description":"archives the UltraESB custom libraries directory","id":"lib-custom","intensive":false,"maskable":false,"name":"Custom Library Archival","parameters":{}},{"description":"produces thread dumps of the JVM running the UltraESB instance","id":"thread-dump","intensive":true,"maskable":false,"name":"JVM Thread Dump","parameters":{"period":{"configurable":true,"defaultValue":"5","description":"period (in seconds) between thread dumps","name":"period"},"count":{"configurable":true,"defaultValue":"3","description":"number of thread dump samples","name":"count"}}},{"description":"takes a detailed log sample over a given time by escalating log levels of specified loggers","id":"detailed-logs","intensive":true,"maskable":true,"name":"Detailed Log Sampling","parameters":{"period":{"configurable":true,"defaultValue":"60","description":"period (in seconds) over which to collect detailed logs","name":"period"},"level":{"configurable":true,"defaultValue":"DEBUG","description":"level to which logging should be escalated for sampling","name":"level"},"logger":{"configurable":true,"defaultValue":"org.adroitlogic","description":"logger whose level should be escalated for sampling","name":"logger"}}},{"description":"Lists the files reside inside the library directory and its main sub directories","id":"lib-dir-info","intensive":false,"maskable":false,"name":"Library Directory Information","parameters":{}},{"description":"archives the log files currently residing in the UltraESB instance, optionally with credential hiding","id":"logs","intensive":false,"maskable":true,"name":"Log Archival","parameters":{}},{"description":"produces a heap dump of the JVM running the UltraESB instance","id":"heap-dump","intensive":true,"maskable":false,"name":"JVM Heap Dump","parameters":{}},{"description":"archives the UltraESB patch libraries directory","id":"lib-patches","intensive":false,"maskable":false,"name":"Patch Library Archival","parameters":{}},{"description":"produces a detailed version and status information report of the UltraESB instance","id":"esb-info","intensive":false,"maskable":true,"name":"UltraESB Information","parameters":{}},{"description":"produces a detailed report on the system and JVM running the UltraESB instance","id":"sys-info","intensive":false,"maskable":true,"name":"System Information","parameters":{}}]


POST /management/troubleshoot/start HTTP/1.1
Accept: application/json
Authorization: 
Content-Type: application/json
User-Agent: Jersey/2.7 (HttpUrlConnection 1.8.0_65)
Host: localhost:8085
Connection: keep-alive
Content-Length: 190

{"cipherName":null,"credentialMask":null,"credentialPassword":null,"filePath":"/tmp/thread-log.zip","keyFactoryName":null,"taskParamMap":{"thread-dump":{"period":"5","count":"2"},"logs":{}}}

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 61
Server: Jetty(9.2.3.v20140905)

{"msg":"Troubleshoot reporting session started successfully"}


GET /management/troubleshoot/summary HTTP/1.1
Accept: application/json
Authorization: 
User-Agent: Jersey/2.7 (HttpUrlConnection 1.8.0_65)
Host: localhost:8085
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 278
Server: Jetty(9.2.3.v20140905)

{"outputPath":"/tmp/thread-log.zip","results":[{"duration":0,"extraInfo":null,"lastUpdated":1489424639250,"status":"SUCCESS","taskId":"initializer"},{"duration":0,"extraInfo":null,"lastUpdated":1489424639250,"status":"RUNNING","taskId":"thread-dump"}],"startTime":1489424639250}


GET /management/troubleshoot/status HTTP/1.1
Accept: application/json
Authorization: 
User-Agent: Jersey/2.7 (HttpUrlConnection 1.8.0_65)
Host: localhost:8085
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 17
Server: Jetty(9.2.3.v20140905)

{"msg":"RUNNING"}


GET /management/troubleshoot/summary HTTP/1.1
Accept: application/json
Authorization: 
User-Agent: Jersey/2.7 (HttpUrlConnection 1.8.0_65)
Host: localhost:8085
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 478
Server: Jetty(9.2.3.v20140905)

{"outputPath":"/tmp/thread-log.zip","results":[{"duration":0,"extraInfo":null,"lastUpdated":1489424639250,"status":"SUCCESS","taskId":"initializer"},{"duration":5075,"extraInfo":null,"lastUpdated":1489424644325,"status":"SUCCESS","taskId":"thread-dump"},{"duration":1,"extraInfo":null,"lastUpdated":1489424644326,"status":"SUCCESS","taskId":"logs"},{"duration":2,"extraInfo":null,"lastUpdated":1489424644328,"status":"SUCCESS","taskId":"zip_creator"}],"startTime":1489424639250}


GET /management/troubleshoot/status HTTP/1.1
Accept: application/json
Authorization: 
User-Agent: Jersey/2.7 (HttpUrlConnection 1.8.0_65)
Host: localhost:8085
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: application/json
Content-Length: 17
Server: Jetty(9.2.3.v20140905)

{"msg":"SUCCESS"}


GET /management/troubleshoot/report HTTP/1.1
Accept: application/json
Authorization: 
User-Agent: Jersey/2.7 (HttpUrlConnection 1.8.0_65)
Host: localhost:8085
Connection: keep-alive

HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Disposition: attachment; filename=thread-log.zip
Content-Length: 7117
Server: Jetty(9.2.3.v20140905)

[binary ZIP content]
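
The summary responses in the trace can be turned into something more readable with a few lines of Python. The sketch below parses the final summary shown above and extracts the start time and per-task results. It assumes that startTime and lastUpdated are epoch timestamps in milliseconds and that duration is in milliseconds as well. This is inferred from the trace (the thread-dump duration of 5075 equals the difference between consecutive lastUpdated values) rather than documented here. The function name parse_summary is hypothetical.

```python
import json
from datetime import datetime, timezone

# The final /troubleshoot/summary response body from the trace above.
SUMMARY_JSON = (
    '{"outputPath":"/tmp/thread-log.zip","results":['
    '{"duration":0,"extraInfo":null,"lastUpdated":1489424639250,'
    '"status":"SUCCESS","taskId":"initializer"},'
    '{"duration":5075,"extraInfo":null,"lastUpdated":1489424644325,'
    '"status":"SUCCESS","taskId":"thread-dump"},'
    '{"duration":1,"extraInfo":null,"lastUpdated":1489424644326,'
    '"status":"SUCCESS","taskId":"logs"},'
    '{"duration":2,"extraInfo":null,"lastUpdated":1489424644328,'
    '"status":"SUCCESS","taskId":"zip_creator"}],'
    '"startTime":1489424639250}'
)


def parse_summary(summary):
    """Return the session start time (UTC) and (taskId, status, duration) rows."""
    # Assumption: timestamps are milliseconds since the Unix epoch.
    started = datetime.fromtimestamp(summary["startTime"] / 1000, tz=timezone.utc)
    tasks = [(r["taskId"], r["status"], r["duration"]) for r in summary["results"]]
    return started, tasks


started, tasks = parse_summary(json.loads(SUMMARY_JSON))
print("session started:", started.isoformat())
for task_id, status, duration in tasks:
    # Assumption: durations are reported in milliseconds.
    print("%-12s %-8s %5d ms" % (task_id, status, duration))
```

Besides the requested thread-dump and logs tasks, the results include initializer and zip_creator entries, so a client that reports on requested tasks only may want to filter those out.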

Unless you are accessing the management API via a custom-built HTTP client, you can simply use the UXTerm command line interface to access the troubleshoot reporter (and, in fact, many of the other management API operations) quite conveniently, without worrying about low-level details. We are also working on a Java client for the API, so that in the not-too-distant future, integrating the management API with your own system will take just a few lines of plain old Java code.
