[Mondrian] Mondrian system metrics via JMX
andrew.yim at truvenhealth.com
Tue Apr 29 12:26:22 EDT 2014
This is great and we'll definitely look into utilizing this feature as we're headed towards using JMX as a monitoring tool for all our applications.
However, one of the questions you list below is of particular interest to us: "What MDX query generated what SQL queries?" We attempted to answer it by hooking Mondrian up to SLF4J/Logback and setting an MDC parameter on each request. We hoped that when the MDX and SQL queries were logged we could capture the MDC parameter we had set (MDC values are bound to the thread that sets them), but since the MDX and SQL are executed in a separate thread pool, this does not work as we expected.
There are also a few methods on the MondrianServer monitor that we tried calling programmatically, specifically "getSqlStatements()", but whenever we called it the Map was already empty. It seems that each SQL statement is removed from the Map as soon as it finishes executing. I assume this is done on purpose to limit memory consumption.
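To illustrate why the thread-bound value gets lost, here is a minimal sketch using only the JDK. A plain ThreadLocal stands in for the MDC entry, and a single-thread executor stands in for the pool that runs the SQL. The class, the REQUEST_ID variable and the withContext helper are all hypothetical names made up for this example. Wrapping tasks like this only helps where the calling code controls task submission, which may not be the case for Mondrian's internal executors.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ContextPropagationSketch {
    // Stand-in for an MDC entry: visible only to the thread that set it.
    static final ThreadLocal<String> REQUEST_ID = new ThreadLocal<>();

    // Hypothetical helper: captures the submitter's value and installs it
    // on the worker thread for the duration of the task.
    static <T> Callable<T> withContext(Callable<T> task) {
        final String captured = REQUEST_ID.get(); // read on the submitting thread
        return () -> {
            String previous = REQUEST_ID.get();
            REQUEST_ID.set(captured);
            try {
                return task.call();
            } finally {
                // Restore the worker thread's earlier state so values do not leak.
                if (previous == null) {
                    REQUEST_ID.remove();
                } else {
                    REQUEST_ID.set(previous);
                }
            }
        };
    }

    // Returns what the worker thread sees without and with the wrapper.
    static String[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            REQUEST_ID.set("req-42");
            Callable<String> logSql = () -> "sql tagged with " + REQUEST_ID.get();
            String plain = pool.submit(logSql).get();
            String wrapped = pool.submit(withContext(logSql)).get();
            return new String[] {plain, wrapped};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            REQUEST_ID.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

The unwrapped task sees no request id because it runs on a different thread. The wrapped task sees it because the value was copied at submission time.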
If there could be a vote for getting this type of enhancement added to the monitor sooner, we would definitely be in favor!
Have you heard of any implementation ideas for this, or is something already in the works?
Andrew Yim | Truven Health Analytics | O: 734.913.3174 | M: 734.347.8669
From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On Behalf Of Matt Campbell
Sent: Tuesday, April 15, 2014 5:06 PM
To: mondrian at pentaho.org
Subject: [Mondrian] Mondrian system metrics via JMX
I've recently committed support for accessing Mondrian's monitor via JMX in both the lagunitas and master branches. Details of how to set up connections and what metrics are available are at http://wiki.pentaho.com/display/analysis/Monitoring+Mondrian+System+Metrics+with+Java+Management+Extensions+%28JMX%29.
There have been minor renames of some methods in the various *Info classes for consistency with JMX expectations. For example, ServerInfo.cellCacheMissCount() has become .getCellCacheMissCount(). So if you were formerly accessing this information programmatically, you may need to update these references.
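As background on why the getter-style names matter, here is a minimal sketch using only the JDK's javax.management classes. In a standard MBean interface, a method named getXxx() is exposed as a readable attribute "Xxx", while a no-argument method without the "get" prefix is exposed only as an operation. The DemoInfo class, its interface, the legacyStyleCount() method, the returned values and the ObjectName are all hypothetical and are not Mondrian's actual MBeans.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;
import javax.management.MBeanAttributeInfo;
import javax.management.MBeanInfo;
import javax.management.MBeanOperationInfo;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class JmxGetterSketch {
    // Hypothetical management interface (standard MBean naming: <Class>MBean).
    public interface DemoInfoMBean {
        int getCellCacheMissCount(); // getter style: becomes an attribute
        int legacyStyleCount();      // no "get" prefix: becomes an operation
    }

    // Hypothetical implementation returning fixed values for the demo.
    public static final class DemoInfo implements DemoInfoMBean {
        @Override public int getCellCacheMissCount() { return 7; }
        @Override public int legacyStyleCount() { return 3; }
    }

    // Registers the bean, then reports how JMX classified each method.
    static String describe() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("sketch.demo:type=DemoInfo");
            if (server.isRegistered(name)) {
                server.unregisterMBean(name);
            }
            server.registerMBean(new DemoInfo(), name);
            try {
                MBeanInfo info = server.getMBeanInfo(name);
                List<String> attributes = new ArrayList<>();
                for (MBeanAttributeInfo a : info.getAttributes()) {
                    attributes.add(a.getName());
                }
                List<String> operations = new ArrayList<>();
                for (MBeanOperationInfo o : info.getOperations()) {
                    operations.add(o.getName());
                }
                // Attributes are read by name, the way a JMX console would.
                Object value = server.getAttribute(name, "CellCacheMissCount");
                return "attributes=" + attributes
                        + " operations=" + operations
                        + " value=" + value;
            } finally {
                server.unregisterMBean(name);
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```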
I'll be interested to see whether having easier access to this information increases usage of the monitoring stats. Putting on an administrator's hat, I think there is a fair amount of information that seems interesting and valuable: cache hit/miss counts, SQL counts, and the number of statements currently executing. There are also a lot of questions an administrator might ask that cannot be answered yet:
1) How many MDX queries have failed? How many have been cancelled? That's the sort of information that would be great to tie to an alerting threshold in a tool like Nagios.
2) Are there any "hung" queries (i.e. running for longer than N)?
3) What's the breakdown of time spent executing SQL versus Mondrian execution time?
4) What MDX query generated what SQL queries?
5) What is the aggregate time spent in statement execution?
I can also imagine an admin would want the ability to reset running totals via JMX.
I'd love to see more use cases identified and entered as Jira tickets.