[Mondrian] Mondrian system metrics via JMX
mcampbell at pentaho.com
Tue Apr 15 17:05:58 EDT 2014
I've recently committed support for accessing Mondrian's monitor via JMX in both the lagunitas and master branches. Details of how to set up connections and which metrics are available are at http://wiki.pentaho.com/display/analysis/Monitoring+Mondrian+System+Metrics+with+Java+Management+Extensions+%28JMX%29.
Some methods in the various *Info classes have been renamed for consistency with JMX expectations. For example, ServerInfo.cellCacheMissCount() has become ServerInfo.getCellCacheMissCount(). If you were accessing this information programmatically, you may need to update those references.
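To illustrate why the get prefix matters: a standard MBean exposes an attribute only for methods that follow the getX naming pattern. The sketch below is not Mondrian code. DemoServerInfo, DemoServerInfoMBean and the example.demo object name are hypothetical stand-ins, registered on the local platform MBean server purely to show the convention.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxGetterDemo {
    // Hypothetical management interface (not Mondrian's real MBean).
    // JMX derives the attribute name "CellCacheMissCount" from the getter name.
    public interface DemoServerInfoMBean {
        long getCellCacheMissCount();
    }

    // Hypothetical implementation holding a fixed value for the demo.
    public static class DemoServerInfo implements DemoServerInfoMBean {
        private final long misses;

        public DemoServerInfo(long misses) {
            this.misses = misses;
        }

        @Override
        public long getCellCacheMissCount() {
            return misses;
        }
    }

    // Registers the demo bean, reads the attribute back through JMX, and cleans up.
    public static long readMissCount(long misses) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Assumed object name, chosen only for this example.
        ObjectName name = new ObjectName("example.demo:type=DemoServerInfo");
        server.registerMBean(new DemoServerInfo(misses), name);
        try {
            return (Long) server.getAttribute(name, "CellCacheMissCount");
        } finally {
            server.unregisterMBean(name);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(readMissCount(42L));
    }
}
```

A method named cellCacheMissCount() on the interface would not be treated as an attribute getter, which is the kind of mismatch the renames avoid.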
I'll be interested to see whether having easier access to this information increases usage of the monitoring stats. Putting on an administrator's hat, I think a fair amount of the information already exposed is interesting and valuable: cache hit/miss counts, SQL counts, and the number of statements currently executing. There are also a lot of questions an administrator might ask that cannot be answered yet:
1) How many MDX queries have failed? How many have been cancelled? That's the sort of information that would be great to tie to an alerting threshold in a tool like Nagios.
2) Are there any "hung" queries (i.e. running for longer than N)?
3) What's the breakdown of time spent executing SQL versus Mondrian execution time?
4) What MDX query generated what SQL queries?
5) What is the aggregate time spent in statement execution?
I can also imagine an admin would want the ability to reset running totals via JMX.
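As a rough idea of what a resettable total could look like, here is a minimal sketch of an MBean that exposes a counter as an attribute and a reset() operation invoked through JMX. Nothing here exists in Mondrian: DemoCounter, DemoCounterMBean, the StatementCount attribute and the object name are all assumptions made for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.concurrent.atomic.AtomicLong;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class JmxResetDemo {
    // Hypothetical interface: one read-only attribute plus one operation.
    public interface DemoCounterMBean {
        long getStatementCount();
        void reset();
    }

    // Hypothetical running total that can be zeroed by an administrator.
    public static class DemoCounter implements DemoCounterMBean {
        private final AtomicLong count = new AtomicLong();

        public void increment() {
            count.incrementAndGet();
        }

        @Override
        public long getStatementCount() {
            return count.get();
        }

        @Override
        public void reset() {
            count.set(0L);
        }
    }

    // Returns the count observed before and after invoking reset() via JMX.
    public static long[] countBeforeAndAfterReset(int increments) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        // Assumed object name, chosen only for this example.
        ObjectName name = new ObjectName("example.demo:type=DemoCounter");
        DemoCounter counter = new DemoCounter();
        server.registerMBean(counter, name);
        try {
            for (int i = 0; i < increments; i++) {
                counter.increment();
            }
            long before = (Long) server.getAttribute(name, "StatementCount");
            // A no-argument operation takes empty parameter and signature arrays.
            server.invoke(name, "reset", new Object[0], new String[0]);
            long after = (Long) server.getAttribute(name, "StatementCount");
            return new long[] {before, after};
        } finally {
            server.unregisterMBean(name);
        }
    }

    public static void main(String[] args) throws Exception {
        long[] result = countBeforeAndAfterReset(3);
        System.out.println(result[0] + " -> " + result[1]);
    }
}
```

From a console such as JConsole, an operation declared this way would typically appear as a button on the MBean's Operations tab.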
I'd love to see more use cases identified and entered as Jira tickets.