[Mondrian] per query memory limit?

Julian Hyde jhyde at pentaho.com
Thu Jul 11 17:25:04 EDT 2013


Some comments on how this might be achieved.

It's hard to figure out how many bytes a Java data structure is using, so you're right to use a data structure that is a proxy for memory usage. Cell count (for aggregations) and member count (for the dimension cache) are probably the right ones to use, although members take quite a lot more memory than cells, especially dense cells.

If by "monitor" you mean creating another thread that periodically checks the state of the query, then I disagree. Memory usage could go up very quickly, and by the time the query has been killed, the damage has been done: other users' data has been thrown out of the cache.

I'd go for a variant of #2. Keep a tally of the number of cells & members used thus far in executing the query, and abort if it crosses a threshold.
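
To make the tally idea concrete, here is a rough sketch. It is not Mondrian code: the class names QueryResourceTally and QueryLimitExceededException are made up for illustration, and the real feature would presumably throw ResourceLimitExceededException and read its limits from a property. The point is that the check runs synchronously at the moment cells or members are added, so the query aborts as soon as it crosses the threshold rather than some time later.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical exception; a stand-in for whatever the real feature would throw. */
class QueryLimitExceededException extends RuntimeException {
    QueryLimitExceededException(String message) {
        super(message);
    }
}

/** Hypothetical per-query tally of cells and members, used as a proxy for memory. */
class QueryResourceTally {
    private final long maxCells;
    private final long maxMembers;
    // Atomic counters, in case several threads load data for the same query.
    private final AtomicLong cells = new AtomicLong();
    private final AtomicLong members = new AtomicLong();

    QueryResourceTally(long maxCells, long maxMembers) {
        this.maxCells = maxCells;
        this.maxMembers = maxMembers;
    }

    /** Records n more cells for this query; aborts once the running total exceeds the limit. */
    void addCells(long n) {
        long total = cells.addAndGet(n);
        if (total > maxCells) {
            throw new QueryLimitExceededException(
                "Query used " + total + " cells; limit is " + maxCells);
        }
    }

    /** Records n more members for this query; aborts once the running total exceeds the limit. */
    void addMembers(long n) {
        long total = members.addAndGet(n);
        if (total > maxMembers) {
            throw new QueryLimitExceededException(
                "Query used " + total + " members; limit is " + maxMembers);
        }
    }

    long cellCount() {
        return cells.get();
    }

    long memberCount() {
        return members.get();
    }
}
```

One tally object would be created per query execution, and the code that loads cells or members on behalf of that query would call addCells or addMembers before handing the data to the cache. Because members cost more memory than cells, the two limits are kept separate here rather than summed.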

Note that we have something similar to this already, namely mondrian.result.limit. This feature would use a new property, but use a similar mechanism, and would also throw a ResourceLimitExceededException.

Can you please log a Jira case for this so that we can track it?

Julian


On Jul 11, 2013, at 1:49 PM, "Wright, Jeff (Truven Health)" <jeff.s.wright at truvenhealth.com> wrote:

Has anybody else ever thought about implementing some kind of per-query memory limit?

In our application, users can create ad hoc queries against a large schema with many medium-to-high cardinality dimensions. For that to work, it's important to be able to stop a query from taking over the Mondrian instance and using all memory. We've tried the memory threshold property and that's not a good solution.

I think in general there are two possible approaches:

1) Try to estimate the memory that will be required ahead of time, abort if too high.
2) Monitor some data structure that is a proxy for memory as you evaluate the query, and abort when you cross the threshold.

We've done some work with cell limits that is sort of like #1. But we find that a naive cell estimate is likely to miss some intermediate memory usage.

Any thoughts?

--Jeff Wright
