[Mondrian] per query memory limit?

Wright, Jeff (Truven Health) jeff.s.wright at truvenhealth.com
Wed Aug 7 13:52:46 EDT 2013


Here's another way to say what I was trying to say. We're counting member/cell requests to try to stop a query from driving us to a Java OutOfMemoryError. It seems odd to me that if I have a query that is 20% too big to get by the limit, it might still get by on some days if 30% of its cells happen to be cached.

No, I wasn't picturing per user limits. But I am thinking that it would be good to set limits somewhere in the range of 25-50% of available heap, to take into account concurrency. My reasoning is that the large queries that are at risk of using all heap are relatively rare. I'd like to be able to support 2 of them at once but not 10.
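To make that concrete, here is a minimal sketch of deriving a request threshold from a fraction of the JVM heap, along the lines of the 25-50% budget above. The class name, the 40% fraction, and the bytes-per-request estimate are all illustrative assumptions, not values from Mondrian:

```java
// Hypothetical sketch: size a request threshold off the JVM heap limit.
// HEAP_FRACTION and BYTES_PER_REQUEST are illustrative guesses, chosen
// so that a couple of large concurrent queries fit but ten would not.
public class RequestThreshold {
    static final double HEAP_FRACTION = 0.4;    // within the 25-50% range
    static final long BYTES_PER_REQUEST = 100;  // rough per-request estimate

    static long computeThreshold() {
        long maxHeap = Runtime.getRuntime().maxMemory();
        return (long) (maxHeap * HEAP_FRACTION) / BYTES_PER_REQUEST;
    }

    public static void main(String[] args) {
        System.out.println("request threshold: " + computeThreshold());
    }
}
```

With a fixed budget like this, supporting "2 large queries but not 10" just means picking a per-query threshold around half the budgeted heap.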

As far as how general this concern is, what I wrote in MONDRIAN-1661 is that I believe any application that allows ad hoc queries against a schema with many medium to high cardinality dimensions is vulnerable to users crashing Mondrian with OOME.

--jeff

From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On Behalf Of Julian Hyde
Sent: Wednesday, August 07, 2013 1:40 PM
To: Mondrian developer mailing list
Subject: Re: [Mondrian] per query memory limit?

On Aug 7, 2013, at 10:26 AM, "Wright, Jeff (Truven Health)" <jeff.s.wright at truvenhealth.com<mailto:jeff.s.wright at truvenhealth.com>> wrote:


Revisiting this thread... We have spent some time on code changes and testing related to counting cell and member requests at the point of loading from the DBMS. I'm trying to figure out how this interacts with caching. Looking for some reactions...

The cell requests seem to be counted only for cells that are not already cached (I expect members are the same). My thought experiment on this is that a clever or lucky user could potentially work around a cell request limit by carefully working up to the full query. For example, if my query compares 2012 data to 2011, and is rejected for too many cell requests, I could first run a query for 2011 alone, which generates fewer cell requests. But that doesn't really reduce the memory overhead of my combined query... or does it?

If you mean, could someone game the system and steal resources from other users? Yes. But unless Mondrian actually does per-user resource accounting, that's unavoidable. Per-user resource accounting would be so heavyweight that the chances are everyone would lose.


My thinking so far is that this would actually be a combined limit, something like

if ((2 * memberRequests + cellRequests) > requestThreshold) {
    throw new RequestLimitException();
}

But maybe we need to take into account the size of member and cell caches? I haven't looked to see how accessible that is.

I think a hybrid limit would work OK.  I'd replace the 2 with 10, because members take a lot more space than cells (especially when the cells are dense).
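A sketch of that hybrid check, with members weighted 10x as Julian suggests. The class and field names are illustrative, and IllegalStateException stands in for the proposed RequestLimitException, which doesn't exist in Mondrian's code base:

```java
// Sketch of the hybrid request limit with members weighted 10x, since
// members take more space than cells (especially when cells are dense).
// Names here are hypothetical; they are not Mondrian's actual classes.
public class HybridRequestLimit {
    static final int MEMBER_WEIGHT = 10;

    static void checkLimit(long memberRequests, long cellRequests,
                           long requestThreshold) {
        long weighted = MEMBER_WEIGHT * memberRequests + cellRequests;
        if (weighted > requestThreshold) {
            // Stand-in for the proposed RequestLimitException.
            throw new IllegalStateException(
                "query exceeded request limit: "
                + weighted + " > " + requestThreshold);
        }
    }

    public static void main(String[] args) {
        checkLimit(100, 5000, 10000);        // weighted 6000: allowed
        try {
            checkLimit(1000, 5000, 10000);   // weighted 15000: rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check would run where member and cell requests are counted during load from the DBMS, before the memory is actually committed.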

The member and cell caches don't have a "size". The JVM has a memory limit, that's it.

You can submit this as a patch, but I'm not yet convinced that this should be in the main code base. I've not heard of anyone else with this requirement.

Julian
