[Mondrian] Optimizing FunUtil.evaluateMembers()

Tue Jan 6 05:28:51 EST 2009

Noting that Mondrian spends time in FunUtil.evaluateMembers is like noting
that a Lisp program spends a lot of time in eval. Mondrian's evaluator is
recursive, and evaluateMembers is typically at the top of the recursion.
What matters is what is happening underneath, and that often differs
greatly from one query to the next.

The same kinds of parallelization approaches are available for Mondrian as
for a DBMS:

* intra-operator parallelism, e.g. using lots of threads to make the ORDER
function complete its sorts faster.

* inter-operator parallelism by vertical partitioning, e.g. allocating a
thread to compute a crossjoin as an iterator, and another to compute a
filter function as an iterator on top of it.

* inter-operator parallelism by horizontal partitioning, e.g. computing
crossjoin(Set1, Set2) by dividing Set1 in half and computing
union(crossjoin(Set1A, Set2), crossjoin(Set1B, Set2)).
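
To make the horizontal case concrete, here is a minimal sketch in plain
Java. It is not Mondrian code: the class and method names are hypothetical,
members are modelled as strings, and tuples as two-element lists. It splits
Set1 in half, crossjoins each half with Set2 on its own thread, and
concatenates the results in order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class PartitionedCrossJoin {
    /** Sequential crossjoin: every pair (a, b), ordered by a and then by b. */
    public static List<List<String>> crossJoin(List<String> set1, List<String> set2) {
        List<List<String>> result = new ArrayList<>(set1.size() * set2.size());
        for (String a : set1) {
            for (String b : set2) {
                result.add(List.of(a, b));
            }
        }
        return result;
    }

    /** union(crossjoin(Set1A, Set2), crossjoin(Set1B, Set2)), one task per half. */
    public static List<List<String>> parallelCrossJoin(List<String> set1, List<String> set2) {
        int mid = set1.size() / 2;
        List<String> set1A = set1.subList(0, mid);
        List<String> set1B = set1.subList(mid, set1.size());
        CompletableFuture<List<List<String>>> first =
            CompletableFuture.supplyAsync(() -> crossJoin(set1A, set2));
        CompletableFuture<List<List<String>>> second =
            CompletableFuture.supplyAsync(() -> crossJoin(set1B, set2));
        // Appending the first half's tuples before the second half's keeps
        // the same order as the sequential crossjoin.
        List<List<String>> result = new ArrayList<>(first.join());
        result.addAll(second.join());
        return result;
    }

    public static void main(String[] args) {
        List<String> years = List.of("1997", "1998", "1999", "2000");
        List<String> genders = List.of("F", "M");
        System.out.println(parallelCrossJoin(years, genders));
    }
}
```

Note that Set2 is read by both tasks. That is harmless for an immutable
list, but in a real evaluator it is exactly where the thread-safety
questions begin.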

Intra-operator parallelism is the easiest. If you can do that, do that.
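
As a rough sketch of the intra-operator case, suppose the sort keys have
already been evaluated on a single thread, so that only the sort itself
runs in parallel and the evaluator is never touched concurrently. Row and
orderDescending are hypothetical names, not Mondrian classes;
Arrays.parallelSort is the standard JDK call (Java 8 and later).

```java
import java.util.Arrays;
import java.util.Comparator;

public class ParallelOrder {
    /** Hypothetical stand-in for a member paired with its pre-evaluated sort key. */
    public static final class Row {
        public final String member;
        public final double value;

        public Row(String member, double value) {
            this.member = member;
            this.value = value;
        }
    }

    /** Returns a copy of rows sorted by value, largest first, using a parallel sort. */
    public static Row[] orderDescending(Row[] rows) {
        Row[] sorted = rows.clone();
        Comparator<Row> byValueDesc =
            Comparator.comparingDouble((Row r) -> r.value).reversed();
        // The comparator only reads immutable fields, so it is safe to call
        // from the worker threads that parallelSort uses.
        Arrays.parallelSort(sorted, byValueDesc);
        return sorted;
    }

    public static void main(String[] args) {
        Row[] rows = {
            new Row("Drink", 24597.0), new Row("Food", 191940.0),
            new Row("Non-Consumable", 50236.0)
        };
        for (Row r : orderDescending(rows)) {
            System.out.println(r.member + " " + r.value);
        }
    }
}
```

The appeal of this approach is that the parallelism stays inside one
operator; nothing else in the evaluator has to know about it.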

One complicating factor for inter-operator parallelism is that Mondrian has
three evaluation techniques: evaluating as lists, evaluating as iterators,
and evaluating as native SQL. If you're going to apply an approach that
relies on iterators, you need to make sure that all of the operators
involved have iterator implementations.
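
To illustrate why the iterator requirement matters, here is a sketch of the
vertical case: one thread streams crossjoin tuples into a bounded queue
while the calling thread filters them as they arrive. Everything here is
hypothetical plain Java, not Mondrian's API. If the producer could only
hand over a finished list, the two stages would run one after the other
and nothing would be gained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Predicate;

public class PipelinedFilter {
    // End-of-stream marker; compared by identity, never by equals().
    private static final List<String> END = new ArrayList<>();

    /** filter(crossjoin(set1, set2), condition) with the two operators on separate threads. */
    public static List<List<String>> filterCrossJoin(
            List<String> set1, List<String> set2, Predicate<List<String>> condition) {
        BlockingQueue<List<String>> queue = new ArrayBlockingQueue<>(64);

        // Producer: emits crossjoin tuples one at a time, then the END marker.
        Thread producer = new Thread(() -> {
            try {
                for (String a : set1) {
                    for (String b : set2) {
                        queue.put(List.of(a, b));
                    }
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        // Consumer: filters tuples as they arrive, on the calling thread.
        List<List<String>> result = new ArrayList<>();
        try {
            for (List<String> t = queue.take(); t != END; t = queue.take()) {
                if (condition.test(t)) {
                    result.add(t);
                }
            }
            producer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while filtering", e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(filterCrossJoin(
            List.of("1997", "1998"), List.of("F", "M"), t -> t.get(1).equals("F")));
    }
}
```

This sketch does not handle a producer that fails before sending the END
marker; a real implementation would need to propagate that error to the
consumer.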

Choose a query that is causing you grief and figure out whether it belongs
to a pattern that would benefit from parallelization in general. Then I
suggest that you transform the Calc tree to introduce a parallelism
operator. If you are doing horizontal partitioning you will probably
have to duplicate expressions, because each partition needs its own copy
of the shared sub-expression (Set2 in the crossjoin example above).
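
A toy version of such a tree transformation might look like the sketch
below. Node, SetLiteral, CrossJoin and ParallelUnion are hypothetical
stand-ins, not Mondrian's Calc classes, and tuples are flattened to strings
to keep it short. The rewrite replaces a crossjoin over a literal set with
a parallelism operator over two smaller crossjoins, and the right-hand
expression ends up referenced from both branches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class ParallelRewrite {
    /** Hypothetical expression node; stands in for a compiled calc. */
    public interface Node {
        List<String> evaluate();
    }

    public static final class SetLiteral implements Node {
        final List<String> members;
        public SetLiteral(List<String> members) { this.members = members; }
        public List<String> evaluate() { return members; }
    }

    public static final class CrossJoin implements Node {
        final Node left;
        final Node right;
        public CrossJoin(Node left, Node right) { this.left = left; this.right = right; }
        public List<String> evaluate() {
            List<String> rights = right.evaluate();
            List<String> out = new ArrayList<>();
            for (String a : left.evaluate()) {
                for (String b : rights) {
                    out.add(a + "," + b);
                }
            }
            return out;
        }
    }

    /** The introduced operator: evaluates children concurrently, concatenates in child order. */
    public static final class ParallelUnion implements Node {
        final List<Node> children;
        public ParallelUnion(List<Node> children) { this.children = children; }
        public List<String> evaluate() {
            List<CompletableFuture<List<String>>> futures = new ArrayList<>();
            for (Node child : children) {
                futures.add(CompletableFuture.supplyAsync(child::evaluate));
            }
            List<String> out = new ArrayList<>();
            for (CompletableFuture<List<String>> f : futures) {
                out.addAll(f.join());
            }
            return out;
        }
    }

    /** Rewrites CrossJoin(SetLiteral, X) into a ParallelUnion; leaves other nodes alone. */
    public static Node rewrite(Node node) {
        if (node instanceof CrossJoin && ((CrossJoin) node).left instanceof SetLiteral) {
            CrossJoin cj = (CrossJoin) node;
            List<String> m = ((SetLiteral) cj.left).members;
            if (m.size() >= 2) {
                int mid = m.size() / 2;
                // cj.right appears in both branches, so it is evaluated twice.
                return new ParallelUnion(List.of(
                    new CrossJoin(new SetLiteral(m.subList(0, mid)), cj.right),
                    new CrossJoin(new SetLiteral(m.subList(mid, m.size())), cj.right)));
            }
        }
        return node;
    }
}
```

Whether evaluating the shared branch twice is acceptable depends on how
expensive it is; caching its result once before fanning out is one way
around it.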

Julian

> -----Original Message-----
> From: mondrian-bounces at pentaho.org 
> [mailto:mondrian-bounces at pentaho.org] On Behalf Of Eric McDermid
> Sent: Monday, January 05, 2009 11:57 AM
> To: Mondrian developer mailing list
> Subject: [Mondrian] Optimizing FunUtil.evaluateMembers()
> 
> Marc and I have been looking at performance optimizations for a client
> using Mondrian 2.4 against MySQL.  Turns out that about 60% of the
> time their app spends in Mondrian code is in
> mondrian.olap.fun.FunUtil.evaluateMembers(), so we're trying to find
> ways to improve that hit.
> 
> As it turns out, a majority of the time spent in evaluateMembers() is
> actually spent at a lower level, in
> mondrian.rolap.RolapAggregationManager.getCellFromCache(), and that's
> on a secondary hit (i.e. it doesn't include SQL access time, since a
> previous execution already loaded the cache).  I don't see any
> immediately obvious bottlenecks to eliminate there, however.
> 
> Since evaluateMembers() cycles through a member iterator, one thought
> was to do those evaluations in parallel using multiple threads.  At
> first glance this seemed promising, but after digging a little deeper
> I'm concerned that this would require a major overhaul to ensure that
> all the calculators, functions in the function table, cache access,
> etcetera are all threadsafe.  Additionally, we'd likely need to ensure
> that the worker threads don't wind up running the same SQL queries
> downstream somewhere.
> 
> Has anyone else run into similar issues?  We are working on upgrading
> the app to the latest Mondrian release, but that will take a while,
> and I'm not familiar enough with the deltas between the two Mondrian
> versions to know if upgrading is likely to help with the problem we're
> seeing.
> 
>   -- Eric