[Mondrian] Visualizing expression evaluation / optimizing calculations

Julian Hyde jhyde at pentaho.com
Tue Feb 3 22:20:20 EST 2009

> Eric wrote:
> Second, and more important in the "teach a man to fish" sense, what's
> the best way to get a sense for how this sort of thing breaks down
> internally for a given query?  I don't see any debug log output that
> seems appropriate, nor have I found quite the right place to drop in a
> breakpoint (I'm still looking, it's just taking forever).

Mondrian can't do 'explain plan', but there's no reason in principle why it
couldn't. You could add a tracer that prints the calc tree after the query has
been prepared.

For this, see e.g. TestContext.compileExpression, and class CalcWriter.

> First, is it reasonable to assume that the performance gain is a  
> result of simply specifying the most restrictive subcondition in the  
> filter first?

That would be my guess. To test it, add a wrapper that extends GenericCalc
and counts the number of executes. Then add a tracer that prints the calc
tree after execution with the number of times that each calc is called.

To add those extra wrapper calcs, write a subclass of ExpCompiler.
DteCompiler does something similar already, so follow its example.
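
To make the counting idea concrete, here is a minimal, self-contained sketch.
It does not use Mondrian's classes: Node, CountingNode, leaf, and, counted,
render and trace are all hypothetical stand-ins for Calc, the GenericCalc
wrapper and the tracer, and the two conditions are invented for illustration.
In Mondrian itself the wrapping would be done by the ExpCompiler subclass
rather than by hand.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.IntPredicate;

public class CalcTraceSketch {

    /** Hypothetical stand-in for a compiled expression node (not Mondrian's Calc). */
    public interface Node {
        String name();
        List<Node> children();
        boolean evaluate(int row);
    }

    /** Leaf condition evaluated against a row number. */
    static Node leaf(final String name, final IntPredicate test) {
        return new Node() {
            public String name() { return name; }
            public List<Node> children() { return Collections.<Node>emptyList(); }
            public boolean evaluate(int row) { return test.test(row); }
        };
    }

    /** Short-circuiting AND: the right child runs only if the left is true. */
    static Node and(final Node left, final Node right) {
        return new Node() {
            public String name() { return "And"; }
            public List<Node> children() { return Arrays.asList(left, right); }
            public boolean evaluate(int row) {
                return left.evaluate(row) && right.evaluate(row);
            }
        };
    }

    /** Wrapper that counts evaluations and delegates everything else. */
    static final class CountingNode implements Node {
        final Node inner;
        int count;
        CountingNode(Node inner) { this.inner = inner; }
        public String name() { return inner.name(); }
        public List<Node> children() { return inner.children(); }
        public boolean evaluate(int row) { count++; return inner.evaluate(row); }
    }

    static CountingNode counted(Node n) { return new CountingNode(n); }

    /** Renders the tree, one node per line, indented two spaces per level. */
    static String render(Node node, int depth) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < depth; i++) sb.append("  ");
        sb.append(node.name());
        if (node instanceof CountingNode) {
            sb.append(" (calls=").append(((CountingNode) node).count).append(")");
        }
        sb.append("\n");
        for (Node child : node.children()) sb.append(render(child, depth + 1));
        return sb.toString();
    }

    /** Evaluates the filter over rows 0..99 and returns the annotated tree. */
    public static String trace(boolean restrictiveFirst) {
        Node rare = counted(leaf("row % 10 == 0", r -> r % 10 == 0));
        Node common = counted(leaf("row >= 5", r -> r >= 5));
        Node root = counted(restrictiveFirst ? and(rare, common) : and(common, rare));
        for (int row = 0; row < 100; row++) root.evaluate(row);
        return render(root, 0);
    }

    public static void main(String[] args) {
        System.out.print(trace(true));   // restrictive condition first
        System.out.print(trace(false));  // restrictive condition second
    }
}
```

With the restrictive condition first, the second condition is evaluated for
10 of the 100 rows; with the order reversed it is evaluated for 95. That is
the kind of difference the per-calc counts would make visible.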

> If so, is there a more appropriate way to help Mondrian figure out
> that certain filter conditions are more restrictive, or is order of
> conditions the only option?

The more appropriate way is to have a cost-based optimizer which is
extensible by adding rules. Something I'd love to do.

> are there any generalized rules of thumb to follow, such as ensuring
> that conditions involving aggregate measures come before conditions
> involving calculated members?

There are hundreds of possible rules for optimizing queries. I wouldn't try
to come up with a list off the top of my head; they tend to be apparent when
looking at particular examples, but the problem is figuring out whether they
are always an improvement. And of course the optimizer would be making the
decision based on imprecise statistics.

With a rule-based optimizer framework we could selectively enable rules and
figure out whether they earn their keep.
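
As a rough illustration of what "selectively enable rules" could look like,
here is a small sketch. Nothing in it exists in Mondrian: RuleToggleSketch,
Cond, the rule name "mostRestrictiveFirst", the selectivity numbers and the
cost formula are all assumptions made up for the example. The cost is the
expected number of condition evaluations per row under a short-circuit AND,
computed from estimated (and therefore imprecise) selectivities.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

public class RuleToggleSketch {

    /** A filter condition with an estimated fraction of rows that pass it. */
    public static final class Cond {
        final String name;
        final double selectivity;
        public Cond(String name, double selectivity) {
            this.name = name;
            this.selectivity = selectivity;
        }
    }

    // Rules are kept in registration order; each rewrites a plan into a new plan.
    private final Map<String, UnaryOperator<List<Cond>>> rules = new LinkedHashMap<>();
    private final Set<String> disabled = new HashSet<>();

    public void register(String name, UnaryOperator<List<Cond>> rule) {
        rules.put(name, rule);
    }

    public void setEnabled(String name, boolean on) {
        if (on) disabled.remove(name); else disabled.add(name);
    }

    /** Applies every enabled rule once, in registration order. */
    public List<Cond> optimize(List<Cond> plan) {
        List<Cond> current = plan;
        for (Map.Entry<String, UnaryOperator<List<Cond>>> e : rules.entrySet()) {
            if (!disabled.contains(e.getKey())) current = e.getValue().apply(current);
        }
        return current;
    }

    /** Expected condition evaluations per row under short-circuit AND. */
    public static double cost(List<Cond> plan) {
        double cost = 0;
        double reach = 1;  // estimated fraction of rows that get this far
        for (Cond c : plan) {
            cost += reach;
            reach *= c.selectivity;
        }
        return cost;
    }

    /** Optimizer with one illustrative rule: lowest selectivity first. */
    public static RuleToggleSketch withDefaultRules() {
        RuleToggleSketch opt = new RuleToggleSketch();
        opt.register("mostRestrictiveFirst", plan -> {
            List<Cond> sorted = new ArrayList<>(plan);
            sorted.sort(Comparator.comparingDouble((Cond c) -> c.selectivity));
            return sorted;
        });
        return opt;
    }

    public static void main(String[] args) {
        List<Cond> plan = Arrays.asList(new Cond("loose", 0.5), new Cond("tight", 0.25));
        RuleToggleSketch opt = withDefaultRules();
        System.out.println("rule on:  " + cost(opt.optimize(plan)));
        opt.setEnabled("mostRestrictiveFirst", false);
        System.out.println("rule off: " + cost(opt.optimize(plan)));
    }
}
```

Comparing the estimated cost, or better the measured execution counts, with
a rule switched on and off is one way to judge whether that rule earns its
keep on a given workload.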

