[Mondrian] The perils of the crossjoin optimizer

Wright, Jeff jeff.s.wright at truvenhealth.com
Fri Nov 6 09:48:50 EST 2015


I've read this several times and I'm just not following. I think you're asking for feedback on a property setting that controls whether Mondrian uses the crossjoin optimizer or does something else. And it sounds like you're looking for a way to make sure the slicer gets used in the cell-loading SQL.

Could you maybe take a stab at describing the two query execution plans, and how this optimizer threshold comes into play?

--jeff

From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On Behalf Of Matt Campbell
Sent: Friday, November 06, 2015 9:19 AM
To: Mondrian developer mailing list <mondrian at pentaho.org>
Subject: [Mondrian] The perils of the crossjoin optimizer

Mondrian's crossjoin optimizer acts as a fallback to native crossjoin, applying an alternative optimization strategy in cases where native evaluation is not possible or is disabled.  It works by loading cell data for the crossjoined tuples so that empty intersections can be eliminated.  Loading these cells can be an added cost, but often that's okay since the cells may have been needed anyway, if not by this query then potentially by similar queries.

There are scenarios where the cost can be excessive, however.  The MDX below loads detail rows with a grand total.  In this case, the second crossjoin (of two "Total" calculated members) cannot be natively evaluated, so the crossjoin optimizer gets a shot.  Since the WHERE slicer is also a calculated member, however, this results in an unconstrained SQL query against the fact table.  That's a big expense to reduce a tuple set that's already of size 1.

The default crossjoin optimizer threshold has a value of 0 tuples, meaning it always kicks in.  I'm not sure what the ideal default is, but it seems to me that in most cases with tiny sets the risk of expensive SQL outweighs the benefit.  A setting somewhere in the 10-100 region would eliminate many SQL queries with total/subtotal MDX like the one below.
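
To make the threshold semantics concrete, here is a rough sketch in Python.  It is not Mondrian code: the function name is hypothetical, and the strict greater-than comparison is my assumption about how the threshold is applied (it is consistent with a value of 0 meaning "always").

```python
def should_apply_crossjoin_optimizer(tuple_count: int, threshold: int,
                                     non_empty: bool = True) -> bool:
    """Hypothetical gate: run the optimizer only on NON EMPTY axes whose
    crossjoin input is strictly larger than the threshold (assumed rule)."""
    return non_empty and tuple_count > threshold


# The (Store Total, Product Total) crossjoin below yields a single tuple.
print(should_apply_crossjoin_optimizer(1, 0))      # True: default of 0 always kicks in
print(should_apply_crossjoin_optimizer(1, 50))     # False: tiny set skips the cell-loading SQL
print(should_apply_crossjoin_optimizer(5000, 50))  # True: large sets still get optimized
```

With a threshold anywhere in the 10-100 range, the size-1 total crossjoin would skip the optimizer while large detail crossjoins would still go through it.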

Thoughts?

WITH
member Store.[Store Total] as
    'Aggregate([Store].[Store State].[WA].children)'
member Product.[Product Total] as
    'Aggregate({[Product].[All Products].[Drink], [Product].[All Products].[Food]})'
member [Education Level].[Education Filter] as
    'Aggregate({[Education Level].[All Education Level].[Bachelors Degree],
       [Education Level].[All Education Level].[Graduate Degree]})'

SELECT Measures.[Unit Sales] on 0,
  NON EMPTY
     UNION(
       CrossJoin([Store].[Store State].[WA].children,
                  {[Product].[All Products].[Drink], [Product].[All Products].[Food]}),
       CrossJoin(Store.[Store Total], Product.[Product Total])) on 1
FROM Sales
WHERE
   [Education Level].[Education Filter]
