[Mondrian] Query performance issues

Matt Campbell mcampbell at pentaho.com
Fri Oct 2 13:56:12 EDT 2015


Thanks.  That’ll make a lot of people happy.

From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On Behalf Of Tom Barber
Sent: Friday, October 02, 2015 1:55 PM
To: Mondrian developer mailing list <mondrian at pentaho.org>
Subject: Re: [Mondrian] Query performance issues


Good point I'll send it over
On 2 Oct 2015 18:51, "Matt Campbell" <mcampbell at pentaho.com<mailto:mcampbell at pentaho.com>> wrote:
BTW- don’t forget to issue a pull request for the change you mentioned for MONDRIAN-2284 earlier this month (the one to support the asterisk operator for native CJ).



From: mondrian-bounces at pentaho.org<mailto:mondrian-bounces at pentaho.org> [mailto:mondrian-bounces at pentaho.org<mailto:mondrian-bounces at pentaho.org>] On Behalf Of Tom Barber
Sent: Friday, October 02, 2015 1:44 PM
To: Mondrian developer mailing list <mondrian at pentaho.org<mailto:mondrian at pentaho.org>>
Subject: Re: [Mondrian] Query performance issues


Interesting stuff thanks for the explanation Matt!

Tom
On 2 Oct 2015 18:34, "Matt Campbell" <mcampbell at pentaho.com<mailto:mcampbell at pentaho.com>> wrote:
Hi Tom,

Your second query involves CrossJoins with sets that include multiple levels of the same hierarchy, e.g. [Time].[Year] and [Time].[Quarter].  Native CrossJoin depends on members of each set being of the same level in order to form proper push-down tuple queries.  So native eval is disabled for this query and Mondrian’s stuck loading the full Cartesian product of all CJ’d sets into process.

We could potentially enhance native eval to cover cases like this, although the native eval logic is complex and it could be a big effort.

-matt


From: mondrian-bounces at pentaho.org<mailto:mondrian-bounces at pentaho.org> [mailto:mondrian-bounces at pentaho.org<mailto:mondrian-bounces at pentaho.org>] On Behalf Of Tom Barber
Sent: Friday, October 02, 2015 5:49 AM
To: Mondrian developer mailing list <mondrian at pentaho.org<mailto:mondrian at pentaho.org>>
Subject: Re: [Mondrian] Query performance issues

Sorry should probably also point out both tests were using the same mondrian build 3.6.5

Tom

On 2 October 2015 at 10:43, Tom Barber <tom at analytical-labs.com<mailto:tom at analytical-labs.com>> wrote:
Hello folks,

I had a query about MDX query performance between Saiku 2 and 3 and so we did some testing and I'd like some expert analysis if anyone has a spare 10 minutes:

SELECT
NON EMPTY {Hierarchize({[Measures].[Unit Sales]})} ON COLUMNS,
NON EMPTY Hierarchize(Union(CrossJoin([Time].[Year].Members, CrossJoin({[Store].[USA].[OR]}, CrossJoin([Customers].[Name].Members, [Product].[Product Name].Members))), Union(CrossJoin([Time].[Year].Members, CrossJoin(Filter({[Store].[USA].[OR].[Portland]}, (Exists(Ancestor([Store].CurrentMember, [Store].[Store State]), {[Store].[USA].[OR]}).Count  > 0)), CrossJoin([Customers].[Name].Members, [Product].[Product Name].Members))), Union(CrossJoin({[Time].[1997].[Q3]}, CrossJoin({[Store].[USA].[OR]}, CrossJoin([Customers].[Name].Members, [Product].[Product Name].Members))), CrossJoin({[Time].[1997].[Q3]}, CrossJoin(Filter({[Store].[USA].[OR].[Portland]}, (Exists(Ancestor([Store].CurrentMember, [Store].[Store State]), {[Store].[USA].[OR]}).Count  > 0)), CrossJoin([Customers].[Name].Members, [Product].[Product Name].Members))))))) ON ROWS
FROM [Sales]

This is the MDX created by an old Saiku 2.6 server and the query executes in about 20 seconds.

with set [~Time_Time_Year] as 'Exists({[Time].[Time].[Year].Members}, [~Time_Time_Quarter])'
  set [~Time_Time_Quarter] as '{[Time].[Time].[1997].[Q3]}'
  set [~ROWS_Time_Time] as 'Hierarchize({[~Time_Time_Year], [~Time_Time_Quarter]})'
  set [~Store_Stores_Store State] as '{[Store].[Stores].[USA].[OR]}'
  set [~Store_Stores_Store City] as 'Exists({[Store].[Stores].[USA].[OR].[Portland]}, [~Store_Stores_Store State])'
  set [~ROWS_Store_Stores] as 'Hierarchize({[~Store_Stores_Store State], [~Store_Stores_Store City]})'
  set [~ROWS_Customer_Customers] as '{[Customer].[Customers].[Name].Members}'
  set [~ROWS_Product_Products] as '{[Product].[Products].[Product Name].Members}'
select NON EMPTY {[Measures].[Unit Sales]} ON COLUMNS,
  NON EMPTY (Crossjoin(Crossjoin(Crossjoin([~ROWS_Time_Time], [~ROWS_Store_Stores]), [~ROWS_Customer_Customers]), [~ROWS_Product_Products])) ON ROWS
from [Sales]

This is the much cleaner MDX from the 3.x server, but this query timesout every time. Can someone shed any light on why the performance is so much worse? I've tried cellbatch size and stuff but none of it makes a shred of difference.

Thanks

Tom


_______________________________________________
Mondrian mailing list
Mondrian at pentaho.org<mailto:Mondrian at pentaho.org>
http://lists.pentaho.org/mailman/listinfo/mondrian

_______________________________________________
Mondrian mailing list
Mondrian at pentaho.org<mailto:Mondrian at pentaho.org>
http://lists.pentaho.org/mailman/listinfo/mondrian
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20151002/2cc56cdd/attachment-0001.html 


More information about the Mondrian mailing list