[Mondrian] MDX for histogram over Large-Member level

Julian Hyde jhyde at pentaho.com
Wed Jun 27 13:56:01 EDT 2012


This is an instance of the "MDX is SQL" anti-pattern. Calling aggregate functions such as Count to large numbers of cells is rarely the right thing to do in MDX. It usually indicates that the writer is thinking in SQL.

Can you create a measure [Measures].[Balance Range] that has values small, medium and large. Then you can slice on it, rather than filtering.

But then there's a question of when you are interested in the balance. Today? Aggregated over the last month? Aggregated over all time?

Hope this helps. I don't have time to fully solve your problem, but hopefully I gave you some pointers, and maybe someone else on the list can help.

Julian

On Jun 27, 2012, at 7:46 AM, mehdi b wrote:

Hi,

I want to calculate histogram data for a level over a measure of a cube.I wrote the following MDX:

with
member [Measures].[Range1] as
'Count(Filter([CustomerID].[CustomerID].members, [Measures].[WholeBalance]<10000))'
member [Measures].[Range2] as
'Count(Filter([CustomerID].[CustomerID].members, [Measures].[WholeBalance]>=10000 and [Measures].[WholeBalance]<100000))'
select
{[Measures].[Range1],[Measures].[Range2]} on axis(0)
from [CustomerBal]

The MDX calculates the result, but my CustomerID level has 2,150,354 members, and as I checked for every calculated member, Mondrian generates a SQL to find the members with the condition in Filter() and then count every found member in memory instead of using SQL count function. This results to long execution and bad performance for my reports.

The generated SQL for [Measures].[range1]:
SELECT "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID" AS "c0"
FROM "BIDWH"."COR_CB_CUSTOMERBAL" "COR_CB_CUSTOMERBAL"
GROUP BY "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID"
HAVING (SUM("COR_CB_CUSTOMERBAL"."MR_WHOLEBALANCE") < 10000.0)
ORDER BY "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID" ASC

The generated SQL for [Measures].[range2]:
SELECT "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID" AS "c0"
FROM "BIDWH"."COR_CB_CUSTOMERBAL" "COR_CB_CUSTOMERBAL"
GROUP BY "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID"
HAVING ((SUM("COR_CB_CUSTOMERBAL"."MR_WHOLEBALANCE") >= 10000.0)
AND (SUM("COR_CB_CUSTOMERBAL"."MR_WHOLEBALANCE")      < 100000.0))
ORDER BY "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID" ASC

But it would be better if generated SQL uses the SQL count function to directly count the members.

So what can I do?
_______________________________________________
Mondrian mailing list
Mondrian at pentaho.org<mailto:Mondrian at pentaho.org>
http://lists.pentaho.org/mailman/listinfo/mondrian

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20120627/c741212a/attachment.html 


More information about the Mondrian mailing list