[Mondrian] MDX for histogram over Large-Member level

mehdi b mehdi_xinu at yahoo.com
Sat Jun 30 02:39:25 EDT 2012


I thought you were a polite guy, but you are not! My fault was to call you "Dear"!

You are not obliged to answer me in the first place. In fact your first 
answer didn't help me. If it is hard for you, don't involve, you are not forced (I repeat it again for your attention!), and instead, really try hard to learn to be polite!!!


I have enough pride and courage to use the framework for a project of 500,000$. I have experience of using the framework for more than one year and I deployed more than 20 Cubes under it. I even change the source code and add some aggregator in RolapAggregator class to support stddev and some complex aggregation expressions!


FYI, this is my third email to Pentaho group and the first one to be answered (the second one was a bug report, and I don't care if it has been read or not)!!!


The reason that my email came to you directly is I thought when I click the reply, the Pentaho's forum email is set as receiver like other groups, but your email was set instead! So I didn't mean to send to you directly and "begging you", I'm so sorry for your viewpoints and thoughts!!!




________________________________
 From: Julian Hyde <jhyde at pentaho.com>
To: mehdi b <mehdi_xinu at yahoo.com> 
Sent: Thursday, June 28, 2012 4:03 PM
Subject: Re: [Mondrian] MDX for histogram over Large-Member level
 

I said I don't have time to help you, then you send me a personal begging email. For heavens sake. Have you no pride? Or can't you read?




On Jun 28, 2012, at 2:16 AM, mehdi b wrote:

Dear Julian,
>
>
>Thanks for your answer. The number of ranges and the start and end of each range is determined by the user, so they are not fixed. Therefore, I can't define a dimension as [Balance Range] with members like small, medium and large to apply slicing.
>
>
>About my cube, I have a table of all customers with the last balance of each customer, and I defined it in Mondrian as a cube. To calculate the histogram values over last balance of each customer, I defined the CustomerId as a dimension to apply the aforementioned MDX (following one):
>
>
>with
>member [Measures].[Range1] as 
>'Count(Filter([CustomerID].[CustomerID].members, [Measures].[WholeBalance]<10000))'
>member [Measures].[Range2] as 
>'Count(Filter([CustomerID].[CustomerID].members, [Measures].[WholeBalance]>=10000 and [Measures].[WholeBalance]<100000))'
>select
>{[Measures].[Range1],[Measures].[Range2]} on axis(0)
>from [CustomerBal]
>
>
>The XML cube definition
>
>
>    <Cube name="CustomerBal" cache="false" enabled="true" defaultMeasure="WholeBalance">
>        <Table schema="BIDWH" name="COR_CB_CUSTOMERBAL"/>
>
>        <DimensionUsage name="Currency" source="Currency" foreignKey="FD_CURRENCY"/>
>
>        <Dimension name="CustomerID">                                                                                                              
>            <Hierarchy hasAll="true" allMemberName="All">                                                                                    
>                <Level name="CustomerID" column="MS_CUSTOMERID" uniqueMembers="true" type="Integer"/>
>            </Hierarchy>                                                                                                                                           
>        </Dimension>                                                                                                                                              
>
>        <Measure name="WholeBalance" column="MR_WHOLEBALANCE" aggregator="sum"/>
>    </Cube>
>
>
>
>I would appreciate any help on this issue :)
>
>
>
>
>________________________________
> From: Julian Hyde <jhyde at pentaho.com>
>To: mehdi b <mehdi_xinu at yahoo.com>; Mondrian developer mailing list <mondrian at pentaho.org> 
>Sent: Wednesday, June 27, 2012 5:56 PM
>Subject: Re: [Mondrian] MDX for histogram over Large-Member level
> 
>
>This is an instance of the "MDX is SQL" anti-pattern. Calling aggregate functions such as Count to large numbers of cells is rarely the right thing to do in MDX. It usually indicates that the writer is thinking in SQL.
>
>
>Can you create a measure [Measures].[Balance Range] that has values small, medium and large. Then you can slice on it, rather than filtering.
>
>
>But then there's a question of when you are interested in the balance. Today? Aggregated over the last month? Aggregated over all time?
>
>
>Hope this helps. I don't have time to fully solve your problem, but hopefully I gave you some pointers, and maybe someone else on the list can help.
>
>
>Julian
>
>
>On Jun 27, 2012, at 7:46 AM, mehdi b wrote:
>
>Hi,
>>
>>
>>I want to calculate histogram data for a level over a measure of a cube.I wrote the following MDX:
>>
>>
>>with
>>member [Measures].[Range1] as 
>>'Count(Filter([CustomerID].[CustomerID].members, [Measures].[WholeBalance]<10000))'
>>member [Measures].[Range2] as 
>>'Count(Filter([CustomerID].[CustomerID].members, [Measures].[WholeBalance]>=10000 and [Measures].[WholeBalance]<100000))'
>>select
>>{[Measures].[Range1],[Measures].[Range2]} on axis(0)
>>from [CustomerBal]
>>
>>
>>
>>The MDX calculates the result, but my CustomerID level has 2,150,354 members, and as I checked for every calculated member, Mondrian generates a SQL to find the members with the condition in Filter() and then count every found member in memory instead of using SQL count function. This results to long execution and bad performance for my reports.
>>
>>
>>The generated SQL for [Measures].[range1]:
>>SELECT "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID" AS "c0"
>>FROM "BIDWH"."COR_CB_CUSTOMERBAL" "COR_CB_CUSTOMERBAL"
>>GROUP BY "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID"
>>HAVING (SUM("COR_CB_CUSTOMERBAL"."MR_WHOLEBALANCE") < 10000.0)
>>ORDER BY "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID" ASC
>>
>>
>>The generated SQL for [Measures].[range2]:
>>SELECT "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID" AS "c0"
>>FROM "BIDWH"."COR_CB_CUSTOMERBAL" "COR_CB_CUSTOMERBAL"
>>GROUP BY "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID"
>>HAVING
 ((SUM("COR_CB_CUSTOMERBAL"."MR_WHOLEBALANCE") >= 10000.0)
>>AND (SUM("COR_CB_CUSTOMERBAL"."MR_WHOLEBALANCE")      < 100000.0))
>>ORDER BY "COR_CB_CUSTOMERBAL"."MS_CUSTOMERID" ASC
>>
>>
>>But it would be better if generated SQL uses the SQL count function to directly count the members. 
>>
>>
>>
>>So what can I do?
>>_______________________________________________
>>Mondrian mailing list
>>Mondrian at pentaho.org
>>http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20120629/a712b8a4/attachment.html 


More information about the Mondrian mailing list