[Mondrian] Help with sort of large dimension and calculated members

Matt Campbell mcampbell at pentaho.com
Tue May 5 16:49:59 EDT 2015


Typically when I see these sorts of questions turn up on the forum I recommend setting the cellBatchSize to a large number just to diagnose the problem.  I.e. set mondrian.rolap.cellBatchSize=2000000000 in mondrian.properties.  Setting the value so high is *not safe* for production usage, but it can be useful for troubleshooting.  If it does reduce the 100s of queries down a more reasonable number then it confirms that the query is creating a large number of cell request.  The default cellBatchSize is 100K, which is adequate for many workloads.

Assuming that using a large value does in fact allow the MDX to run more reasonably then the next question is why so many batch requests are being made.  Occasionally the problem is that the query is failing native evaluation, so the tuple set is not reduced and Mondrian ends up evaluating a Cartesian product.  Some times calculations are complex and need to be rethought.  Other times the non empty crossjoin is really just enormous.  If the problem is the latter than you need to adjust the cellBatchSize up from its default of 100K to a number that meets the needs of your workload without opening you up to excessive resource demand.  



-----Original Message-----
From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On Behalf Of Tom Barber
Sent: Tuesday, May 05, 2015 4:38 PM
To: Mondrian developer mailing list
Subject: Re: [Mondrian] Help with sort of large dimension and calculated members

Alright chaps,
This question was also asked on the Saiku list and I punted him over here, because this is a recurring issue that I saw which got me to swap from M3 to M4, and various other people have been talking about recently with no real way to track it down or reproduce it, but this chunking stuff has affected various people but as everyone's database/queries are different its hard to reproduce.

Tom


On Tue, May 05, 2015 at 10:31:02PM +0200, Roland Bouman wrote:
>Just curious...can you share your MySQL schema + data? I'm intrigued by 
>your query and would like to run it myself. Perhaps some optimization 
>on the MySQL end of things is possible as well...
>
>On Tue, May 5, 2015 at 10:26 PM, Joshua Diamond 
><diamond.joshh at gmail.com>
>wrote:
>
>>
>> Hi,
>>
>> Thanks for letting me join this list. I joined because I'm having a 
>> problem that no amount of research has been able to solve.
>> Here's my specs:
>> Analyzer: Saiku 2.6 CE
>> mondrian-3.8.0.0-209.jar
>> database: MySql
>> Attached is my schema.
>> Attached are my mondrian properties.
>>
>> my fact table is roughly 500,000 rows.
>> my largest dimension table is my CandidatesDim dimension is 50,000 rows.
>>
>> When ever I try to run a query like the following, mondrian generates 
>> hundreds of sql querries and even if I set my query timeout property 
>> to something ridiculous, it never finishes.
>> SELECT
>> NON EMPTY Hierarchize(Union(CrossJoin({[OfficeDim.OfficeHier].[U.S.
>> President]}, CrossJoin([PartyDim.PartyHier].[Party].Members,
>> {[Measures].[VotesDisplay]})), CrossJoin({[OfficeDim.OfficeHier].[U.S.
>> President]}, CrossJoin([PartyDim.PartyHier].[Party].Members,
>> {[Measures].[Percentage]})))) ON COLUMNS, NON EMPTY 
>> CrossJoin([GeoDim.GeoHier].[State].Members,
>> [CandidatesDim.CandidatesHier].[Candidate].Members) ON ROWS FROM 
>> [ElectionResults] WHERE {[CycleDim.CycleHier].[General].[2012 
>> General]}
>>
>> The problem seems to happen whenever I have more than one dimension 
>> on an axis and more than one measure.
>>
>> If I run this query it finishes in 3 seconds:
>> SELECT
>> NON EMPTY CrossJoin({[OfficeDim.OfficeHier].[U.S. President]}, 
>> CrossJoin([PartyDim.PartyHier].[Party].Members, 
>> {[Measures].[Percentage]})) ON COLUMNS, NON EMPTY 
>> CrossJoin([GeoDim.GeoHier].[State].Members,
>> [CandidatesDim.CandidatesHier].[Candidate].Members) ON ROWS FROM 
>> [ElectionResults] WHERE {[CycleDim.CycleHier].[General].[2012 
>> General]}
>>
>> I've tried almost every combination of mondrian properties. I've 
>> tried setting saiku.olap.convert.query = true. I've created aggregate 
>> tables that never got used.
>>
>> What am I doing wrong? Is this just a mondrian 3.x issue that I'm 
>> forced to live with? I'm open to any thoughts you all might have.
>>
>> Thank you so much for your help.
>>
>> Josh
>>
>> --
>> Joshua Diamond
>> e. Diamond.JoshH at gmail.com
>> c.: (561)676-3275
>>
>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>
>
>--
>Roland Bouman
>blog: http://rpbouman.blogspot.com/
>twitter: @rolandbouman
>linkedin: http://www.linkedin.com/profile/view?id=5142800&trk=tab_pro
>
>Author of "Pentaho Solutions" (Wiley, ISBN: 978-0-470-48432-6
>http://tinyurl.com/lvxa88) and "Pentaho Kettle Solutions" (Wiley, ISBN:
>978-0-470-63517-9 http://tinyurl.com/33r7a8m)

>_______________________________________________
>Mondrian mailing list
>Mondrian at pentaho.org
>http://lists.pentaho.org/mailman/listinfo/mondrian

_______________________________________________
Mondrian mailing list
Mondrian at pentaho.org
http://lists.pentaho.org/mailman/listinfo/mondrian


More information about the Mondrian mailing list