[Mondrian] Woefully slow query

Matt Campbell mcampbell at pentaho.com
Tue Feb 4 08:45:19 EST 2014


Hi Tom,
I've been attempting to reproduce what you're seeing.  I have no problem
producing the many "terrible queries" by constructing queries that
exceed cell batch size.  After increasing the batch size, though, I
haven't been able to produce the indefinite hang that you are seeing.

A few questions:
1)  What is optimizePredicates set to?  (default is true)
2)  What is your maxConstraints value?  (default is 1K)
3)  What's your heap size?  (I'm curious if the 10+ minute hang is due
to memory thrashing).

Thanks!
Matt

On 02/03/2014 07:21 AM, Tom Barber(Alabs) wrote:
> Alrighty then,
>
> I've checked the cell batch size.
>
> Whilst setting it up at 2M stops the terrible queries, the rendering 
> time on Mondrian 4 is 12 seconds, and Mondrian 3 never rendered.
>
> Setting it at 1M adds all the customer numbers into the where and 
> performance continues to be awful (its been running for the last 10 
> minutes with no result).
>
> I've seen this before in a similar degenerate setup on Mondrian 3, at 
> the time I assumed it was just the server sucking, clearly I was wrong.
>
> Anyway, we have people on the Saiku IRC channel talking about the 
> performance going backwards all the time. Something went wrong in the 
> last 12 months on Mondrian, but I'm happy Mondrian 4 is showing such a 
> great improvement and my cube required no migration, so I'll continue 
> using that.
>
> The problem with performance degredation is being able to demonstrate it 
> in a way that developers can understand and figure out a fix against, 
> and being able to test against it going forward it seems.
>
> Tom
>
> On 03/02/14 11:52, Matt Campbell wrote:
>> That sounds like a cell batch issue to me, too, but it's curious that Mondrian 4 doesn't have the same performance problem with that query.  Default batch size is 100K for both Mondrian 3 and 4.
>>
>>
>> ________________________________________
>> From: mondrian-bounces at pentaho.org [mondrian-bounces at pentaho.org] On Behalf Of Luc Boudreau [lucboudreau at gmail.com]
>> Sent: Sunday, February 02, 2014 9:26 PM
>> To: Mondrian developer mailing list
>> Subject: Re: [Mondrian] Woefully slow query
>>
>> I think Andy is right.
>>
>> Luc
>>
>> On Feb 2, 2014 5:51 PM, "Andy Grohe" <agrohe21 at gmail.com<mailto:agrohe21 at gmail.com>> wrote:
>>
>> If I recall correctly, this had/has to do with the cell batch size property.
>>
>> On Feb 2, 2014 4:48 PM, "Julian Hyde" <jhyde at pentaho.com<mailto:jhyde at pentaho.com>> wrote:
>> Well, I guess I'm pleased that things have gotten a lot better in mondrian-4! I think there were a couple of bugs about generating lots of SQL statements. I don't recall which they were, or when they got fixed.
>>
>> Julian
>>
>>
>> On Jan 31, 2014, at 11:27 AM, Tom Barber(Alabs) <tom at analytical-labs.com<mailto:tom at analytical-labs.com>> wrote:
>>
>>> Ok here is the issue:
>>>
>>> SELECT NON EMPTY {Hierarchize({{[Invoice Date].[Invoice
>>> Date].[Year].Members}, {[Invoice Date].[Invoice
>>> Date].[Month].Members}})} ON COLUMNS,
>>> NON EMPTY {Hierarchize({[Customer].[Customer].[Customer].Members})} ON ROWS
>>> FROM [Aging]
>>>
>>> Using the following schema and ddl and ~ 35k rows on Postgres.
>>>
>>> On  mondrian Trunk it does this for literally hours:
>>>
>>> 2014-01-31 16:54:16,950 DEBUG [mondrian.sql] 228: Segment.load:
>>> executing sql [select "aging_link"."Country Code" as "c0",
>>> "aging_link"."Customer Number" as "c1", "datelookup_1"."calendaryear" as
>>> "c2", sum("aging_link"."Invoice Amount") as "m0" from
>>> "public"."aging_link" as "aging_link", "public"."datelookup" as
>>> "datelookup_1" where "aging_link"."Country Code" = 'Somewhere' and
>>> "aging_link"."Customer Number" in ('272483545', '272483941',
>>> '272484027', '272484605', '272485439', '272485822', '272485950',
>>> '272486276', '272486317', '272486544', '272486867', '272487956',
>>> '272488313', '272488876', '272488959', '272489022', '272490318',
>>> '272490404', '272490420', '272491126', '272491270', '272491449',
>>> '272491452', '272491861', '272492033', '272492541', '272493292',
>>> '272493672', '272493870', '272494071', '272494899', '272496598',
>>> '272497054', '272497249', '272497492', '272497588', '272497997',
>>> '272498169', '272498718', '272499261', '272499571', '272500138',
>>> '272501706', '272501889', '272502530', '272503207', '272503278',
>>> '272504776', '272504789', '272505571', '272506037', '272506107',
>>> '272507126', '272507407', '272509487', '272510830', '272512951',
>>> '272513194', '272513219', '272513222', '272513251', '272513404',
>>> '272513590', '272514423', '272514915', '272515554', '272516825',
>>> '272516979', '272517000', '272517422', '272519192', '272519332',
>>> '272520123', '272521296', '272521308', '272521692', '272522187') and
>>> "aging_link"."Due Date" = "datelookup_1"."daydate" group by
>>> "aging_link"."Country Code", "aging_link"."Customer Number",
>>> "datelookup_1"."calendaryear"]
>>>
>>> Running 100's of queries just iterating through the whole fact table.
>>>
>>> If I run the identical schema on Mondrian 4 it runs a grand total of 10
>>> queries and the whole thing finishes in 12 seconds (including rendering
>>> 12500 rows in Saiku)
>>>
>>> 2014-01-31 19:26:24,327 DEBUG [mondrian.sql] 125:
>>> SqlTupleReader.readTuples [[Invoice Date].[Invoice Date].[Year]]:
>>> executing sql [select "datelookup"."calendaryear" as "c0" from
>>> "aging_link" as "aging_link", "datelookup" as "datelookup" where
>>> "aging_link"."Invoice Date" = "datelookup"."daydate" group by
>>> "datelookup"."calendaryear" order by "datelookup"."calendaryear" ASC
>>> NULLS LAST]
>>> 2014-01-31 19:26:24,434 DEBUG [mondrian.sql] 125: , exec 106 ms
>>> 2014-01-31 19:26:24,435 DEBUG [mondrian.sql] 125: , exec+fetch 108 ms, 3
>>> rows
>>> 2014-01-31 19:26:24,435 DEBUG [mondrian.sql] 126:
>>> SqlTupleReader.readTuples [[Invoice Date].[Invoice Date].[Month]]:
>>> executing sql [select "datelookup"."calendaryear" as "c0",
>>> "datelookup"."calendarmonth" as "c1", "datelookup"."monthname" as "c2"
>>> from "aging_link" as "aging_link", "datelookup" as "datelookup" where
>>> "aging_link"."Invoice Date" = "datelookup"."daydate" group by
>>> "datelookup"."calendaryear", "datelookup"."calendarmonth",
>>> "datelookup"."monthname" order by "datelookup"."calendaryear" ASC NULLS
>>> LAST, "datelookup"."calendarmonth" ASC NULLS LAST]
>>> 2014-01-31 19:26:24,520 DEBUG [mondrian.sql] 126: , exec 85 ms
>>> 2014-01-31 19:26:24,521 DEBUG [mondrian.sql] 126: , exec+fetch 86 ms, 31
>>> rows
>>> 2014-01-31 19:26:24,522 DEBUG [mondrian.sql] 127:
>>> SqlTupleReader.readTuples [[Customer].[Customer].[Customer]]: executing
>>> sql [select "aging_link"."Country Code" as "c0", "aging_link"."Customer
>>> Number" as "c1", "aging_link"."Customer Name" as "c2" from "aging_link"
>>> as "aging_link" group by "aging_link"."Country Code",
>>> "aging_link"."Customer Number", "aging_link"."Customer Name" order by
>>> "aging_link"."Country Code" ASC NULLS LAST, "aging_link"."Customer
>>> Number" ASC NULLS LAST]
>>> 2014-01-31 19:26:25,254 DEBUG [mondrian.sql] 127: , exec 732 ms
>>> 2014-01-31 19:26:25,374 DEBUG [mondrian.sql] 127: , exec+fetch 852 ms,
>>> 12382 rows
>>> 2014-01-31 19:26:25,669 DEBUG [mondrian.sql] 128:
>>> SqlStatisticsProvider.getColumnCardinality: executing sql [select
>>> count(distinct "Country Code") from "aging_link"]
>>> 2014-01-31 19:26:25,944 DEBUG [mondrian.sql] 128: , exec 274 ms
>>> 2014-01-31 19:26:25,944 DEBUG [mondrian.sql] 128: , exec+fetch 275 ms, 1
>>> rows
>>> 2014-01-31 19:26:25,944 DEBUG [mondrian.sql] 129:
>>> SqlStatisticsProvider.getColumnCardinality: executing sql [select
>>> count(distinct "calendaryear") from "datelookup"]
>>> 2014-01-31 19:26:25,966 DEBUG [mondrian.sql] 129: , exec 21 ms
>>> 2014-01-31 19:26:25,966 DEBUG [mondrian.sql] 129: , exec+fetch 22 ms, 1 rows
>>> 2014-01-31 19:26:25,966 DEBUG [mondrian.sql] 130:
>>> SqlStatisticsProvider.getColumnCardinality: executing sql [select
>>> count(distinct "Customer Number") from "aging_link"]
>>> 2014-01-31 19:26:26,177 DEBUG [mondrian.sql] 130: , exec 210 ms
>>> 2014-01-31 19:26:26,177 DEBUG [mondrian.sql] 130: , exec+fetch 211 ms, 1
>>> rows
>>> 2014-01-31 19:26:26,178 DEBUG [mondrian.sql] 131: Segment.load:
>>> executing sql [select "aging_link"."Country Code" as "c0",
>>> "aging_link"."Customer Number" as "c1", "datelookup"."calendaryear" as
>>> "c2", sum("aging_link"."Invoice Amount") as "m0" from "aging_link" as
>>> "aging_link", "datelookup" as "datelookup" where "aging_link"."Country
>>> Code" in ('BELGIUM', 'Belgium') and "datelookup"."calendaryear" in
>>> (2011, 2012, 2013) and "aging_link"."Invoice Date" =
>>> "datelookup"."daydate" group by "aging_link"."Country Code",
>>> "aging_link"."Customer Number", "datelookup"."calendaryear"]
>>> 2014-01-31 19:26:26,186 DEBUG [mondrian.sql] 132:
>>> SqlStatisticsProvider.getColumnCardinality: executing sql [select
>>> count(distinct "calendarmonth") from "datelookup"]
>>> 2014-01-31 19:26:26,221 DEBUG [mondrian.sql] 132: , exec 34 ms
>>> 2014-01-31 19:26:26,221 DEBUG [mondrian.sql] 132: , exec+fetch 35 ms, 1 rows
>>> 2014-01-31 19:26:26,222 DEBUG [mondrian.sql] 133: Segment.load:
>>> executing sql [select "aging_link"."Country Code" as "c0",
>>> "aging_link"."Customer Number" as "c1", "datelookup"."calendaryear" as
>>> "c2", "datelookup"."calendarmonth" as "c3", sum("aging_link"."Invoice
>>> Amount") as "m0" from "aging_link" as "aging_link", "datelookup" as
>>> "datelookup" where "aging_link"."Country Code" in ('BELGIUM', 'Belgium')
>>> and "datelookup"."calendaryear" in (2011, 2012, 2013) and
>>> "aging_link"."Invoice Date" = "datelookup"."daydate" group by
>>> "aging_link"."Country Code", "aging_link"."Customer Number",
>>> "datelookup"."calendaryear", "datelookup"."calendarmonth"]
>>> 2014-01-31 19:26:26,389 DEBUG [mondrian.sql] 131: , exec 210 ms
>>> 2014-01-31 19:26:26,434 DEBUG [mondrian.sql] 131: , exec+fetch 256 ms,
>>> 12447 rows
>>> 2014-01-31 19:26:26,485 DEBUG [mondrian.sql] 133: , exec 261 ms
>>> 2014-01-31 19:26:26,542 DEBUG [mondrian.sql] 133: , exec+fetch 319 ms,
>>> 18685 rows
>>> 2014-01-31 19:26:27,261 DEBUG [mondrian.sql] 134: Segment.load:
>>> executing sql [select "aging_link"."Country Code" as "c0",
>>> "aging_link"."Customer Number" as "c1", "datelookup"."calendaryear" as
>>> "c2", sum("aging_link"."Invoice Amount") as "m0" from "aging_link" as
>>> "aging_link", "datelookup" as "datelookup" where "aging_link"."Country
>>> Code" in ('Ireland', 'Sweden') and "aging_link"."Customer Number" in
>>> ('953724073', '953724440', '960814035') and "datelookup"."calendaryear"
>>> in (2011, 2012, 2013) and "aging_link"."Invoice Date" =
>>> "datelookup"."daydate" group by "aging_link"."Country Code",
>>> "aging_link"."Customer Number", "datelookup"."calendaryear"]
>>> 2014-01-31 19:26:27,261 DEBUG [mondrian.sql] 135: Segment.load:
>>> executing sql [select "aging_link"."Country Code" as "c0",
>>> "aging_link"."Customer Number" as "c1", "datelookup"."calendaryear" as
>>> "c2", "datelookup"."calendarmonth" as "c3", sum("aging_link"."Invoice
>>> Amount") as "m0" from "aging_link" as "aging_link", "datelookup" as
>>> "datelookup" where "aging_link"."Country Code" in ('Ireland', 'Sweden')
>>> and "aging_link"."Customer Number" in ('953724073', '953724440',
>>> '960814035') and "datelookup"."calendaryear" in (2011, 2012, 2013) and
>>> "aging_link"."Invoice Date" = "datelookup"."daydate" group by
>>> "aging_link"."Country Code", "aging_link"."Customer Number",
>>> "datelookup"."calendaryear", "datelookup"."calendarmonth"]
>>> 2014-01-31 19:26:27,265 DEBUG [mondrian.sql] 135: , exec 3 ms
>>> 2014-01-31 19:26:27,266 DEBUG [mondrian.sql] 135: , exec+fetch 5 ms, 5 rows
>>> 2014-01-31 19:26:27,266 DEBUG [mondrian.sql] 134: , exec 4 ms
>>> 2014-01-31 19:26:27,267 DEBUG [mondrian.sql] 134: , exec+fetch 6 ms, 3 rows
>>>
>>>
>>> The above is the entire query that is pushed to the database, as you can
>>> see, much quicker and cleaner.
>>>
>>> Anyway, there you go.
>>>
>>> Tom
>>> <schema.xml><ddl.txt>_______________________________________________
>>> Mondrian mailing list
>>> Mondrian at pentaho.org<mailto:Mondrian at pentaho.org>
>>> http://lists.pentaho.org/mailman/listinfo/mondrian
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org<mailto:Mondrian at pentaho.org>
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org<mailto:Mondrian at pentaho.org>
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
>



More information about the Mondrian mailing list