[Mondrian] Woefully slow query

Tom Barber(Alabs) tom at analytical-labs.com
Mon Feb 3 07:18:35 EST 2014


Alrighty then,

I've checked the cell batch size.

Whilst setting it up at 2M stops the terrible queries, the rendering 
time on Mondrian 4 is 12 seconds, and Mondrian 3 never rendered.

Setting it at 1M adds all the customer numbers into the where and 
performance continues to be awful (its been running for the last 10 
minutes with no result).

I've seen this before in a similar degenerate setup on Mondrian 3, at 
the time I assumed it was just the server sucking, clearly I was wrong.

Anyway, we have people on the Saiku IRC channel talking about the 
performance going backwards all the time. Something went wrong in the 
last 12 months on Mondrian, but I'm happy Mondrian 4 is showing such a 
great improvement and my cube required no migration, so I'll continue 
using that.

The problem with performance degredation is being able to demonstrate it 
in a way that developers can understand and figure out a fix against, 
and being able to test against it going forward it seems.

Tom

On 03/02/14 11:52, Matt Campbell wrote:
> That sounds like a cell batch issue to me, too, but it's curious that Mondrian 4 doesn't have the same performance problem with that query.  Default batch size is 100K for both Mondrian 3 and 4.
>
>
> ________________________________________
> From: mondrian-bounces at pentaho.org [mondrian-bounces at pentaho.org] On Behalf Of Luc Boudreau [lucboudreau at gmail.com]
> Sent: Sunday, February 02, 2014 9:26 PM
> To: Mondrian developer mailing list
> Subject: Re: [Mondrian] Woefully slow query
>
> I think Andy is right.
>
> Luc
>
> On Feb 2, 2014 5:51 PM, "Andy Grohe" <agrohe21 at gmail.com<mailto:agrohe21 at gmail.com>> wrote:
>
> If I recall correctly, this had/has to do with the cell batch size property.
>
> On Feb 2, 2014 4:48 PM, "Julian Hyde" <jhyde at pentaho.com<mailto:jhyde at pentaho.com>> wrote:
> Well, I guess I'm pleased that things have gotten a lot better in mondrian-4! I think there were a couple of bugs about generating lots of SQL statements. I don't recall which they were, or when they got fixed.
>
> Julian
>
>
> On Jan 31, 2014, at 11:27 AM, Tom Barber(Alabs) <tom at analytical-labs.com<mailto:tom at analytical-labs.com>> wrote:
>
>> Ok here is the issue:
>>
>> SELECT NON EMPTY {Hierarchize({{[Invoice Date].[Invoice
>> Date].[Year].Members}, {[Invoice Date].[Invoice
>> Date].[Month].Members}})} ON COLUMNS,
>> NON EMPTY {Hierarchize({[Customer].[Customer].[Customer].Members})} ON ROWS
>> FROM [Aging]
>>
>> Using the following schema and ddl and ~ 35k rows on Postgres.
>>
>> On  mondrian Trunk it does this for literally hours:
>>
>> 2014-01-31 16:54:16,950 DEBUG [mondrian.sql] 228: Segment.load:
>> executing sql [select "aging_link"."Country Code" as "c0",
>> "aging_link"."Customer Number" as "c1", "datelookup_1"."calendaryear" as
>> "c2", sum("aging_link"."Invoice Amount") as "m0" from
>> "public"."aging_link" as "aging_link", "public"."datelookup" as
>> "datelookup_1" where "aging_link"."Country Code" = 'Somewhere' and
>> "aging_link"."Customer Number" in ('272483545', '272483941',
>> '272484027', '272484605', '272485439', '272485822', '272485950',
>> '272486276', '272486317', '272486544', '272486867', '272487956',
>> '272488313', '272488876', '272488959', '272489022', '272490318',
>> '272490404', '272490420', '272491126', '272491270', '272491449',
>> '272491452', '272491861', '272492033', '272492541', '272493292',
>> '272493672', '272493870', '272494071', '272494899', '272496598',
>> '272497054', '272497249', '272497492', '272497588', '272497997',
>> '272498169', '272498718', '272499261', '272499571', '272500138',
>> '272501706', '272501889', '272502530', '272503207', '272503278',
>> '272504776', '272504789', '272505571', '272506037', '272506107',
>> '272507126', '272507407', '272509487', '272510830', '272512951',
>> '272513194', '272513219', '272513222', '272513251', '272513404',
>> '272513590', '272514423', '272514915', '272515554', '272516825',
>> '272516979', '272517000', '272517422', '272519192', '272519332',
>> '272520123', '272521296', '272521308', '272521692', '272522187') and
>> "aging_link"."Due Date" = "datelookup_1"."daydate" group by
>> "aging_link"."Country Code", "aging_link"."Customer Number",
>> "datelookup_1"."calendaryear"]
>>
>> Running 100's of queries just iterating through the whole fact table.
>>
>> If I run the identical schema on Mondrian 4 it runs a grand total of 10
>> queries and the whole thing finishes in 12 seconds (including rendering
>> 12500 rows in Saiku)
>>
>> 2014-01-31 19:26:24,327 DEBUG [mondrian.sql] 125:
>> SqlTupleReader.readTuples [[Invoice Date].[Invoice Date].[Year]]:
>> executing sql [select "datelookup"."calendaryear" as "c0" from
>> "aging_link" as "aging_link", "datelookup" as "datelookup" where
>> "aging_link"."Invoice Date" = "datelookup"."daydate" group by
>> "datelookup"."calendaryear" order by "datelookup"."calendaryear" ASC
>> NULLS LAST]
>> 2014-01-31 19:26:24,434 DEBUG [mondrian.sql] 125: , exec 106 ms
>> 2014-01-31 19:26:24,435 DEBUG [mondrian.sql] 125: , exec+fetch 108 ms, 3
>> rows
>> 2014-01-31 19:26:24,435 DEBUG [mondrian.sql] 126:
>> SqlTupleReader.readTuples [[Invoice Date].[Invoice Date].[Month]]:
>> executing sql [select "datelookup"."calendaryear" as "c0",
>> "datelookup"."calendarmonth" as "c1", "datelookup"."monthname" as "c2"
>> from "aging_link" as "aging_link", "datelookup" as "datelookup" where
>> "aging_link"."Invoice Date" = "datelookup"."daydate" group by
>> "datelookup"."calendaryear", "datelookup"."calendarmonth",
>> "datelookup"."monthname" order by "datelookup"."calendaryear" ASC NULLS
>> LAST, "datelookup"."calendarmonth" ASC NULLS LAST]
>> 2014-01-31 19:26:24,520 DEBUG [mondrian.sql] 126: , exec 85 ms
>> 2014-01-31 19:26:24,521 DEBUG [mondrian.sql] 126: , exec+fetch 86 ms, 31
>> rows
>> 2014-01-31 19:26:24,522 DEBUG [mondrian.sql] 127:
>> SqlTupleReader.readTuples [[Customer].[Customer].[Customer]]: executing
>> sql [select "aging_link"."Country Code" as "c0", "aging_link"."Customer
>> Number" as "c1", "aging_link"."Customer Name" as "c2" from "aging_link"
>> as "aging_link" group by "aging_link"."Country Code",
>> "aging_link"."Customer Number", "aging_link"."Customer Name" order by
>> "aging_link"."Country Code" ASC NULLS LAST, "aging_link"."Customer
>> Number" ASC NULLS LAST]
>> 2014-01-31 19:26:25,254 DEBUG [mondrian.sql] 127: , exec 732 ms
>> 2014-01-31 19:26:25,374 DEBUG [mondrian.sql] 127: , exec+fetch 852 ms,
>> 12382 rows
>> 2014-01-31 19:26:25,669 DEBUG [mondrian.sql] 128:
>> SqlStatisticsProvider.getColumnCardinality: executing sql [select
>> count(distinct "Country Code") from "aging_link"]
>> 2014-01-31 19:26:25,944 DEBUG [mondrian.sql] 128: , exec 274 ms
>> 2014-01-31 19:26:25,944 DEBUG [mondrian.sql] 128: , exec+fetch 275 ms, 1
>> rows
>> 2014-01-31 19:26:25,944 DEBUG [mondrian.sql] 129:
>> SqlStatisticsProvider.getColumnCardinality: executing sql [select
>> count(distinct "calendaryear") from "datelookup"]
>> 2014-01-31 19:26:25,966 DEBUG [mondrian.sql] 129: , exec 21 ms
>> 2014-01-31 19:26:25,966 DEBUG [mondrian.sql] 129: , exec+fetch 22 ms, 1 rows
>> 2014-01-31 19:26:25,966 DEBUG [mondrian.sql] 130:
>> SqlStatisticsProvider.getColumnCardinality: executing sql [select
>> count(distinct "Customer Number") from "aging_link"]
>> 2014-01-31 19:26:26,177 DEBUG [mondrian.sql] 130: , exec 210 ms
>> 2014-01-31 19:26:26,177 DEBUG [mondrian.sql] 130: , exec+fetch 211 ms, 1
>> rows
>> 2014-01-31 19:26:26,178 DEBUG [mondrian.sql] 131: Segment.load:
>> executing sql [select "aging_link"."Country Code" as "c0",
>> "aging_link"."Customer Number" as "c1", "datelookup"."calendaryear" as
>> "c2", sum("aging_link"."Invoice Amount") as "m0" from "aging_link" as
>> "aging_link", "datelookup" as "datelookup" where "aging_link"."Country
>> Code" in ('BELGIUM', 'Belgium') and "datelookup"."calendaryear" in
>> (2011, 2012, 2013) and "aging_link"."Invoice Date" =
>> "datelookup"."daydate" group by "aging_link"."Country Code",
>> "aging_link"."Customer Number", "datelookup"."calendaryear"]
>> 2014-01-31 19:26:26,186 DEBUG [mondrian.sql] 132:
>> SqlStatisticsProvider.getColumnCardinality: executing sql [select
>> count(distinct "calendarmonth") from "datelookup"]
>> 2014-01-31 19:26:26,221 DEBUG [mondrian.sql] 132: , exec 34 ms
>> 2014-01-31 19:26:26,221 DEBUG [mondrian.sql] 132: , exec+fetch 35 ms, 1 rows
>> 2014-01-31 19:26:26,222 DEBUG [mondrian.sql] 133: Segment.load:
>> executing sql [select "aging_link"."Country Code" as "c0",
>> "aging_link"."Customer Number" as "c1", "datelookup"."calendaryear" as
>> "c2", "datelookup"."calendarmonth" as "c3", sum("aging_link"."Invoice
>> Amount") as "m0" from "aging_link" as "aging_link", "datelookup" as
>> "datelookup" where "aging_link"."Country Code" in ('BELGIUM', 'Belgium')
>> and "datelookup"."calendaryear" in (2011, 2012, 2013) and
>> "aging_link"."Invoice Date" = "datelookup"."daydate" group by
>> "aging_link"."Country Code", "aging_link"."Customer Number",
>> "datelookup"."calendaryear", "datelookup"."calendarmonth"]
>> 2014-01-31 19:26:26,389 DEBUG [mondrian.sql] 131: , exec 210 ms
>> 2014-01-31 19:26:26,434 DEBUG [mondrian.sql] 131: , exec+fetch 256 ms,
>> 12447 rows
>> 2014-01-31 19:26:26,485 DEBUG [mondrian.sql] 133: , exec 261 ms
>> 2014-01-31 19:26:26,542 DEBUG [mondrian.sql] 133: , exec+fetch 319 ms,
>> 18685 rows
>> 2014-01-31 19:26:27,261 DEBUG [mondrian.sql] 134: Segment.load:
>> executing sql [select "aging_link"."Country Code" as "c0",
>> "aging_link"."Customer Number" as "c1", "datelookup"."calendaryear" as
>> "c2", sum("aging_link"."Invoice Amount") as "m0" from "aging_link" as
>> "aging_link", "datelookup" as "datelookup" where "aging_link"."Country
>> Code" in ('Ireland', 'Sweden') and "aging_link"."Customer Number" in
>> ('953724073', '953724440', '960814035') and "datelookup"."calendaryear"
>> in (2011, 2012, 2013) and "aging_link"."Invoice Date" =
>> "datelookup"."daydate" group by "aging_link"."Country Code",
>> "aging_link"."Customer Number", "datelookup"."calendaryear"]
>> 2014-01-31 19:26:27,261 DEBUG [mondrian.sql] 135: Segment.load:
>> executing sql [select "aging_link"."Country Code" as "c0",
>> "aging_link"."Customer Number" as "c1", "datelookup"."calendaryear" as
>> "c2", "datelookup"."calendarmonth" as "c3", sum("aging_link"."Invoice
>> Amount") as "m0" from "aging_link" as "aging_link", "datelookup" as
>> "datelookup" where "aging_link"."Country Code" in ('Ireland', 'Sweden')
>> and "aging_link"."Customer Number" in ('953724073', '953724440',
>> '960814035') and "datelookup"."calendaryear" in (2011, 2012, 2013) and
>> "aging_link"."Invoice Date" = "datelookup"."daydate" group by
>> "aging_link"."Country Code", "aging_link"."Customer Number",
>> "datelookup"."calendaryear", "datelookup"."calendarmonth"]
>> 2014-01-31 19:26:27,265 DEBUG [mondrian.sql] 135: , exec 3 ms
>> 2014-01-31 19:26:27,266 DEBUG [mondrian.sql] 135: , exec+fetch 5 ms, 5 rows
>> 2014-01-31 19:26:27,266 DEBUG [mondrian.sql] 134: , exec 4 ms
>> 2014-01-31 19:26:27,267 DEBUG [mondrian.sql] 134: , exec+fetch 6 ms, 3 rows
>>
>>
>> The above is the entire query that is pushed to the database, as you can
>> see, much quicker and cleaner.
>>
>> Anyway, there you go.
>>
>> Tom
>> <schema.xml><ddl.txt>_______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org<mailto:Mondrian at pentaho.org>
>> http://lists.pentaho.org/mailman/listinfo/mondrian
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org<mailto:Mondrian at pentaho.org>
> http://lists.pentaho.org/mailman/listinfo/mondrian
>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org<mailto:Mondrian at pentaho.org>
> http://lists.pentaho.org/mailman/listinfo/mondrian
>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian



More information about the Mondrian mailing list