[Mondrian] Making Pre-aggregated Tables Work

Tom Barber tom at analytical-labs.com
Mon Apr 4 13:34:49 EDT 2016


By the end of the day we were debugging the jdbc and sql statistics
classes. My theory was that if it couldn't determine the row count from
jdbc metadata it would then use select count (*) instead but I got told
via  Amit and some debugging that want the case, but I don't have access to
this phoenix box so I'm debugging by proxy.
On 4 Apr 2016 18:30, "Julian Hyde" <julianhyde at gmail.com> wrote:

> If the Phoenix dialect does not implement rowCount then it sounds as if
> aggregate tables never worked against Phoenix. There ought to be a test
> that checks whether aggregate tables work. Is there such a test, and did
> Phoenix ever pass that test?
>
> Julian
>
>
>
>
> On Apr 4, 2016, at 7:58 AM, Luc Boudreau <lucboudreau at gmail.com> wrote:
>
> > 1. Why does mondrian rely on the rowCount (...)
>
> The number of rows tells us which of the aggregate tables is the most
> efficient. There is not always a 1:1 match between a query's aggregation
> level and the aggregation table's. When faced with two aggregation table
> choices, both needing a supplemental grouping, the most efficient is the
> one with the least rows.
>
> > 2. Why does the code check if the aggregate table has more number of
> rows than the fact table
>
> Same as above. If an aggregation table is bigger than your fact table,
> then you're better off using the fact table.
>
>
> On Mon, Apr 4, 2016 at 10:24 AM, Tom Barber <tom at analytical-labs.com>
> wrote:
>
>> On top of Amits questions, here's a bunch more output:
>>
>> We have 2 schema:
>>
>> http://pastebin.com/XkPSFN3P and http://pastebin.com/YbANX0pB
>>
>> The foodmart one uses the aggregate tables, the phoenix one doesn't.
>>
>> I've not been able to devote loads of time to debugging it, but Amit has
>> been trying his best and we've been playing around with making sure phoenix
>> uses a stats provider etc, and still it picks the fact table over the
>> aggregate.
>>
>> http://pastebin.com/raw/MU6Xkqtk here's the mondrian logging output.
>>
>> Cheers
>>
>> Tom
>>
>> --------------
>>
>> Director Meteorite.bi - Saiku Analytics Founder
>> Tel: +44(0)5603641316
>>
>> (Thanks to the Saiku community we reached our Kickstart
>> <http://kickstarter.com/projects/2117053714/saiku-reporting-interactive-report-designer/>
>> goal, but you can always help by sponsoring the project
>> <http://www.meteorite.bi/products/saiku/sponsorship>)
>>
>> On 2 April 2016 at 11:10, Amit Shah <amits.84 at gmail.com> wrote:
>>
>>> Any inputs here? I'm sure having pre-aggregated tables must be a pretty
>>> common practice.
>>> I'm kind of blocked with this not working.
>>>
>>> Thanks,
>>> Amit.
>>>
>>> On Fri, Apr 1, 2016 at 5:34 PM, Tom Barber <tom at analytical-labs.com>
>>> wrote:
>>>
>>>> I'll extend this post slightly to add:
>>>>
>>>> We've enabled aggregates in mondrian.properties, and we added a
>>>> Aggregate only measure and using that one forced the use of the aggregate
>>>> table, so the aggregates are being parsed.
>>>>
>>>> --------------
>>>>
>>>> Director Meteorite.bi - Saiku Analytics Founder
>>>> Tel: +44(0)5603641316
>>>>
>>>> (Thanks to the Saiku community we reached our Kickstart
>>>> <http://kickstarter.com/projects/2117053714/saiku-reporting-interactive-report-designer/>
>>>> goal, but you can always help by sponsoring the project
>>>> <http://www.meteorite.bi/products/saiku/sponsorship>)
>>>>
>>>> On 1 April 2016 at 12:59, Amit Shah <amits.84 at gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I have been using Mondrian 4 with Saiku as an OLAP engine. Mondrian is
>>>>> integrated with apache phoenix (version 4.6) as the data source that used
>>>>> apache hbase as the data store. In order to speed up the queries I am
>>>>> trying to make pre-aggregated tables work but for some reason the actual
>>>>> fact table is getting queried. Here's how the mondrian schema definition
>>>>> looks like - http://pastebin.com/xquutKYX.
>>>>>
>>>>> The database schema for the dimension, facts and the pre-aggregated
>>>>> table is shared here - http://pastebin.com/EWhyNLTJ
>>>>>
>>>>> Selecting the total sales measure and the agent id as a row in the
>>>>> saiku builds up a query and tries to execute it on the facts table instead
>>>>> of the pre-aggregated table. The query from the sql logs is
>>>>>
>>>>> DEBUG [mondrian.sql] 850: Segment.load: executing sql [select
>>>>> "AGENTS"."A_ID" as "c0", sum("TRANSACTIONS"."TOTAL_SALES") as "m0" from
>>>>> "TRANSACTIONS" as "TRANSACTIONS", "AGENTS" as "AGENTS" where
>>>>> "TRANSACTIONS"."AGENT_ID" = "AGENTS"."A_ID" group by "AGENTS"."A_ID"]
>>>>>
>>>>> What could be missing in order to make the OLAP engine to use the
>>>>> AG_TER_PRO (pre-aggregated table)?
>>>>>
>>>>> Thanks,
>>>>> Amit.
>>>>>
>>>>> _______________________________________________
>>>>> Mondrian mailing list
>>>>> Mondrian at pentaho.org
>>>>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> Mondrian mailing list
>>>> Mondrian at pentaho.org
>>>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>>>
>>>>
>>>
>>> _______________________________________________
>>> Mondrian mailing list
>>> Mondrian at pentaho.org
>>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>>
>>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
>
>
>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20160404/42fbb8bd/attachment.html 


More information about the Mondrian mailing list