[Mondrian] distinct-count and explicit aggregate table recognition

Tomasz Karlik tkarlik at gmail.com
Tue Jan 8 15:24:32 EST 2013


I suppose that without this change Mondrian will always use aggregate
table. Aggregate table column is handled by RolapAggregator.Sum, so
code created to deal with different distinct count scenarios is
omitted.

When using rule-based recognition Mondrian uses Recognizer.java code
(lines 782-792):

        // NOTE: This assumes that the aggregate table does not have an explicit
        // average column.
        if (factAgg == RolapAggregator.Avg) {
            String columnExpr = getFactCountExpr(aggUsage);
            return new RolapAggregator.AvgFromSum(columnExpr);
        } else if (factAgg == RolapAggregator.DistinctCount) {
            //return RolapAggregator.Count;
            return RolapAggregator.DistinctCount;
        } else {
            return factAgg;
        }
Are there really necessary separate methods to deal with both of
recognizers? How to change schema definition, to handle distinct-count
aggregated measures correctly when using explicit recognizer?


On Mon, Jan 7, 2013 at 3:46 PM, Matt Campbell <mkambol at gmail.com> wrote:
> Distinct count measures in aggregate tables are tricky and require care on
> the part of the schema author.  There are some cases where it is appropriate
> to pull a distinct count value from an aggregate table--i.e. when the
> attributes are dependent on the measure (like gender in the case of customer
> count).
>
> I'm hoping Julian or Luc can jump in, but I'd be worried your change would
> cause Mondrian to always query the fact table for count distincts, even when
> it may be appropriate to query the agg table.
>
>
> On Mon, Jan 7, 2013 at 3:47 AM, Tomasz Karlik <tkarlik at gmail.com> wrote:
>>
>> SQL seems to be wrong:
>>
>> select
>>     "agg_c_10_sales_fact_1997"."the_year" as "c0",
>>     sum("agg_c_10_sales_fact_1997"."customer_count") as "m0"
>> from
>>     "agg_c_10_sales_fact_1997" as "agg_c_10_sales_fact_1997"
>> where
>>     "agg_c_10_sales_fact_1997"."the_year" = 1997
>> group by
>>     "agg_c_10_sales_fact_1997"."the_year"
>>
>> Log file is in attachment.
>>
>> On Mon, Jan 7, 2013 at 9:19 AM, Julian Hyde <jhyde at pentaho.com> wrote:
>> > The easiest way to approach this is to look at the SQL Mondrian is
>> > generating. Is it asking the right question of the DBMS?
>> >
>> > Can you turn on SQL tracing and post the log you get during the run that
>> > produces the incorrect results.
>> >
>> > Julian
>> >
>> >
>> > On Jan 7, 2013, at 12:10 AM, Tomasz Karlik <tkarlik at gmail.com> wrote:
>> >
>> >> I am getting different results while running simple query on FoodMart
>> >> (PostgreSql db). Result differs and seems to be wrong when using
>> >> explicit aggregate. I've created simple test which shows this
>> >> behavior. Also I've made some patch, but i'm not sure if the problem
>> >> is in sources? The problem doesn't appear when aggregate table is
>> >> recognized by the default rules.
>> >>
>> >> query:
>> >>
>> >> select
>> >>
>> >> NON EMPTY {[Measures].[Customer Count]} ON COLUMNS,
>> >>
>> >> NON EMPTY {[Time].[Year].Members} ON ROWS
>> >>
>> >> from [Sales]
>> >>
>> >> aggregate table: agg_c_10_sales_fact_1997
>> >>
>> >>
>> >> expected result:
>> >>
>> >> Axis #0:
>> >>
>> >> {}
>> >>
>> >> Axis #1:
>> >>
>> >> {[Measures].[Customer Count]}
>> >>
>> >> Axis #2:
>> >>
>> >> {[Time].[1997]}
>> >>
>> >> Row #0: 5 581
>> >>
>> >>
>> >> actual result:
>> >>
>> >> Axis #0:
>> >>
>> >> {}
>> >>
>> >> Axis #1:
>> >>
>> >> {[Measures].[Customer Count]}
>> >>
>> >> Axis #2:
>> >>
>> >> {[Time].[1997]}
>> >>
>> >> Row #0: 16 947
>> >>
>> >> <ExplicitRecognizerTest.java><mondrian_rolap_aggmatcher_Recognizer.java.patch>_______________________________________________
>> >> Mondrian mailing list
>> >> Mondrian at pentaho.org
>> >> http://lists.pentaho.org/mailman/listinfo/mondrian
>> >
>> > _______________________________________________
>> > Mondrian mailing list
>> > Mondrian at pentaho.org
>> > http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>
>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
>


More information about the Mondrian mailing list