[Mondrian] All Members vs Aggregate
Thiyagu Palanisamy
tpalanis at thoughtworks.com
Tue Apr 24 07:32:50 EDT 2007
As of now mondrian uses multiple sql queries to get data for a MDX with
members at different levels. We are working on changing this behavior to
combine these multiple sqls to single one.
For an example this MDX generates 2 sqls, one for all gender and another
for male,female
select gender.members on 0 from [sales 2] where measures.[Sales Count];
sql:
1. select "time_by_day"."the_year" as "c0", "customer"."gender" as "c1",
count("sales_fact_1997"."product_id") as "m0"
from "time_by_day" "time_by_day", "sales_fact_1997" "sales_fact_1997",
"customer" "customer"
where
"sales_fact_1997"."time_id" = "time_by_day"."time_id" and
"time_by_day"."the_year" = 1997 and
"sales_fact_1997"."customer_id" = "customer"."customer_id"
group by "time_by_day"."the_year", "customer"."gender"
2. select "time_by_day"."the_year" as "c0",
count("sales_fact_1997"."product_id") as "m0"
from "time_by_day" "time_by_day", "sales_fact_1997" "sales_fact_1997"
where
"sales_fact_1997"."time_id" = "time_by_day"."time_id" and
"time_by_day"."the_year" = 1997
group by
"time_by_day"."the_year"
with our prototype, these sqls are grouped into one (this query will only
work with DBs that support grouping sets):
1. select "time_by_day"."the_year" as "c0", "customer"."gender" as "c1",
count("sales_fact_1997"."product_id") as "m0"
from "time_by_day" "time_by_day", "sales_fact_1997" "sales_fact_1997",
"customer" "customer"
where
"sales_fact_1997"."time_id" = "time_by_day"."time_id" and
"time_by_day"."the_year" = 1997 and
"sales_fact_1997"."customer_id" = "customer"."customer_id"
group by grouping sets(("time_by_day"."the_year", "customer"."gender"),
("time_by_day"."the_year"))
Although the grouping set implementation looks fine and works in most
cases, there is a tricky problem associated with it. The problem is
grouping "all" (aggregate) value got from rolling up is equal to aggregate
of the members involved and not the all value
to reproduce it, alter fact table and insert rows for null customers:
--modify fact table to allow nulls for customer_id
alter table "sales_fact_1997" modify "customer_id" null
--add sales records with no customers
insert into "sales_fact_1997" values (337,371,null,0,2,6.3455,2.3445,4)
insert into "sales_fact_1997" values (337,440,null,0,2,6.3355,2.335,2)
results:
remains same with grouping sql:
Axis #0:
{[Measures].[Sales Count]}
Axis #1:
{[Gender].[All Gender]}
{[Gender].[All Gender].[F]}
{[Gender].[All Gender].[M]}
Row #0: 86,837
Row #0: 42,831
Row #0: 44,006
without grouping (mondrian's current behavior):
...
Row #0: 86,839 //includes 2 records that don't belong to any
gender
Row #0: 42,831
Row #0: 44,006
In summary "All" in mondrian is not really an aggregate of all the
members, it also includes data from rows that don't join to fact table.
This behavior is not consistent with AS 2000 (MOLAP).
How should we handle this issue?
One of the options that we can think of is, to use outer joins. The
problem with this is it returns addition row for "null" gender.
Thanks,
Thiyagu & Raghu
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20070424/c6743574/attachment.html
More information about the Mondrian
mailing list