[Mondrian] SQLs generated by Segment.load do not include measure filters

Robin Tharappel rtharappel at gmail.com
Mon Feb 9 20:52:24 EST 2009


Hello,

I am coming across a performance issue with an MDX that has a measure
filter. In my MDX the filter is evaluated natively and the
SqlTupleReader generates a SQL that returns 10,000 rows. However the
query for loading the Segment returns over 1 million rows. The main
reason for the difference between the number of rows returned between
SqlTupleReader and Segment is that the query generated for
Segment.load does not have the measure filter. The following MDX using
the FoodMart data set can illustrate the problem

select {[Measures].[Unit Sales], [Measures].[Store Sales]} ON COLUMNS,
Filter(NonEmptyCrossJoin([Product].[All Products].Children,
[Store].[All Stores].[USA].Children),
[Measures].[Unit Sales] > 20000) ON ROWS
from [Sales]

Below is the SQL generated by SqlTupleReader with the having clause
for the measure filter (returns 4 rows)


select
"product_class"."product_family" as "c0",
"store"."store_country" as "c1",
"store"."store_state" as "c2"
from
"product" "product",
"product_class" "product_class",
"agg_c_14_sales_fact_1997" "agg_c_14_sales_fact_1997",
"store" "store"
where
"product"."product_class_id" = "product_class"."product_class_id"
and "agg_c_14_sales_fact_1997"."product_id" = "product"."product_id"
and "agg_c_14_sales_fact_1997"."store_id" = "store"."store_id"
and "agg_c_14_sales_fact_1997"."the_year" = 1997
and ("store"."store_country" = 'USA')
group by "product_class"."product_family","store"."store_country","store"."store_state"
having (sum("agg_c_14_sales_fact_1997"."unit_sales") > 20000.0)
order by "product_class"."product_family" ASC,
"store"."store_country" ASC,
"store"."store_state" ASC

Below is the SQL generated by Segment without the measure filter
(returns 9 rows)

select
"store"."store_state" as "c0",
"agg_c_14_sales_fact_1997"."the_year" as "c1",
"product_class"."product_family" as "c2",
sum("agg_c_14_sales_fact_1997"."unit_sales") as "m0",
sum("agg_c_14_sales_fact_1997"."store_sales") as "m1"
from
"store" "store",
"agg_c_14_sales_fact_1997" "agg_c_14_sales_fact_1997",
"product_class" "product_class",
"product" "product"
where
"agg_c_14_sales_fact_1997"."store_id" = "store"."store_id" and
"store"."store_state" in ('CA', 'OR', 'WA') and
"agg_c_14_sales_fact_1997"."the_year" = 1997 and
"agg_c_14_sales_fact_1997"."product_id" = "product"."product_id"
 and
"product"."product_class_id" = "product_class"."product_class_id"
group by
"store"."store_state",
"agg_c_14_sales_fact_1997"."the_year",
"product_class"."product_family"

Although the difference in this example is small, with a larger
dataset a situation could occur where there is a lot of data loaded
into the Segment which is not needed to resolve the query.  Would it
be possible to have the measure filter applied to the Segment query
(add the having clause) ?

Thanks,

Robin




More information about the Mondrian mailing list