[Mondrian] Degenerate facts

John V. Sichi jsichi at gmail.com
Wed Jul 18 22:41:59 EDT 2007

Oops, my SQL/XML below is slightly inconsistent with itself and 
FoodMart, but hopefully the idea is clear.


John V. Sichi wrote:
> julianhyde at speakeasy.net wrote:
>> 3. As a future feature, we could extend the catalog.xml syntax to
>> allow measures to be defined in tables other than the fact table.
>> This is not trivial, because if they are in a different table, you
>> need to specify a join path so mondrian would know how to get to the
>> column. I've been thinking about separating the relational model
>> (which specifies tables, SQL expressions, and join paths) from the
>> dimensional model (which defines cubes and dimensions) within the
>> catalog.xml file, and that would provide that metadata. There are a
>> bunch of good reasons to do that separation, and this is one more.
> If your DBMS supports it (efficiently), you can define a measure on a 
> correlated subquery which looks up a value from a dimension table 
> (regardless of whether the dimension table is actually known as a 
> dimension to Mondrian).  The segment load measure computation SQL comes 
> out "interesting":
> select
>   warehouse_id,
>   max((select sqft from store_dim s where w.store_id=s.store_id))
> from warehouse_fact w
> group by warehouse_id;
> <Cube name="Warehouse">
>   <Table name="inventory_fact_1997" alias="w"/>
>   <Measure name="Max Store Sqftage" aggregator="max">
>     <MeasureExpression>
>       <SQL dialect="luciddb">
> (select sqft from store_dim s where w.store_id=s.store_id)
>       </SQL>
>     </MeasureExpression>
>   </Measure>
>   ...
> For max, it even makes sense in this example, though sum wouldn't due to 
> multi-counting all the fact rows with the same warehouse_id.  Seems like 
> granularity is the other hard part besides expressing join paths.
> Or if you trust your DBMS's optimizer to eliminate unused joins, you can 
> just create a join view and use that as your fact "table".
> Of course, having Mondrian know how to generate optimal SQL without 
> relying on any DBMS smarts would be ideal.

More information about the Mondrian mailing list