[Mondrian] Degenerate facts

John V. Sichi jsichi at gmail.com
Wed Jul 18 22:22:12 EDT 2007


julianhyde at speakeasy.net wrote:
> 3. As a future feature, we could extend the catalog.xml syntax to
> allow measures to be defined in tables other than the fact table.
> This is not trivial, because if they are in a different table, you
> need to specify a join path so mondrian would know how to get to the
> column. I've been thinking about separating the relational model
> (which specifies tables, SQL expressions, and join paths) from the
> dimensional model (which defines cubes and dimensions) within the
> catalog.xml file, and that would provide that metadata. There are a
> bunch of good reasons to do that separation, and this is one more.

If your DBMS supports it (efficiently), you can define a measure on a 
correlated subquery which looks up a value from a dimension table 
(regardless of whether the dimension table is actually known as a 
dimension to Mondrian).  The segment load measure computation SQL comes 
out "interesting":

select
   warehouse_id,
   max((select sqft from store_dim s where w.store_id=s.store_id))
from warehouse_fact w
group by warehouse_id;

<Cube name="Warehouse">
   <Table name="inventory_fact_1997" alias="w"/>
   <Measure name="Max Store Sqftage" aggregator="max">
     <MeasureExpression>
       <SQL dialect="luciddb">
(select sqft from store_dim s where w.store_id=s.store_id)
       </SQL>
     </MeasureExpression>
   </Measure>
   ...

For max, it even makes sense in this example, though sum wouldn't due to 
multi-counting all the fact rows with the same warehouse_id.  Seems like 
granularity is the other hard part besides expressing join paths.

Or if you trust your DBMS's optimizer to eliminate unused joins, you can 
just create a join view and use that as your fact "table".

Of course, having Mondrian know how to generate optimal SQL without 
relying on any DBMS smarts would be ideal.

JVS



More information about the Mondrian mailing list