[Mondrian] Degenerate facts

Matt Campbell mkambol at gmail.com
Thu Jul 19 15:06:55 EDT 2007


Thanks John and Julian.  We had considered Julian's option (1), but it is
undesirable because we still want to be able to join to the full set of
dimensions in the cube.  I spent some time thinking about the MDX for option
(2) but couldn't come up with concrete MDX that worked for our purpose.

I hadn't though about using Measure Expressions--that does seem promising.
I tried some quick tests against Oracle and it appears to perform pretty
well.

On 7/18/07, John V. Sichi <jsichi at gmail.com> wrote:
>
> Oops, my SQL/XML below is slightly inconsistent with itself and
> FoodMart, but hopefully the idea is clear.
>
> JVS
>
> John V. Sichi wrote:
> > julianhyde at speakeasy.net wrote:
> >> 3. As a future feature, we could extend the catalog.xml syntax to
> >> allow measures to be defined in tables other than the fact table.
> >> This is not trivial, because if they are in a different table, you
> >> need to specify a join path so mondrian would know how to get to the
> >> column. I've been thinking about separating the relational model
> >> (which specifies tables, SQL expressions, and join paths) from the
> >> dimensional model (which defines cubes and dimensions) within the
> >> catalog.xml file, and that would provide that metadata. There are a
> >> bunch of good reasons to do that separation, and this is one more.
> >
> > If your DBMS supports it (efficiently), you can define a measure on a
> > correlated subquery which looks up a value from a dimension table
> > (regardless of whether the dimension table is actually known as a
> > dimension to Mondrian).  The segment load measure computation SQL comes
> > out "interesting":
> >
> > select
> >   warehouse_id,
> >   max((select sqft from store_dim s where w.store_id=s.store_id))
> > from warehouse_fact w
> > group by warehouse_id;
> >
> > <Cube name="Warehouse">
> >   <Table name="inventory_fact_1997" alias="w"/>
> >   <Measure name="Max Store Sqftage" aggregator="max">
> >     <MeasureExpression>
> >       <SQL dialect="luciddb">
> > (select sqft from store_dim s where w.store_id=s.store_id)
> >       </SQL>
> >     </MeasureExpression>
> >   </Measure>
> >   ...
> >
> > For max, it even makes sense in this example, though sum wouldn't due to
> > multi-counting all the fact rows with the same warehouse_id.  Seems like
> > granularity is the other hard part besides expressing join paths.
> >
> > Or if you trust your DBMS's optimizer to eliminate unused joins, you can
> > just create a join view and use that as your fact "table".
> >
> > Of course, having Mondrian know how to generate optimal SQL without
> > relying on any DBMS smarts would be ideal.
> >
> > JVS
> >
>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20070719/0195f099/attachment.html 


More information about the Mondrian mailing list