[Mondrian] A couple of "best practices" questions

Fri Jun 19 13:32:21 EDT 2009

A couple more tidbits from my porting work...

1) Forcing table indexes

For performance reasons, the code I'm porting includes a new attribute  
on the Table schema element: forcedIndex.  If set, it forces SqlQuery/ 
SqlTupleReader to use the named index when reading from the table.
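
To make the idea concrete, here is a minimal sketch of what such a hint amounts to at the SQL-generation level. It is not Mondrian's SqlQuery code: the class and method names are hypothetical, and the MySQL-style FORCE INDEX syntax is an assumption, since other databases spell index hints differently or do not support them.

```java
// Hypothetical illustration only; not part of Mondrian's API.
final class ForcedIndexSketch {
    private ForcedIndexSketch() {}

    /**
     * Builds the FROM-clause fragment for one table. When forcedIndex is
     * non-empty, a MySQL-style FORCE INDEX hint is appended (assumed dialect).
     */
    static String fromClause(String table, String alias, String forcedIndex) {
        StringBuilder sb = new StringBuilder(quote(table));
        if (alias != null && !alias.isEmpty()) {
            sb.append(" AS ").append(quote(alias));
        }
        if (forcedIndex != null && !forcedIndex.isEmpty()) {
            sb.append(" FORCE INDEX (").append(quote(forcedIndex)).append(")");
        }
        return sb.toString();
    }

    /** Quotes an identifier with backticks, doubling any embedded backtick. */
    static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }
}
```

With a table "sales_fact", alias "f", and forcedIndex "idx_time" (all made-up names), the fragment would carry the hint; with no forcedIndex set, the output is the plain quoted table name, so existing schemas would be unaffected.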

Does this sound like an appropriate enhancement to add to the
Mondrian mainline, or is there a better way to do this using
existing Mondrian capabilities?  If the former, I'm happy to write some
tests and contribute it; if the latter, I'd like to make sure we're
doing it the right way instead.

2) Sparsely populated fact tables and sorting

One property of the datasets we're working with is that some of the
dimension tables can be large (up to a million rows or so), but the
fact tables tend to be much smaller and relatively sparse, meaning
that in many cases there are no rows in the fact table associated
with a given dimension member.

That in and of itself is fine, but in terms of business behavior we  
know that if there is no associated row in the fact table, it's  
equivalent to having a row full of zeros.  We'd like to see that  
knowledge carry through to sorting results, etcetera (i.e., if the  
value is nonexistent, sort it as if it were zero instead).

This is currently implemented by overriding the sort functions to  
treat null/empty as zero, which frankly makes all involved a little  
itchy.  Is there a better approach to this problem?
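
As a rough illustration of the behavior described (a sketch under assumptions, not the actual override and not Mondrian's sort code), the comparison boils down to substituting zero for a missing value before comparing. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Mondrian's API.
final class NullAsZeroSort {
    private NullAsZeroSort() {}

    /** Treats a missing (null) cell value as zero. */
    static double orZero(Double value) {
        return value == null ? 0.0 : value;
    }

    /** Compares two possibly-null values as if null were zero. */
    static int compare(Double a, Double b) {
        return Double.compare(orZero(a), orZero(b));
    }

    /** Returns a copy sorted in descending order; the input is not modified. */
    static List<Double> sortedDescending(List<Double> values) {
        List<Double> copy = new ArrayList<>(values);
        copy.sort((a, b) -> compare(b, a));
        return copy;
    }
}
```

Under this rule, a missing value lands between the positive and negative values rather than at one end of the result, which is the business behavior we're after.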

  -- Eric