[Mondrian] Dealing with NULLs in aggregate tables with AVG

Matt Campbell mcampbell at pentaho.com
Wed Aug 5 08:44:07 EDT 2015

The SQL avg function will compute the mean of non NULL values.   With Mondrian aggregate tables, we compute the average by taking the rolled up aggregate value and dividing by fact_count.  If that rolled up value happened to include any NULL values than the aggregate table avg differs from what Mondrian computes against the fact table, since fact_count is a count of every row.

I'm thinking the way to deal with this would be to introduce a measure-specific fact_count that counts non NULL events associated with the measure.  So there could be a "fact_count_unit_sales", for example, used to compute the Avg Unit Sales price.  Can anyone think of alternative approaches?

Right now, I think the only workaround for people trying to get accurate Avgs from aggregate tables when NULLs are involved is to use measure specific aggregate tables where the fact_count is calculated based on non NULL.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20150805/bbcba1ff/attachment.html 

More information about the Mondrian mailing list