[Mondrian] Dealing with NULLs in aggregate tables with AVG
greenlion at gmail.com
Wed Aug 5 18:46:27 EDT 2015
Perhaps store sum and count for average instead of the average itself?
Sent from my iPhone
> On Aug 5, 2015, at 1:24 PM, Julian Hyde <julianhyde at gmail.com> wrote:
> Sounds basically right.
> Calling it “fact_count_unit_sales” is a bit grandiose. It is just a "count(unit_sales)” measure - i.e. something we already know how to do.
> And if unit_sales is not null then we could (and should) use count(*) — i.e. the fact_count column — instead.
>> On Aug 5, 2015, at 5:44 AM, Matt Campbell <mcampbell at pentaho.com> wrote:
>> The SQL avg function will compute the mean of non NULL values. With Mondrian aggregate tables, we compute the average by taking the rolled up aggregate value and dividing by fact_count. If that rolled up value happened to include any NULL values than the aggregate table avg differs from what Mondrian computes against the fact table, since fact_count is a count of every row.
>> I’m thinking the way to deal with this would be to introduce a measure-specific fact_count that counts non NULL events associated with the measure. So there could be a “fact_count_unit_sales”, for example, used to compute the Avg Unit Sales price. Can anyone think of alternative approaches?
>> Right now, I think the only workaround for people trying to get accurate Avgs from aggregate tables when NULLs are involved is to use measure specific aggregate tables where the fact_count is calculated based on non NULL.
>> Mondrian mailing list
>> Mondrian at pentaho.org
> Mondrian mailing list
> Mondrian at pentaho.org
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Mondrian