[Mondrian] Adding Grouping Set support for Distinct Count measures

Julian Hyde jhyde at pentaho.org
Tue Jan 29 11:46:58 EST 2008

I do agree that MDX sets are basically lists - they are ordered, and may
contain duplicates. The only weird thing about them is that they
automatically eliminate null members and tuples.
But there are two sets here: "the size of the set of distinct customers in
the fact table records underlying the set of members [CA], [CA], [OR]"
The first set is the one we are concerned with: the "set of distinct
customers" which underlies the definition of the distinct-count measure is a
SQL style set, that is, duplicates are eliminated. (And as a forum post
noted recently, null values for the customer_id column are eliminated from
this set, just like in any SQL aggregate function.) It doesn't matter
whether MDX set/list of members contains duplicates; we must eliminate
duplicate values of customer_id before counting.
So I'm with JVS.


From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On
Behalf Of Matt Campbell
Sent: Tuesday, January 29, 2008 7:10 AM
To: Mondrian developer mailing list
Subject: Re: [Mondrian] Adding Grouping Set support for Distinct Count

Misguided, probably.  But an MDX Set does allow this sort of non-set like
behavior.  The following MDX, when run in Analysis Services, will produce 2
times the [unit sales] of [marital status].[m]:

with member [marital status].ASetIsNotASet as 
'Aggregate( {[Marital Status].[All Marital Status].[M], [Marital
Status].[All Marital Status].[M] }, measures.[unit sales] )' 
select {[marital status].ASetIsNotASet } on 0 from sales where
measures.[unit sales]

On Jan 28, 2008 5:33 PM, John V. Sichi <jsichi at gmail.com> wrote:

Ajit Vasudeo Joglekar wrote:
> 2) Aggregation of a normal (non distinct count) measure for select members
> It is possible to get this working since it is very similar to case 1).
> There is a issue here though. Lets say for whatever reason user wants to
> aggregate
> Aggregate([Store].[All Stores].[USA].[CA], [Store].[All
> Stores].[USA].[CA], [Store].[All Stores].[USA].[OR]) over
> [Measures].[Unit Sales]. The expected value here is (2 * CA + OR) for a
> non distinct count measure. The sql generated like above will not result
> in correct aggregation value

I don't see an issue here; the desire to "double-count" CA by including
it in the set twice would be misguided, since a set is a set (no dups).


Mondrian mailing list
Mondrian at pentaho.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20080129/2f305fd8/attachment.html 

More information about the Mondrian mailing list