<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

<HTML><HEAD>

<META http-equiv=Content-Type content="text/html; charset=us-ascii">

<META content="MSHTML 6.00.6001.17052" name=GENERATOR></HEAD>

<BODY>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2>I do agree that MDX sets are basically lists - they are 

ordered, and may contain duplicates. The only weird thing about them is that 

they automatically eliminate null members and tuples.</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2></FONT></SPAN>&nbsp;</DIV>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2>But there are two sets here: "the size of the set 

of&nbsp;distinct customers in the fact table records underlying the set of 

members [CA], [CA], [OR]"</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2></FONT></SPAN>&nbsp;</DIV>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2>The first set is the one we are concerned with: the "set of 

distinct customers" which underlies the definition of the distinct-count measure 

is a SQL-style set, that is, duplicates are eliminated. (And as a forum post 

noted recently, null values for the customer_id column are eliminated from this 

set, just like in any SQL aggregate function.) It doesn't matter whether the MDX 

set/list of members contains duplicates; we must eliminate duplicate values of 

customer_id before counting.</FONT></SPAN></DIV>
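
<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2></FONT></SPAN>&nbsp;</DIV>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2>To make the distinction concrete, here is a minimal Python 

sketch (not Mondrian code). The fact rows, the function names sum_unit_sales and 

distinct_customers, and the use of None for a SQL NULL are all made up for 

illustration. It assumes list semantics for the additive measure, as in the 

Analysis Services example quoted below, and SQL-style set semantics for the 

distinct count.</FONT></SPAN></DIV>

<PRE>
```python
# Hypothetical fact rows: (state, customer_id, unit_sales).
# None stands in for a SQL NULL customer_id.
FACT_ROWS = [
    ("CA", 1, 10),
    ("CA", 2, 5),
    ("CA", None, 3),
    ("OR", 2, 7),
    ("OR", 3, 4),
    ("WA", 4, 8),
]


def sum_unit_sales(members):
    """Additive measure: evaluated once per entry in the member list,
    so a member that appears twice contributes twice."""
    return sum(
        sales
        for member in members
        for state, _customer, sales in FACT_ROWS
        if state == member
    )


def distinct_customers(members):
    """Distinct-count measure: gather customer ids into a set,
    dropping duplicates and nulls before counting."""
    wanted = set(members)  # repeats in the member list make no difference
    customers = {
        customer
        for state, customer, _sales in FACT_ROWS
        if state in wanted and customer is not None
    }
    return len(customers)


members = ["CA", "CA", "OR"]
print(sum_unit_sales(members))      # 2 * CA + OR
print(distinct_customers(members))  # same result as for ["CA", "OR"]
```
</PRE>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2>With the member list [CA], [CA], [OR], the sum counts CA 

twice, while the distinct count comes out the same as for [CA], 

[OR].</FONT></SPAN></DIV>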

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2></FONT></SPAN>&nbsp;</DIV>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2>So I'm with JVS.</FONT></SPAN></DIV>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2></FONT></SPAN>&nbsp;</DIV>

<DIV dir=ltr align=left><SPAN class=600394116-29012008><FONT face=Verdana 

color=#000080 size=2>Julian</FONT></SPAN></DIV><BR>

<BLOCKQUOTE 

style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000080 2px solid; MARGIN-RIGHT: 0px">

  <DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>

  <HR tabIndex=-1>

  <FONT face=Tahoma size=2><B>From:</B> mondrian-bounces@pentaho.org 

  [mailto:mondrian-bounces@pentaho.org] <B>On Behalf Of </B>Matt 

  Campbell<BR><B>Sent:</B> Tuesday, January 29, 2008 7:10 AM<BR><B>To:</B> 

  Mondrian developer mailing list<BR><B>Subject:</B> Re: [Mondrian] Adding 

  Grouping Set support for Distinct Count measures<BR></FONT><BR></DIV>

  <DIV></DIV>Misguided, probably.&nbsp; But an MDX Set does allow this sort of 

  non-set-like behavior.&nbsp; The following MDX, when run in Analysis Services, 

  will produce 2 times the [unit sales] of [marital status].[m]:<BR><BR><BR>with 

  member [marital status].ASetIsNotASet as <BR>'Aggregate( {[Marital 

  Status].[All Marital Status].[M], [Marital Status].[All Marital Status].[M] }, 

  measures.[unit sales] )' <BR>select {[marital status].ASetIsNotASet } on 0 

  from sales where measures.[unit sales]<BR><BR>

  <DIV class=gmail_quote>On Jan 28, 2008 5:33 PM, John V. Sichi &lt;<A 

  href="mailto:jsichi@gmail.com">jsichi@gmail.com</A>&gt; wrote:<BR>

  <BLOCKQUOTE class=gmail_quote 

  style="PADDING-LEFT: 1ex; MARGIN: 0pt 0pt 0pt 0.8ex; BORDER-LEFT: rgb(204,204,204) 1px solid">

    <DIV class=Ih2E3d>Ajit Vasudeo Joglekar wrote:<BR>&gt; 2) Aggregation of a 

    normal (non distinct count) measure for select members<BR>&gt;<BR>&gt; It is 

    possible to get this working since it is very similar to case 1).<BR>&gt; 

    There is an issue here though. Let's say for whatever reason the user wants 

    to<BR>&gt; aggregate<BR>&gt;<BR>&gt; Aggregate([Store].[All 

    Stores].[USA].[CA], [Store].[All<BR>&gt; Stores].[USA].[CA], [Store].[All 

    Stores].[USA].[OR]) over<BR>&gt; [Measures].[Unit Sales]. The expected value 

    here is (2 * CA + OR) for a<BR>&gt; non-distinct-count measure. The SQL 

    generated like above will not result<BR>&gt; in correct aggregation 

    value<BR><BR></DIV>I don't see an issue here; the desire to "double-count" 

    CA by including<BR>it in the set twice would be misguided, since a set is a 

    set (no dups).<BR><FONT color=#888888><BR>JVS<BR></FONT>

    <DIV>

    <DIV></DIV>

    <DIV 

    class=Wj3C7c></DIV></DIV></BLOCKQUOTE></DIV><BR></BLOCKQUOTE></BODY></HTML>