<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<META content="MSHTML 6.00.6001.18148" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=333525117-15102008><FONT face=Verdana
color=#000080 size=2>Will do. Many thanks for the fix!</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=333525117-15102008><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=333525117-15102008><FONT face=Verdana
color=#000080 size=2>Julian</FONT></SPAN></DIV><BR>
<BLOCKQUOTE
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000080 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
<FONT face=Tahoma size=2><B>From:</B> Andreas_Voss@tonbeller.com
[mailto:Andreas_Voss@tonbeller.com] <BR><B>Sent:</B> Wednesday, October 15,
2008 9:35 AM<BR><B>To:</B> jhyde@pentaho.com; Mondrian developer mailing
list<BR><B>Subject:</B> Mondrian Bug 1961163 NON EMPTY and Filter() breaking
aggregate calculations<BR></FONT><BR></DIV>
<DIV></DIV><BR><FONT face=sans-serif size=2>Hi all,</FONT> <BR><BR><FONT
face=sans-serif size=2>I have analyzed and fixed that bug. It turned out that
I'd consider this a 'heavy' bug because it has the potential to return invalid
results for many NON EMPTY queries. Because I did not manage to upload a file
to the bug tracker :-( I send the patches here attached. Description of the
fix follows. Would be glad if someone (Juilan?) could review this.</FONT>
<BR><BR><FONT face=sans-serif size=2>Thanks,</FONT> <BR><FONT face=sans-serif
size=2>Andreas</FONT> <BR><BR><BR><BR><FONT face=sans-serif size=2>--
</FONT><BR><BR><BR><FONT face=sans-serif size=2>This bug consists of three
bugs (I think). I have attached a patch that fixes these bugs (hopefully). The
patch is against a mondrian 2.x version (probably the latest JDK-1.4 version).
Together with the description below it should be easy to apply the fix against
the head version of Mondrian.</FONT> <BR><BR><FONT face=sans-serif
size=2>Description of the fixes:</FONT> <BR><BR><BR><FONT face=sans-serif
size=2>1. SqlContextConstraint#isValidContext()</FONT> <BR><BR><FONT
face=sans-serif size=2>We can only restrict the result if we are in NON EMPTY
mode. So I added</FONT> <BR><BR><FONT face=sans-serif size=2>
// we can restrict the result in NON EMPTY mode only</FONT>
<BR><FONT face=sans-serif size=2> if
(!context.isNonEmpty()) {</FONT> <BR><FONT face=sans-serif size=2>
return false;</FONT> <BR><FONT
face=sans-serif size=2> }</FONT> <BR><BR><BR><FONT
face=sans-serif size=2>2. RolapEvaluator#getExpResultCacheKey()</FONT>
<BR><BR><FONT face=sans-serif size=2>The RolapEvaluator caches results that
have been computed for an expression. The cache key consideres the members of
those dimensions that the expression depends on. In NON EMPTY mode however,
the result may depend on any dimension. So I added:</FONT> <BR><BR><FONT
face=sans-serif size=2> /**</FONT> <BR><FONT face=sans-serif
size=2> * Creates a key which uniquely identifes an
expression and its</FONT> <BR><FONT face=sans-serif size=2>
* context. The context includes members of dimensions which the</FONT>
<BR><FONT face=sans-serif size=2> * expression is dependent
upon.</FONT> <BR><FONT face=sans-serif size=2> */</FONT>
<BR><FONT face=sans-serif size=2> private Object
getExpResultCacheKey(ExpCacheDescriptor descriptor) {</FONT> <BR><FONT
face=sans-serif size=2> List key = new
ArrayList();</FONT> <BR><FONT face=sans-serif size=2>
key.add(descriptor.getExp());</FONT> <BR><FONT face=sans-serif
size=2> </FONT><BR><FONT face=sans-serif
size=2> // in NON EMPTY mode the result depends on
everything, e.g.</FONT> <BR><FONT face=sans-serif size=2>
// "NON EMPTY [Customer].[Name].members" may return different
results</FONT> <BR><FONT face=sans-serif size=2> //
for 1997-01 and 1997-02</FONT> <BR><FONT face=sans-serif size=2>
if (nonEmpty) {</FONT> <BR><FONT face=sans-serif size=2>
for (int i = 0; i <
currentMembers.length; i++)</FONT> <BR><FONT face=sans-serif size=2>
key.add(currentMembers[i]);</FONT> <BR><FONT face=sans-serif size=2>
return key;</FONT> <BR><FONT
face=sans-serif size=2> }</FONT> <BR><BR><FONT
face=sans-serif size=2> ...</FONT> <BR><FONT
face=sans-serif size=2> </FONT><BR><BR><FONT
face=sans-serif size=2>3. check all invocations of
ListCalc#evaluateList()</FONT> <BR><BR><FONT face=sans-serif size=2>This is
the tricky one. The problem is that for a NON EMPTY axis expression the
expression and all of its sub expressions are evaluated in NON EMPTY mode.
Thats wrong. Lets look at the query excerpt</FONT> <BR><BR><FONT
face=sans-serif size=2>NON EMPTY Filter([Store].[Store Name].Members,
([Measures].[AvgRevenue] < [Measures].[Store Sales]))</FONT> <BR><BR><FONT
face=sans-serif size=2>Here the Filter() is evaluated in NON EMPTY context.
Its OK to evaluate the set expression [Store].[Store Name].Members in NON
EMPTY mode too, because the empty stores are removed from the result later
anyway. But its not ok to evaluate the condition in NON EMPTY mode.</FONT>
<BR><BR><FONT face=sans-serif size=2>During evaluation the Filter() function
places one Store after the other into the context and evaluates the condition.
Here the condition contains the expression</FONT> <BR><BR><FONT
face=sans-serif size=2>Avg([Store].[Store Name].Members, [Measures].[Store
Sales])</FONT> <BR><BR><FONT face=sans-serif size=2>If here the [Store].[Store
Name].Members argument is evaluated in NON EMPTY mode and there is a
particular store in the context, then the sql optimizer will return only that
store - no need to return them all. Thats wrong in the context of AVG, we want
to compute the average over all stores, so the Filter() has to cancel the NON
EMPTY mode when evaluating the condition.</FONT> <BR><BR><FONT face=sans-serif
size=2>In general, the NON EMPTY mode has to be cancelled when certain sub
expressions are evaluated. I found it difficult to decide for which sub
expressions the NON EMPTY mode has to be cancelled. A rule of thumb may
be:</FONT> <BR><BR><FONT face=sans-serif size=2>a) when a function returns a
set, then its input set argument may be evaluated in NON EMPTY MODE (e.g.
Filter, TopCount, CrossJoin)</FONT> <BR><FONT face=sans-serif size=2>
</FONT><BR><FONT face=sans-serif size=2>b) all expressions, that do not return
a set may not evaluate their set arguments in NON EMPTY mode (e.g.
Aggregate, Avg)</FONT> <BR><BR><FONT face=sans-serif size=2>c) non-set
evaluation does not care (e.g. boolean, number, member, tuple)</FONT>
<BR><BR><FONT face=sans-serif size=2>But there are exceptions, e.g.
Tail(<Set>). Here the set argument has to be evaluated with NON EMPTY
cancelled - (I think).</FONT> <BR><BR><FONT face=sans-serif size=2>I added to
RolapEvaluator</FONT> <BR><BR><BR><FONT face=sans-serif size=2>
public Evaluator push(boolean nonEmpty) {</FONT> <BR><FONT face=sans-serif
size=2> final RolapEvaluator evaluator =
_push();</FONT> <BR><FONT face=sans-serif size=2>
evaluator.setNonEmpty(nonEmpty);</FONT> <BR><FONT face=sans-serif
size=2> return evaluator;</FONT> <BR><FONT
face=sans-serif size=2> }</FONT> <BR><BR><BR><FONT
face=sans-serif size=2>Then I checked all invocations of
ListCalc#evaluateList(). For example in AvgFunDef I changed</FONT>
<BR><BR><FONT face=sans-serif size=2>
public Object evaluate(Evaluator evaluator) {</FONT> <BR><FONT face=sans-serif
size=2> List memberList
= listCalc.evaluateList(evaluator);</FONT> <BR><FONT face=sans-serif
size=2> return
avg(evaluator.push(), memberList, calc);</FONT> <BR><FONT face=sans-serif
size=2> }</FONT> <BR><BR><FONT
face=sans-serif size=2>to</FONT> <BR><BR><FONT face=sans-serif size=2>
public Object evaluate(Evaluator evaluator)
{</FONT> <BR><FONT face=sans-serif size=2>
List memberList =
listCalc.evaluateList(evaluator.push(false));</FONT> <BR><FONT face=sans-serif
size=2> return
avg(evaluator.push(false), memberList, calc);</FONT> <BR><FONT face=sans-serif
size=2> }</FONT> <BR><BR><BR><FONT
face=sans-serif size=2>so the AVG function always evaluates its arguments with
NON EMPTY cancelled.</FONT> <BR><BR><FONT face=sans-serif
size=2>Regards,</FONT>
<P><FONT face=sans-serif size=2><B>Andreas
Voss</B><BR>Softwareentwickler<BR>Tel: +49 6251 7000 - 354<BR>Fax: +49 6251
7000 - 140</FONT><FONT face=sans-serif color=blue size=2><U><BR></U></FONT><A
href="mailto:av@tonbeller.com"><FONT face=sans-serif color=blue
size=2><U>av@tonbeller.com</U></FONT></A><FONT face=sans-serif size=2>
</FONT><BR>
<DIV style="FONT-SIZE: 10pt; FONT-FAMILY: sans-serif">
<P><B>TONBELLER AG</B><BR>Werner-von-Siemens-Str. 2<BR>D-64625 Bensheim
<BR>Germany</P><A href="http://www.tonbeller.com"
target=_blank>www.tonbeller.com</A>
<P>Register Court: District Court Darmstadt<BR>Registration: HRB
21474<BR>Managing Board: Rutger Hetzler (CEO), Sebastian Hetzler, Torsten
Mayer<BR>Chairman of the Supervisory Board: Rüdiger Brand</P>
<HR style="MARGIN: 20px 0px" noShade SIZE=1>
<P>This message is for the designated recipient only and may contain
privileged, proprietary, or otherwise private information. If you have
received it in error, please notify the sender immediately and delete the
original. Any unauthorised copying or dissemination of this message is
strictly prohibited.</P>
<P>Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte
Informationen. Wenn Sie nicht der richtige Adressat sind oder diese E-Mail
irrtümlich erhalten haben, informieren Sie bitte sofort den Absender und
vernichten Sie diese E-Mail. Das unerlaubte Kopieren sowie die Weitergabe
dieser E-Mail ist nicht gestattet.</P></DIV><BR></BLOCKQUOTE></BODY></HTML>