[Mondrian] non empty eval when calc members change context

Mon Dec 8 16:41:55 EST 2014

Hey all,

I've been investigating a particularly bad bug (MONDRIAN-2022) in which tuples on a NON EMPTY axis are suppressed incorrectly.  It can be reproduced easily:

WITH  member measures.[overrideContext] as '( measures.[unit sales], Time.[1997].Q1 )'
SELECT measures.[overrideContext] on 0,
NON EMPTY crossjoin( Time.[1998].Q1, [marital status].[marital status].members) on 1
FROM sales

This query returns no tuples, even though the measure has a non-empty value for every tuple on the rows axis.  There's no data in 1998.Q1, but that shouldn't matter since the measure changes the context to 1997.Q1.  This is a contrived example, but the problem can show up less obviously with things like YTD() calculations.

There are actually two separate problems, and both apply to Mondrian 3.x and 4.x.  First, the crossjoin optimizer logic (in CrossJoinFunDef.nonEmptyList) pulls the base measures out of any calculated members and uses those to decide which tuples are non-empty.  Unfortunately, for queries like the one above, this ignores the fact that the calculated member changes the evaluation context to [1997].Q1, overriding the [1998].Q1 context from the rows axis.  The second problem is similar, but involves the native evaluation logic.  Disabling both the crossjoin optimizer and native evaluation causes the query to return correct results.
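
To make the first problem concrete, here is a minimal toy model of it in Java.  It is only a sketch and does not use Mondrian's classes: the class name, the method names, and the fact values are all made up for illustration, and the only thing carried over from the query above is that 1998.Q1 has no data while the calculated member pins Time to 1997.Q1.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model only; none of these names exist in Mondrian. */
final class BaseMeasureCheckSketch {
    // Hypothetical fact data keyed by "time|maritalStatus"; the values are made up.
    static final Map<String, Integer> UNIT_SALES = new HashMap<>();
    static {
        UNIT_SALES.put("1997.Q1|M", 10);
        UNIT_SALES.put("1997.Q1|S", 12);
        // Deliberately no rows for 1998.Q1.
    }

    /** Base measure lookup in the given context; null means the cell is empty. */
    static Integer unitSales(String time, String maritalStatus) {
        return UNIT_SALES.get(time + "|" + maritalStatus);
    }

    /** The calculated member ignores the incoming Time context and pins 1997.Q1. */
    static Integer overrideContext(String time, String maritalStatus) {
        return unitSales("1997.Q1", maritalStatus);
    }

    /** Flawed check: tests the extracted base measure in the tuple's own context. */
    static boolean nonEmptyByBaseMeasure(String time, String maritalStatus) {
        return unitSales(time, maritalStatus) != null;
    }

    /** Reference check: evaluates the calculated member itself. */
    static boolean nonEmptyByCalc(String time, String maritalStatus) {
        return overrideContext(time, maritalStatus) != null;
    }

    public static void main(String[] args) {
        for (String ms : new String[] {"M", "S"}) {
            System.out.println(ms
                + " base-measure check: " + nonEmptyByBaseMeasure("1998.Q1", ms)
                + ", calc check: " + nonEmptyByCalc("1998.Q1", ms));
        }
    }
}
```

In this model the base-measure check rejects every 1998.Q1 tuple, while evaluating the calculated member finds a value for each of them, which is the same mismatch the query shows.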

I had originally been thinking we could fix the issue with the crossjoin optimizer by having the calculated measures themselves pushed into context and evaluated when determining whether there is data present.  That is, instead of collecting the base measures nested in the calculations and adding them to the measureSet, add the calculations themselves.  This works in the simple case, but fails in cases where evaluating the measure requires evaluating the crossjoin set itself.

As an alternative, I've been trying out an algorithm which extracts both the base measures and any hierarchy members referenced in calculations.  This list of members can then be used to determine which hierarchies should be set to the [All] member when checking for non-emptiness.  For example, with the MDX above the algorithm would extract [unit sales] as the base measure and Time.[1997].Q1 as a referenced member; then, when doing a checkData(), we'd set the Time context to Time.[All] instead of Time.[1998].Q1.
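
A rough sketch of that extraction step, again as a self-contained toy in Java rather than real Mondrian code.  The Member and Extracted types, the extract() and checkContext() methods, and the representation of a formula as a flat list of referenced members are all assumptions made for illustration; the actual work would happen against Mondrian's expression tree.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only; these types are hypothetical and are not Mondrian classes. */
final class ContextResetSketch {
    /** A member; formulaRefs is empty for a stored (non-calculated) member. */
    static final class Member {
        final String hierarchy;
        final String name;
        final List<Member> formulaRefs;

        Member(String hierarchy, String name, Member... formulaRefs) {
            this.hierarchy = hierarchy;
            this.name = name;
            this.formulaRefs = Arrays.asList(formulaRefs);
        }

        boolean isCalculated() {
            return !formulaRefs.isEmpty();
        }
    }

    /** What the walk over a calculation collects. */
    static final class Extracted {
        final Set<String> baseMeasures = new LinkedHashSet<>();
        final Set<String> hierarchiesToReset = new LinkedHashSet<>();
    }

    static Extracted extract(Member measure) {
        Extracted out = new Extracted();
        walk(measure, out, new HashSet<Member>());
        return out;
    }

    private static void walk(Member m, Extracted out, Set<Member> seen) {
        if (!seen.add(m)) {
            return; // already visited; also guards against cyclic references
        }
        if (m.isCalculated()) {
            for (Member ref : m.formulaRefs) {
                walk(ref, out, seen); // descend into nested calculations
            }
        } else if (m.hierarchy.equals("Measures")) {
            out.baseMeasures.add(m.name);
        } else {
            // A stored member of another hierarchy overrides that hierarchy's context.
            out.hierarchiesToReset.add(m.hierarchy);
        }
    }

    /** Copy of the tuple context with every overridden hierarchy set to its All member. */
    static Map<String, String> checkContext(Map<String, String> tupleContext, Extracted ex) {
        Map<String, String> ctx = new LinkedHashMap<>(tupleContext);
        for (String h : ex.hierarchiesToReset) {
            ctx.put(h, "[" + h + "].[All]");
        }
        return ctx;
    }

    public static void main(String[] args) {
        Member unitSales = new Member("Measures", "[Measures].[Unit Sales]");
        Member q1 = new Member("Time", "[Time].[1997].[Q1]");
        Member calc = new Member("Measures", "[Measures].[overrideContext]", unitSales, q1);

        Map<String, String> tuple = new LinkedHashMap<>();
        tuple.put("Time", "[Time].[1998].[Q1]");
        tuple.put("Marital Status", "[Marital Status].[M]");

        System.out.println(checkContext(tuple, extract(calc)));
    }
}
```

One thing this toy makes visible: widening Time to [All] can only make the check more permissive, so a tuple with data somewhere in Time but not in 1997.Q1 would pass the check and still evaluate to empty.  Whether that matters in practice depends on whether the surviving tuples get filtered again later.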

I'm going to continue testing with this approach but thought I'd see if people have any feedback or suggestions.

Thanks!
Matt
