[Mondrian] NON EMPTY crossjoin not optimized anymore

Julian Hyde julianhyde at speakeasy.net
Fri Oct 5 19:02:56 EDT 2007


Robin,

Did you see Checkin_7634.java? It has a testcase for this optimization.
This behavior is controlled by the property
MondrianProperties.CrossJoinOptimizerSize
(mondrian.olap.fun.crossjoin.optimizer.size). By default the optimization
should be enabled. The problem is that this testcase is not run as part of
the regression suite.
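
A minimal sketch of how such a size threshold could gate the optimization,
modelled on the guard in the 2.2.2 code quoted further down
(size > opSize && evaluator.isNonEmpty()). Only the property name comes from
this thread. The class and helper names are hypothetical, the default of 0 is
an assumption, and this is not Mondrian's actual implementation.

```java
import java.util.Properties;

public class CrossJoinOptimizerSetting {
    // Property name as given in this message.
    static final String KEY = "mondrian.olap.fun.crossjoin.optimizer.size";

    // Hypothetical helper: read the threshold from a Properties object.
    // The default of 0 is an assumption made for this sketch.
    static int threshold(Properties props) {
        return Integer.parseInt(props.getProperty(KEY, "0"));
    }

    // Same shape as the quoted guard: only pre-filter when the axis is
    // NON EMPTY and the raw cross product exceeds the threshold.
    static boolean shouldOptimize(long size, int opSize, boolean nonEmpty) {
        return nonEmpty && size > opSize;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty(KEY, "1000");
        int opSize = threshold(props);
        long size = 100000L * 100000L; // two sets of 100,000 members each
        System.out.println(shouldOptimize(size, opSize, true));
        System.out.println(shouldOptimize(500L, opSize, true));
    }
}
```

With a threshold of 1000, the large non-empty cross join qualifies for
pre-filtering and the 500-tuple one does not.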

See Richard Emberson's checkin comments:

http://perforce.eigenbase.org:8080/@md=d&cd=//&c=Tue@/8900?ac=10

http://perforce.eigenbase.org:8080/@md=d&cd=//&c=Tue@/8743?ac=10

They raise some issues about default members vs. members actually used in a
query.

Richard,

Is issue
http://sourceforge.net/tracker/index.php?func=detail&aid=1675585&group_id=35302&atid=414613
fixed? If so, please mark it fixed.

I think you forgot to obsolete Bug.Checkin7641UseOptimizer. It's not used
anywhere.

Should Checkin_7634 be part of the regression suite?

Did you ever test this functionality against SSAS? I think that was the
sticking point that kept this issue from being resolved.

Julian

> -----Original Message-----
> From: mondrian-bounces at pentaho.org 
> [mailto:mondrian-bounces at pentaho.org] On Behalf Of Robin Tharappel
> Sent: Friday, October 05, 2007 1:20 PM
> To: mondrian at pentaho.org
> Subject: [Mondrian] NON EMPTY crossjoin not optimized anymore
> 
> In Mondrian 2.2.2 an optimization was made to the evaluation of
> the non empty cross join. In the CrossJoinFunDef.crossJoin() method the
> following comment in the code describes the optimization:
> 
>         // Optimize nonempty(crossjoin(a,b)) ==
>         //  nonempty(crossjoin(nonempty(a),nonempty(b)))
> 
> In the current Mondrian tip this comment is still in the code, but the
> optimization does not appear to be made. Previously, 2.2.2 contained
> the following to allow this type of optimization:
> 
> 
> if (useOptimizer && size > opSize && evaluator.isNonEmpty()) {
> 	.
> 
>       list1 = nonEmptyList(evaluator, list1);
>       list2 = nonEmptyList(evaluator, list2);
> 
> 	.
> }
> 
> size = (long)list1.size() * (long)list2.size();
> 
> List result = new ArrayList((int) size);
> 
> // Code to perform cross join .
> 
> 
> Performing the nonempty check on dimension sets A and B can help
> substantially where there is a large volume of dimension members but
> the data is sparse. For example, a non empty cross join between two
> dimensions, each having 100,000 members, yields a cross product of
> 10 billion tuples. However, if a non empty check is performed on
> each dimension set before the cross product is made, the cross product
> can be reduced substantially, depending on how sparse the data is.
> Using a native.nonempty could help here; however, depending on the MDX
> query, it is not always guaranteed to be used (as described in previous
> posts).
> 
> This issue was originally discussed in tracker 1675585 (NON EMPTY
> cross join not optimized anymore), where the code change above is
> described. I would like to add this optimization back in, but wanted
> to check whether there were any known issues that led to its removal.
> It looks like it was removed in revision 45 of CrossJoinFunDef due to
> a test case failure. I think that might have happened because there
> was no check to determine if the evaluator is nonempty (the crossJoin
> method is used by both the NonEmptyCrossJoin and CrossJoin
> functions).
> 
> Thanks
> 
> Robin
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
> 
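
The pre-filtering Robin describes can be sketched in isolation. This is a toy
model under stated assumptions, not Mondrian's code: members are plain
strings, the fact data is a set of (a, b) pairs, and nonEmptyList here is a
hypothetical stand-in that keeps members appearing in at least one fact
(Mondrian's own nonEmptyList works against an evaluator instead).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NonEmptyCrossJoinSketch {
    // Keep only members that occur at position pos (0 or 1) of some fact.
    static List<String> nonEmptyList(List<String> members,
                                     Set<List<String>> facts, int pos) {
        Set<String> present = new HashSet<>();
        for (List<String> fact : facts) {
            present.add(fact.get(pos));
        }
        List<String> result = new ArrayList<>();
        for (String m : members) {
            if (present.contains(m)) {
                result.add(m);
            }
        }
        return result;
    }

    // nonempty(crossjoin(a, b)) computed as
    // nonempty(crossjoin(nonempty(a), nonempty(b))).
    static List<List<String>> nonEmptyCrossJoin(List<String> a, List<String> b,
                                                Set<List<String>> facts) {
        List<String> list1 = nonEmptyList(a, facts, 0);
        List<String> list2 = nonEmptyList(b, facts, 1);
        List<List<String>> result = new ArrayList<>();
        for (String x : list1) {
            for (String y : list2) {
                List<String> tuple = Arrays.asList(x, y);
                if (facts.contains(tuple)) { // final non-empty check per tuple
                    result.add(tuple);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("A1", "A2", "A3", "A4");
        List<String> b = Arrays.asList("B1", "B2", "B3");
        Set<List<String>> facts = new HashSet<>();
        facts.add(Arrays.asList("A1", "B2"));
        facts.add(Arrays.asList("A3", "B2"));
        facts.add(Arrays.asList("A3", "B3"));
        // Pre-filtering shrinks the candidate product from 4 * 3 to 2 * 2.
        System.out.println(nonEmptyCrossJoin(a, b, facts));
    }
}
```

The filtered lists still produce the same non-empty tuples as the full cross
product would, because a member with no facts cannot appear in any non-empty
tuple. The saving is in how many candidate tuples have to be checked.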



