[Mondrian] perf regression from change for large dimensions

John V. Sichi jsichi at gmail.com
Sat May 24 05:14:27 EDT 2008


If I sync back to eigenchange 11048, the query below runs in about 22 
seconds on Derby for me:

with set necj as
NonEmptyCrossJoin(NonEmptyCrossJoin([Customers].[Name].members,[Store].[Store Name].members),[Product].[Product Name].members)
select
{[Measures].[Unit Sales]} on columns,
tail(intersect(necj,necj,ALL),5) on rows
from sales;

With eigenchange 11049 (large dimensions), it runs for ages.  The 
problem is an O(n^2) interaction between the LinkedList in 
mondrian.rolap.Target and the get(i) calls in TraversalList, which 
expects ArrayList-style random-access efficiency: each get(i) on a 
LinkedList is itself O(n), so walking the list by index is quadratic.  
This is the stack where the time is burned:

     at java/util/LinkedList.get(LinkedList.java:313)[optimized]
     at mondrian/rolap/Target$1.get(Target.java:210)[inlined]
     at mondrian/rolap/Target$1.get(Target.java:231)[optimized]
     at mondrian/util/TraversalList.get(TraversalList.java:52)[inlined]
     at mondrian/util/TraversalList$1.hasNext(TraversalList.java:76)[optimized]
     at java/util/Collections$UnmodifiableCollection$1.hasNext(Collections.java:1009)[optimized]
     at mondrian/olap/fun/IntersectFunDef.buildSearchableCollection(IntersectFunDef.java:85)

(The query is contrived just as an isolated example; the same issue can 
occur in many places.)
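
To make the cost concrete, here is a minimal standalone sketch (not 
Mondrian code; the class and method names are made up for 
illustration).  It contrasts an index-driven loop with an 
iterator-driven loop over the same list.  linkedHops is only a rough 
model of the node hops an indexed loop costs on a linked list, assuming 
each get(i) walks from whichever end is nearer, as the LinkedList 
Javadoc describes.

```java
import java.util.List;

final class IndexedAccessSketch {
    // Index-driven loop: O(n) on an ArrayList, but O(n^2) on a LinkedList,
    // because every get(i) has to walk the chain to reach element i.
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Iterator-driven loop: O(n) on both ArrayList and LinkedList,
    // since the iterator remembers its position between elements.
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    // Rough model (an assumption, not a measurement): node hops needed by
    // an indexed loop over a linked list of size n when each get(i)
    // starts from the nearer end of the list.
    static long linkedHops(int n) {
        long hops = 0;
        for (int i = 0; i < n; i++) {
            hops += Math.min(i, n - 1 - i);
        }
        return hops;
    }
}
```

Both loops return the same sum; only the work differs.  Under this 
model the indexed loop costs roughly n^2/4 hops, while the iterator 
visits each node once.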

I'm not sure about the way mondrian.rolap.Target relies on LinkedList as 
a FIFO queue, but if that needs to remain as is, then TraversalList 
needs to be changed to maintain iterators over the underlying lists for 
the case of sequential access.
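
A rough sketch of what "maintain iterators over the underlying lists" 
could look like.  This is not the actual TraversalList: the class name 
LockstepRows is hypothetical, and it assumes that row i is formed from 
the i-th element of each underlying list.  One iterator per underlying 
list is created per traversal and all of them are advanced together, so 
sequential access never calls get(i).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a list-of-lists view; not Mondrian's class.
final class LockstepRows<T> implements Iterable<List<T>> {
    private final List<List<T>> columns;

    LockstepRows(List<List<T>> columns) {
        this.columns = columns;
    }

    @Override
    public Iterator<List<T>> iterator() {
        // One cursor per underlying list, created once per traversal.
        final List<Iterator<T>> cursors =
            new ArrayList<Iterator<T>>(columns.size());
        for (List<T> column : columns) {
            cursors.add(column.iterator());
        }
        return new Iterator<List<T>>() {
            @Override
            public boolean hasNext() {
                // A row exists only while every list has another element.
                if (cursors.isEmpty()) {
                    return false;
                }
                for (Iterator<T> cursor : cursors) {
                    if (!cursor.hasNext()) {
                        return false;
                    }
                }
                return true;
            }

            @Override
            public List<T> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                // Advance every cursor by one and collect the row.
                List<T> row = new ArrayList<T>(cursors.size());
                for (Iterator<T> cursor : cursors) {
                    row.add(cursor.next());
                }
                return row;
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```

With this shape a full traversal is linear in the number of rows 
whether the underlying lists are LinkedLists or ArrayLists.  Random 
access by index would still need a separate path.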

JVS




