[Mondrian] perf regression from change for large dimensions
John V. Sichi
jsichi at gmail.com
Sat May 24 05:14:27 EDT 2008
If I sync back to eigenchange 11048, the query below runs in about 22
seconds on Derby for me:
    with set necj as
      NonEmptyCrossJoin(
        NonEmptyCrossJoin([Customers].[Name].members, [Store].[Store Name].members),
        [Product].[Product Name].members)
    select
      {[Measures].[Unit Sales]} on columns,
      tail(intersect(necj, necj, ALL), 5) on rows
    from sales;
With eigenchange 11049 (large dimensions), it runs for ages. The
problem is the O(n^2) interaction between the LinkedList in
mondrian.rolap.Target and the get(i) calls in TraversalList, which
expects ArrayList-style random-access efficiency: each LinkedList.get(i)
is a linear walk, so an index-based loop over n elements is quadratic.
This is the stack where the time is burned:
    at java/util/LinkedList.get(LinkedList.java:313)[optimized]
    at mondrian/rolap/Target$1.get(Target.java:210)[inlined]
    at mondrian/rolap/Target$1.get(Target.java:231)[optimized]
    at mondrian/util/TraversalList.get(TraversalList.java:52)[inlined]
    at mondrian/util/TraversalList$1.hasNext(TraversalList.java:76)[optimized]
    at java/util/Collections$UnmodifiableCollection$1.hasNext(Collections.java:1009)[optimized]
    at mondrian/olap/fun/IntersectFunDef.buildSearchableCollection(IntersectFunDef.java:85)
(The query is contrived just as an isolated example; the same issue can
occur in many places.)
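To make the cost concrete outside Mondrian, here is a minimal standalone sketch of the two access patterns. The class and method names (TraversalCost, sumByIndex, sumByIterator) are made up for illustration and are not Mondrian code; the list size in main is arbitrary.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class, not part of Mondrian.
class TraversalCost {
    // Index-based traversal. On a LinkedList every get(i) walks the
    // chain from the nearer end, so the whole loop is O(n^2).
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Iterator-based traversal. Each step is O(1) for both
    // LinkedList and ArrayList, so the whole loop is O(n).
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> list = new LinkedList<Integer>();
        for (int i = 0; i < 50000; i++) {
            list.add(i);
        }
        long t0 = System.nanoTime();
        long byIndex = sumByIndex(list);
        long t1 = System.nanoTime();
        long byIterator = sumByIterator(list);
        long t2 = System.nanoTime();
        // Same result either way; only the elapsed time differs.
        System.out.println("by index:    " + byIndex + " in " + (t1 - t0) / 1000000 + " ms");
        System.out.println("by iterator: " + byIterator + " in " + (t2 - t1) / 1000000 + " ms");
    }
}
```

Both methods return the same sum; only the index-based one degrades when the list is a LinkedList, which matches the shape of the stack above (a get(i) reached from inside hasNext).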
I'm not sure about the way mondrian.rolap.Target relies on LinkedList as
a FIFO queue, but if that needs to remain as is, then TraversalList
needs to be changed to maintain iterators over the underlying lists for
the case of sequential access.
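As a rough sketch of what "maintain iterators over the underlying lists" could look like: the class below is a hypothetical stand-in (RowIterable is my name, not Mondrian's), and it assumes the traversal presents row i as the i-th elements of several underlying lists. It is not the actual TraversalList code or a proposed patch, just the idea in isolation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical stand-in for a list-of-lists traversal; not Mondrian code.
// Assumption: row i consists of element i from each underlying list.
class RowIterable<T> implements Iterable<List<T>> {
    private final List<List<T>> columns;

    RowIterable(List<List<T>> columns) {
        this.columns = columns;
    }

    public Iterator<List<T>> iterator() {
        // One iterator per underlying list, advanced in lockstep,
        // so no get(i) is ever issued against a LinkedList.
        final List<Iterator<T>> iterators = new ArrayList<Iterator<T>>();
        for (List<T> column : columns) {
            iterators.add(column.iterator());
        }
        return new Iterator<List<T>>() {
            public boolean hasNext() {
                if (iterators.isEmpty()) {
                    return false;
                }
                // Stop as soon as any underlying list is exhausted.
                for (Iterator<T> it : iterators) {
                    if (!it.hasNext()) {
                        return false;
                    }
                }
                return true;
            }

            public List<T> next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                List<T> row = new ArrayList<T>(iterators.size());
                for (Iterator<T> it : iterators) {
                    row.add(it.next());
                }
                return row;
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}
```

With this shape, sequential consumers pay O(1) per element regardless of whether the underlying lists are LinkedList or ArrayList; random access via get(i) would still need a separate path.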
JVS