[Mondrian] High Cardinality for Mondrian

Luis F. Canals luis.canals at stratebi.com
Fri Feb 8 13:09:03 EST 2008


Dear Julian Hyde,

After the hard task of porting all the changes we made to version
2.4.2.9831 of Mondrian (which added the ability to manage high
cardinality dimensions) to the head version in Perforce, we can now send
you the list of differences, to be applied as a patch to the Mondrian
version currently in Perforce.

Since we have no commit access to Perforce, we would be very happy if
you applied these changes and let us know about any problems that
prevent the patch from being applied.

All tests pass (using MySQL as the database, on Windows and Linux, with
Java 5 and 6).

Some properties have been added to "mondrian.properties" to control the
behaviour of high cardinality handling and of multithreading for queries:
    mondrian.result.highCardChunkSize indicates the number of elements
fetched at a time when a dimension is marked as "highCardinality"
    mondrian.rolap.MaximumParallelThreads indicates the maximum number
of threads used to execute a query (since queries that do not depend on
each other are now run in parallel)
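
As a rough illustration of the two settings above, here is a minimal
sketch that parses them with plain java.util.Properties. The property
names come from this message; the values (100 and 4), the fallback
defaults, and the class and method names are only assumptions made for
the example, and this is not how Mondrian itself reads its
configuration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class HighCardSettings {
    // Property names as given in the message; everything else here is hypothetical.
    static final String CHUNK_SIZE = "mondrian.result.highCardChunkSize";
    static final String MAX_THREADS = "mondrian.rolap.MaximumParallelThreads";

    /** Parses text in the usual key=value properties format. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the integer value stored under key, or fallback if the key is absent. */
    static int intProperty(Properties props, String key, int fallback) {
        String raw = props.getProperty(key);
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        // Illustrative values only; suitable numbers depend on the deployment.
        String example =
            "mondrian.result.highCardChunkSize=100\n"
            + "mondrian.rolap.MaximumParallelThreads=4\n";
        Properties props = parse(example);
        System.out.println(intProperty(props, CHUNK_SIZE, 1));
        System.out.println(intProperty(props, MAX_THREADS, 1));
    }
}
```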

In FoodMart.xml, we have made one more change: "Promotions" on cube
"Sales Ragged" is now marked as high cardinality, so that the system is
tested in this case.

There are some other points that should be taken into account now that
Mondrian is going to be able to manage unlimited dimensions:
    - avoid calling ".size()" on the list of elements of a potentially
high cardinality dimension;
    - avoid copying elements by iterating over the complete list of a
potentially high cardinality dimension
        (for example, things like
            "for (Member m : list) {
                ...
                anotherList.add(m);
                ...
            }");
    - instead, use the FilteredIterableList idea;
    - don't go back to an early element after you have already read a
later one (i.e., doing "list.get(x)" after "list.get(y)" with y >> x)
on a list of elements of a potentially high cardinality dimension;
    - some functions need all the elements in memory (for example
"order by"); these functions will not run on high cardinality
dimensions, and an exception will be thrown;
    - if you don't need a high cardinality dimension, simply don't set
the attribute "highCardinality" to true in the schema (FoodMart.xml).
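
To make the points above more concrete, here is a minimal sketch of the
general idea: filter lazily while iterating forward once, instead of
copying the whole list or asking for its size. It is not the
FilteredIterableList class from the patch; the class LazyFilterSketch
and the methods filtered and firstN are hypothetical names, plain
strings stand in for members, and the code assumes Java 8 or later for
the lambda syntax.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class LazyFilterSketch {
    /** Wraps a source and yields only matching elements, one at a time, without copying it. */
    static <T> Iterable<T> filtered(Iterable<T> source, Predicate<? super T> keep) {
        return () -> new Iterator<T>() {
            private final Iterator<T> it = source.iterator();
            private T next;
            private boolean ready;

            @Override public boolean hasNext() {
                // Advance only as far as needed to find the next matching element.
                while (!ready && it.hasNext()) {
                    T candidate = it.next();
                    if (keep.test(candidate)) {
                        next = candidate;
                        ready = true;
                    }
                }
                return ready;
            }

            @Override public T next() {
                if (!hasNext()) throw new NoSuchElementException();
                ready = false;
                T result = next;
                next = null;
                return result;
            }
        };
    }

    /** Takes at most limit elements in one forward pass; never calls size() or get(i) on the source. */
    static <T> List<T> firstN(Iterable<T> source, int limit) {
        List<T> out = new ArrayList<>();
        if (limit <= 0) return out;
        for (T item : source) {
            out.add(item);
            if (out.size() >= limit) break; // size() of the small result list only
        }
        return out;
    }

    public static void main(String[] args) {
        // A tiny stand-in for what could be a very large member list.
        List<String> members = Arrays.asList("Promo A", "Other", "Promo B", "Promo C");
        Iterable<String> promos = filtered(members, m -> m.startsWith("Promo"));
        System.out.println(firstN(promos, 2)); // prints [Promo A, Promo B]
    }
}
```

Because the wrapper pulls elements only on demand, the caller above
stops reading the source as soon as it has two matches, which is the
access pattern the list recommends for high cardinality dimensions.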

That's all!

Since we think this is quite a powerful improvement (very useful for our
clients), we would like these changes to be included in the next release
of Mondrian. Would that be possible?

Best regards.

- Jorge/Javier/Luis
-------------- next part --------------
A non-text attachment was scrubbed...
Name: new-classes.tgz
Type: application/x-compressed-tar
Size: 17187 bytes
Desc: not available
Url : http://lists.pentaho.org/pipermail/mondrian/attachments/20080208/cedc7961/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: highcard.diff
Type: text/x-patch
Size: 609490 bytes
Desc: not available
Url : http://lists.pentaho.org/pipermail/mondrian/attachments/20080208/cedc7961/attachment-0001.bin 

