[Mondrian] Lots of Dimensions & Lots of Rows

Matt Campbell mkambol at gmail.com
Wed Feb 7 10:10:07 EST 2007


To add some anecdotal info-- we've successfully run Mondrian with nearly 700
dimensions using fact tables with 100s of millions of rows.  We've noticed
no performance degradation or instability from the number of dimensions.

On 2/6/07, Julian Hyde <julianhyde at speakeasy.net> wrote:
>
>
>
> > My concern is that the number of dimensions on top of the number of
> > facts will make the whole thing unworkable, since people are going to
> > expect to query the thing through a *lot* of different paths.  As an
> > added bonus, the deployment environment will be PostgreSQL.
>
> The number of dimensions is not a huge problem per se. If mondrian
> operating
> in a ROLAP mode (that is, generating a SQL query for each set of cells)
> then
> each dimension is a POTENTIAL thing to slice on but it's only the
> dimensions
> ACTUALLY sliced on which affect the performance of the SQL.
>
> If you create aggregate tables -- and you probably will need to, for that
> data volume -- a large number of dimensions becomes more of a problem -
> because you will need a correspondingly large number of aggregate tables.
>
> There may be some tricks you can use when designing your aggregate tables.
> If your DBMS supports special indexes for GIS (just the kind of thing that
> PostgreSQL does very well) you should try to design the agg tables so that
> those indexes get used.
>
> Also, if a lot of your queries are localized (e.g. queries for data within
> 10 km of a given town) index your fact table so that this data set can be
> readily retrieved.
>
> Databases -- mondrian included -- don't handle ranges as well as they
> handle
> discrete values. So, splitting spatial coordinates into the integral and
> fractional part (e.g. 34.56 N, 123.45 W becomes lat_whole=34
> lat_fraction=.56 long_whole=-123 long_fraction=.45) is a trick which might
> tend to create the right number and kind of 'buckets' in mondrian's
> workspace.
>
> There has been some research to extend mondrian for GIS applications: see
> "An open source and web based framework for geographic and
> multidimensional
> processing" (da Silva, Times, Salgado, 2006),
> http://portal.acm.org/citation.cfm?id=1141292
>
> Julian
>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20070207/93316fcc/attachment.html 


More information about the Mondrian mailing list