To add some anecdotal info: we've successfully run Mondrian with nearly 700 dimensions, using fact tables with hundreds of millions of rows. We've noticed no performance degradation or instability attributable to the number of dimensions.
<br><br><div><span class="gmail_quote">On 2/6/07, <b class="gmail_sendername">Julian Hyde</b> &lt;<a href="mailto:julianhyde@speakeasy.net">julianhyde@speakeasy.net</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<br><br>> My concern is that the number of dimensions on top of the number of<br>> facts will make the whole thing unworkable, since people are going to<br>> expect to query the thing through a *lot* of different paths. As an
<br>> added bonus, the deployment environment will be PostgreSQL.<br><br>The number of dimensions is not a huge problem per se. If mondrian is operating<br>in a ROLAP mode (that is, generating a SQL query for each set of cells), then
<br>each dimension is a POTENTIAL thing to slice on but it's only the dimensions<br>ACTUALLY sliced on which affect the performance of the SQL.<br><br>If you create aggregate tables -- and you probably will need to, for that
<br>data volume -- a large number of dimensions becomes more of a problem,<br>because you will need a correspondingly large number of aggregate tables.<br><br>There may be some tricks you can use when designing your aggregate tables.
<br>If your DBMS supports special indexes for GIS (just the kind of thing that<br>PostgreSQL does very well) you should try to design the agg tables so that<br>those indexes get used.<br><br>Also, if a lot of your queries are localized (
e.g. queries for data within<br>10 km of a given town), index your fact table so that this data set can be<br>readily retrieved.<br><br>Databases -- mondrian included -- don't handle ranges as well as they handle<br>discrete values. So, splitting spatial coordinates into the integral and
<br>fractional parts (e.g. 34.56 N, 123.45 W becomes lat_whole=34,<br>lat_fraction=.56, long_whole=-123, long_fraction=.45) is a trick which might<br>tend to create the right number and kind of 'buckets' in mondrian's
<br>workspace.<br><br>There has been some research to extend mondrian for GIS applications: see<br>"An open source and web based framework for geographic and multidimensional<br>processing" (da Silva, Times, Salgado, 2006),
<br><a href="http://portal.acm.org/citation.cfm?id=1141292">http://portal.acm.org/citation.cfm?id=1141292</a><br><br>Julian<br><br>_______________________________________________<br>Mondrian mailing list<br><a href="mailto:Mondrian@pentaho.org">
Mondrian@pentaho.org</a><br><a href="http://lists.pentaho.org/mailman/listinfo/mondrian">http://lists.pentaho.org/mailman/listinfo/mondrian</a><br></blockquote></div><br>
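<br>For what it's worth, here is a rough sketch of the coordinate-splitting trick Julian describes above, as it might be applied while loading the fact table. It assumes coordinates arrive as signed decimal strings (negative for W and S); the function name split_coordinate and the choice of Decimal are illustrative assumptions only, not anything Mondrian provides.<br>

```python
from decimal import Decimal


def split_coordinate(value: str) -> tuple[int, Decimal]:
    """Split a signed decimal coordinate into (whole, fraction).

    Hypothetical helper: the sign is carried on the whole part and the
    fraction is kept as a non-negative magnitude, matching the example
    34.56 N, 123.45 W -> lat_whole=34, lat_fraction=.56,
    long_whole=-123, long_fraction=.45.
    """
    coordinate = Decimal(value)
    whole = int(coordinate)               # int() truncates toward zero
    fraction = abs(coordinate - whole)    # magnitude of the remainder
    return whole, fraction


# 34.56 N, 123.45 W expressed as signed decimal strings
lat_whole, lat_fraction = split_coordinate("34.56")
long_whole, long_fraction = split_coordinate("-123.45")
print(lat_whole, lat_fraction, long_whole, long_fraction)
```

One caveat with this layout: a coordinate strictly between -1 and 0 has a whole part of 0, so its sign is lost unless it is stored in a separate column.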