[Mondrian] Lots of Dimensions & Lots of Rows

Matt Campbell mkambol at gmail.com
Wed Feb 7 13:27:50 EST 2007


We were running Mondrian on a dual core Win2003 server with 3Gig of memory.
The RDBMs for the tests with 100s of millions of fact rows was Teradata,
which gives very good performance for the sorts of queries that Mondrian
fires.


On 2/7/07, Paul Ramsey <pramsey at refractions.net> wrote:
>
> Matt, thanks for the anecdote, that is amazing (and great news)! What
> kind of hardware do you have backing your system?
>
> Julian, thanks for the detailed response, it will be interesting trying
> to fit our problem neatly into the OLAP space, but I see lots of
> advantages to doing so in terms of pre-existing functionality.  On the
> spatial side, by the time stuff hits the fact table, it is essentially
> non-spatial. All the spatial relationships (cell->region,
> cell->watershed) are expressed in the dimensions at that point.
>
> Yours,
>
> Paul
>
> Matt Campbell wrote:
> > To add some anecdotal info-- we've successfully run Mondrian with nearly
> > 700 dimensions using fact tables with 100s of millions of rows.  We've
> > noticed no performance degradation or instability from the number of
> > dimensions.
> >
> > On 2/6/07, *Julian Hyde* <julianhyde at speakeasy.net
> > <mailto:julianhyde at speakeasy.net>> wrote:
> >
> >
> >
> >      > My concern is that the number of dimensions on top of the number
> of
> >      > facts will make the whole thing unworkable, since people are
> going to
> >      > expect to query the thing through a *lot* of different paths.  As
> an
> >      > added bonus, the deployment environment will be PostgreSQL.
> >
> >     The number of dimensions is not a huge problem per se. If mondrian
> >     operating
> >     in a ROLAP mode (that is, generating a SQL query for each set of
> >     cells) then
> >     each dimension is a POTENTIAL thing to slice on but it's only the
> >     dimensions
> >     ACTUALLY sliced on which affect the performance of the SQL.
> >
> >     If you create aggregate tables -- and you probably will need to, for
> >     that
> >     data volume -- a large number of dimensions becomes more of a
> problem -
> >     because you will need a correspondingly large number of aggregate
> >     tables.
> >
> >     There may be some tricks you can use when designing your aggregate
> >     tables.
> >     If your DBMS supports special indexes for GIS (just the kind of
> >     thing that
> >     PostgreSQL does very well) you should try to design the agg tables
> >     so that
> >     those indexes get used.
> >
> >     Also, if a lot of your queries are localized ( e.g. queries for data
> >     within
> >     10 km of a given town) index your fact table so that this data set
> >     can be
> >     readily retrieved.
> >
> >     Databases -- mondrian included -- don't handle ranges as well as
> >     they handle
> >     discrete values. So, splitting spatial coordinates into the integral
> >     and
> >     fractional part (e.g. 34.56 N, 123.45 W becomes lat_whole=34
> >     lat_fraction=.56 long_whole=-123 long_fraction=.45) is a trick which
> >     might
> >     tend to create the right number and kind of 'buckets' in mondrian's
> >     workspace.
> >
> >     There has been some research to extend mondrian for GIS
> >     applications: see
> >     "An open source and web based framework for geographic and
> >     multidimensional
> >     processing" (da Silva, Times, Salgado, 2006),
> >     http://portal.acm.org/citation.cfm?id=1141292
> >
> >     Julian
> >
> >     _______________________________________________
> >     Mondrian mailing list
> >     Mondrian at pentaho.org <mailto:Mondrian at pentaho.org>
> >     http://lists.pentaho.org/mailman/listinfo/mondrian
> >
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Mondrian mailing list
> > Mondrian at pentaho.org
> > http://lists.pentaho.org/mailman/listinfo/mondrian
>
>
> --
>
>    Paul Ramsey
>    Refractions Research
>    http://www.refractions.net
>    pramsey at refractions.net
>    Phone: 250-383-3022
>    Cell: 250-885-0632
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20070207/9771f930/attachment.html 


More information about the Mondrian mailing list