[Mondrian] Issue with Cross Join 10 dimensions

Venkatesh U venkatesh20 at gmail.com
Wed Oct 20 12:41:50 EDT 2010


Thanks Kurt and Julian,
   Julian as you pointed out rightly, end-user does not want to see such a
huge data. The idea here is prepopulate the cube for all combination of
dimensions, so that any MDX query is answered from the cache hit rather than
hitting the DB. I can make this prepopulation a nightly job. I ve almost 10
dimensions, on an average each dimension has 15 rows and 750,000 rows in a
fact table. I Use 128 GB RAM and mondrian is connected to Oracle  database.

   I am attempting to cache all the data in RAM, please advise if this
approach is valid or not ?? and also i would like to understand the way non
empty cross join works. from what i observed from the log, data structures
for storing the results are prepared before issuing the SQL.Does the below
step prepare/create the data structure for the combination 6 dimensions
mentioned below?

*DEBUG http-8080-4 mondrian.rolap.FastBatchingCellReader -
FastBatchingCellReader:
bitkey=0x0000000000000000000000000000000000000000000000000000011011110000
  Service (4): "SERVICE_DIMENSION"."SERVICE_NAME"
  Product (5): "PRODUCT_DIMENSION"."PRODUCT_NAME"
  Year (6): "MONTH_DIMENSION"."CALENDER_YEAR"
  Quarter (7): "MONTH_DIMENSION"."CALENDER_QUARTER"
  Month (Key) (9): "MONTH_DIMENSION"."CALENDER_MONTH_NUMBER_IN_YEAR"
  TOD (10): "TIME_OF_DAY_DIMENSION"."TOD_NAME" *

  There is a sharp increase in RAM usage, when (nonempty)crossjoin is
attempted for more than 6 dimensions ( I tried up to ten) and no SQL queries
are issued at this stage yet. What is mondrian doing at this moment ?? I
have also enable native execution for non empty cross join , still I face
this issue.

Does mondrian attempt to perform a cartesian join in memory if a plain cross
join is attempted? If i use NonEmptyCrossJoin how are the missing
combinations handled? if i request for a combination of dimensions which is
not available in the cache ( and DB as well ), does it fire a SQL to check
the availability in the DB or based on the NonEmptyCrossJoin attempted
earlier, it returns an empty value?
How does NonEmptyCrossJoin and CrossJoin differ? is data structure
preparation is required in both the cases?

I am not sure if I am making sense, please respond incase i am not clear.
Thanks a lot.

Thanks,
Venki


On Wed, Oct 20, 2010 at 9:24 PM, Julian Hyde <jhyde at pentaho.com> wrote:

>  It isn't really surprising that big crossjoins take longer: they produce
> more tuples. The number of tuples is exponential in the number of sets being
> combined.
>
> I'm guessing that your query produces many thousands, possibly millions of
> rows. I doubt that any end-user wants to see that many rows.
>
> Learn MDX, and find a way to express the query that the end-user actually
> wants to see.
>
> Julian
>
>  ------------------------------
> *From:* mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org]
> *On Behalf Of *Venkatesh U
> *Sent:* Wednesday, October 20, 2010 4:07 AM
> *To:* mondrian at pentaho.org
> *Subject:* [Mondrian] Issue with Cross Join 10 dimensions
>
> Hi ,
>
>   What I observed generally is it takes very longer when i try to do a
> crossjoin with more than 7 dimensions. I am trying to run the below query it
> runs for a very longer time
>
> select {
>
> CrossJoin([Service].Members,CrossJoin([Product].Members,CrossJoin([Month].Members,CrossJoin([TOD].Members,CrossJoin([IO].Members,[ForeignNetwork].Members)))))
> } on rows,
> {[Measures].Members}
> on columns
> from [ServiceFactsByMonthWC]
>
> Below is the tail of log file where mondrian is spending lot of time. What
> is it doing? Is there any way to optimize cross joins? . I am trying to use
> cross join as an option to prepopulate the Cube with a single MDX query, so
> that the subsequent executions will be faster, is there any better way to
> prepopulate the Cube, please let me know if there are any,
>
>
> DEBUG http-8080-4 mondrian.rolap.FastBatchingCellReader -
> FastBatchingCellReader:
> bitkey=0x0000000000000000000000000000000000000000000000000001101011110000
>   Service (4): "SERVICE_DIMENSION"."SERVICE_NAME"
>   Product (5): "PRODUCT_DIMENSION"."PRODUCT_NAME"
>   Year (6): "MONTH_DIMENSION"."CALENDER_YEAR"
>   Quarter (7): "MONTH_DIMENSION"."CALENDER_QUARTER"
>   Month (Key) (9): "MONTH_DIMENSION"."CALENDER_MONTH_NUMBER_IN_YEAR"
>   IO (11): "INCOMING_OUTGOING_DIMENSION"."DIRECTION"
>   ForeignNetworkName (12): "FOREIGN_NETWORK_DIMENSION"."FND_NAME"
>
> DEBUG http-8080-4 mondrian.rolap.FastBatchingCellReader -
> FastBatchingCellReader:
> bitkey=0x0000000000000000000000000000000000000000000000000000011011110000
>   Service (4): "SERVICE_DIMENSION"."SERVICE_NAME"
>   Product (5): "PRODUCT_DIMENSION"."PRODUCT_NAME"
>   Year (6): "MONTH_DIMENSION"."CALENDER_YEAR"
>   Quarter (7): "MONTH_DIMENSION"."CALENDER_QUARTER"
>   Month (Key) (9): "MONTH_DIMENSION"."CALENDER_MONTH_NUMBER_IN_YEAR"
>   TOD (10): "TIME_OF_DAY_DIMENSION"."TOD_NAME"
>
> DEBUG http-8080-4 mondrian.rolap.FastBatchingCellReader -
> FastBatchingCellReader:
> bitkey=0x0000000000000000000000000000000000000000000000000001011011110000
>   Service (4): "SERVICE_DIMENSION"."SERVICE_NAME"
>   Product (5): "PRODUCT_DIMENSION"."PRODUCT_NAME"
>   Year (6): "MONTH_DIMENSION"."CALENDER_YEAR"
>   Quarter (7): "MONTH_DIMENSION"."CALENDER_QUARTER"
>   Month (Key) (9): "MONTH_DIMENSION"."CALENDER_MONTH_NUMBER_IN_YEAR"
>   TOD (10): "TIME_OF_DAY_DIMENSION"."TOD_NAME"
>   ForeignNetworkName (12): "FOREIGN_NETWORK_DIMENSION"."FND_NAME"
>
> DEBUG http-8080-4 mondrian.rolap.FastBatchingCellReader -
> FastBatchingCellReader:
> bitkey=0x0000000000000000000000000000000000000000000000000000111011110000
>   Service (4): "SERVICE_DIMENSION"."SERVICE_NAME"
>   Product (5): "PRODUCT_DIMENSION"."PRODUCT_NAME"
>   Year (6): "MONTH_DIMENSION"."CALENDER_YEAR"
>   Quarter (7): "MONTH_DIMENSION"."CALENDER_QUARTER"
>   Month (Key) (9): "MONTH_DIMENSION"."CALENDER_MONTH_NUMBER_IN_YEAR"
>   TOD (10): "TIME_OF_DAY_DIMENSION"."TOD_NAME"
>   IO (11): "INCOMING_OUTGOING_DIMENSION"."DIRECTION"
>
> DEBUG http-8080-4 mondrian.rolap.FastBatchingCellReader -
> FastBatchingCellReader:
> bitkey=0x0000000000000000000000000000000000000000000000000001111011110000
>   Service (4): "SERVICE_DIMENSION"."SERVICE_NAME"
>   Product (5): "PRODUCT_DIMENSION"."PRODUCT_NAME"
>   Year (6): "MONTH_DIMENSION"."CALENDER_YEAR"
>   Quarter (7): "MONTH_DIMENSION"."CALENDER_QUARTER"
>   Month (Key) (9): "MONTH_DIMENSION"."CALENDER_MONTH_NUMBER_IN_YEAR"
>   TOD (10): "TIME_OF_DAY_DIMENSION"."TOD_NAME"
>   IO (11): "INCOMING_OUTGOING_DIMENSION"."DIRECTION"
>   ForeignNetworkName (12): "FOREIGN_NETWORK_DIMENSION"."FND_NAME"
>
>
>
>
>
>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20101020/e586fc2d/attachment.html 


More information about the Mondrian mailing list