[Mondrian] Multi-threading SQL execution

Julian Hyde julianhyde at speakeasy.net
Mon Feb 12 20:38:24 EST 2007


Option #1 (GROUP BY ROLLUP/CUBE) is viable and useful. If there are
differences between DBMS vendors in how they implement this support, let's
stick to the letter of the SQL:2003 standard.
 
To implement option #1, someone will have to get their hands dirty
understanding how cell requests are turned into SQL queries. The hardest
part is to look at a collection of cell requests and figure out whether they
can be satisfied using the same query.
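A minimal sketch of that check, assuming each cell request has already been reduced to an ordered list of grouping columns. The class, method, table and column names here are hypothetical and are not mondrian's actual code. A batch can share one ROLLUP query when every column list is a prefix of the longest one.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, for illustration only; not part of mondrian. */
final class RollupPlanner {
    /**
     * Returns the longest column list if every other list is a prefix of it
     * (so a single GROUP BY ROLLUP query covers them all), or null otherwise.
     * Assumes the columns of each request are in one canonical order.
     */
    static List<String> rollupChain(List<List<String>> groupings) {
        List<String> longest = new ArrayList<>();
        for (List<String> g : groupings) {
            if (g.size() > longest.size()) {
                longest = g;
            }
        }
        for (List<String> g : groupings) {
            if (!longest.subList(0, g.size()).equals(g)) {
                return null; // not a prefix, so not a rollup of the longest
            }
        }
        return longest;
    }

    /** Builds the shared query text for a non-empty column list. */
    static String rollupSql(String measure, String table, List<String> columns) {
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("need at least one column");
        }
        String cols = String.join(", ", columns);
        return "SELECT " + cols + ", SUM(" + measure + ") FROM " + table
            + " GROUP BY ROLLUP(" + cols + ")";
    }
}
```

Requests whose column lists do not form such a chain would still need separate queries.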
 
Is there a chance that a ROLLUP query will compute exponentially more
results than individual GROUP BY queries? If so, we will need to do a
cost:benefit analysis before issuing a ROLLUP query.
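One fact that bears on the question: GROUP BY ROLLUP over n columns produces n + 1 grouping sets, whereas CUBE produces 2^n, so CUBE is the one that grows exponentially. The number of rows still depends on the data. A rough sketch of such a check follows. The class name and the waste threshold are assumptions for illustration, not anything mondrian does today.

```java
/** Hypothetical cost check, for illustration only; not part of mondrian. */
final class GroupingCost {
    /** ROLLUP(c1..cn) yields the n + 1 prefixes of the column list. */
    static long rollupGroupingSets(int n) {
        return n + 1L;
    }

    /** CUBE(c1..cn) yields every subset of the columns: 2^n grouping sets. */
    static long cubeGroupingSets(int n) {
        if (n < 0 || n > 62) {
            throw new IllegalArgumentException("n out of range: " + n);
        }
        return 1L << n;
    }

    /**
     * True if the share of computed-but-unrequested grouping sets in a ROLLUP
     * over n columns stays within maxWasteRatio (an assumed tuning knob).
     */
    static boolean rollupWorthwhile(int n, int requested, double maxWasteRatio) {
        long computed = rollupGroupingSets(n);
        long wasted = computed - requested;
        return (double) wasted / computed <= maxWasteRatio;
    }
}
```

Counting grouping sets is only a proxy. A real analysis would also need row-count estimates for each grouping.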
 
Option #2 (parallel query execution) is also viable, and is useful even if
option #1 is implemented, because certain queries, especially those on
virtual cubes, may generate queries which are not rollups of one another.
 
Implementing option #2 requires a modest amount of coding, mainly
introducing a multi-threaded request queue, and a significant amount of
testing for threading issues.
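A minimal sketch of such a queue using java.util.concurrent, assuming each SQL statement can be wrapped as an independent Callable. The class and method names are hypothetical, and the sketch leaves out connection handling and cache synchronization, which is where the threading issues would be.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical runner, for illustration only; not part of mondrian. */
final class ParallelSqlRunner {
    /** Runs the tasks on a fixed pool; results come back in submission order. */
    static <T> List<T> runAll(List<Callable<T>> tasks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<T> results = new ArrayList<>();
            // invokeAll blocks until every task has completed or failed.
            for (Future<T> future : pool.invokeAll(tasks)) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

In practice a long-lived pool would probably be shared across requests rather than created per batch. Creating it per call just keeps the sketch short.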
 
A third option is to support rollup within the cache. If mondrian notices
that there is a request for ([Time].[1997].[Q1], ... [Q4], [Product].[Beer])
and also a request for ([Time].[1997], [Product].[Beer]) then it should
execute the first request, then answer the second by rolling up the results
of the first.
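A sketch of that rollup, assuming the cached quarter cells are held in a map keyed by member name and that the measure is additive (SUM or COUNT, but not, say, a distinct count). The class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical cache rollup, for illustration only; not part of mondrian. */
final class CacheRollup {
    /**
     * Answers a parent cell (e.g. [1997]) by summing its cached children
     * (e.g. [Q1]..[Q4]). Returns null if any child is missing from the
     * cache, in which case the caller would fall back to SQL.
     */
    static Double rollUp(Map<String, Double> cache, List<String> childKeys) {
        double total = 0.0;
        for (String key : childKeys) {
            Double value = cache.get(key);
            if (value == null) {
                return null; // incomplete: cannot roll up safely
            }
            total += value;
        }
        return total;
    }
}
```

Returning null when a child is absent keeps a partial sum from being passed off as the year total.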
 
ALL of these options will benefit mondrian and each offers something that
the other two do not, so it's difficult to choose between them. My instinct
is that option #2 is slightly less work than option #1, but has less
benefit. Take your pick!
 
Julian