<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=Content-Type content="text/html; charset=us-ascii">
<META content="MSHTML 6.00.6000.16414" name=GENERATOR></HEAD>
<BODY>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>Thiyagu,</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>The details are there, but some of the larger design issues
still need to be solved. In short, we need a top-down design. Here are a few
ideas.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>To start with, I would like to see a description of the
design that a typical DBA would understand. 'Here is a simple MDX statement, and
here is the SQL with GROUPING SETS it generates'.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>A slightly more complex case. Consider the case where we
load CA, OR, and all cities in WA. What will the SQL look like for that query? I
don't think it's possible to constrain some grouping sets without constraining
others, so we would end up loading all cities in CA and OR. That might be
problem; maybe we can generate SQL to avoid that, or just not merge the batches
together.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>What happens if a query requests batches A, B, and C, and B
is already being loaded?</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>What happens if a query requests batches A, B, C,
D, and B and D are better satisfied using aggregate tables? Suppose
further that B and D can be satisfied using the same aggregate table; should we
generate GROUPING SETS syntax against aggregate tables?</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>If a query only has one GROUPING SET, will be use the
GROUPING SET syntax, or just the regular GROUP BY?</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>Concurrency control is a huge issue to be designed.
Previously, load would lock an Aggregate. That is not longer feasible, because
load will affect more than one Aggregate. The POC code continues to lock the
first </FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>Several places in the code take an array of Aggregations. I
think that they should take an array of Segments instead. This is valid because
- I think - the Segments in a given load will all belong to different
Aggregations. This array of Segments could be refactored into a class, which
could hold some of the newly-static methods.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>You have added tests in CmdRunnerTest and
FastBatchingCellReaderTest. I think the MDX-level tests should be in
BasicQueryTest - you are not testing new functionality in CmdRunner, after all.
FastBatchingCellReaderTest could be useful; I look forward to seeing it filled
out. If I were writing this feature, I would be adding tests very similar to
those in TestAggregationManager</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>Specific code reviews.</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN> </DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2></FONT></SPAN><SPAN class=692030817-08052007><FONT
face=Verdana color=#000080 size=2>AggQuerySpec:</FONT></SPAN></DIV>
<UL dir=ltr>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>Use collections (e.g. lists) for modern code. They are
definitely clearer than 2D arrays. For example, 'SqlQuery.ClauseList[]
getGroupingSets(RolapStar.Column[][] columns, SqlQuery sqlQuery)' would
certainly benefit from this.</FONT></SPAN></DIV></LI></UL>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>Aggregation:</FONT></SPAN></DIV>
<UL dir=ltr>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>The method<BR> public synchronized void
load(<BR>
RolapStar.Column[]
columns,<BR>
RolapStar.Measure[]
measures,<BR>
StarColumnPredicate[]
predicates,<BR>
RolapAggregationManager.PinSet
pinnedSegments,<BR>
List<Aggregation> grouping<BR> )<BR>is effectively
static now you have added the 'grouping' parameter. Which means that the
'synchronized' keyword is now dubious: if you're loading segments A, B, and C
it is not valid to just lock A.</FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>Check the formatting for parameters in this method and
other new methods in this file.</FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>setColumns: Making columns mutable seems to be asking for
trouble. I note that setColumns is called from FastBatchingCellReader.merge,
but what if the aggregation concerned has existing segments? This needs to be
re-thought.</FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>A lot of Aggregation methods have become static. I don't
like static methods. My hunch is that we can refactor out a new class which
represents a collection of segments being loaded at the same
time.</FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>getSegment is poorly named since it returns more than one
segment</FONT></SPAN></DIV></LI></UL>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>AggregationManager:</FONT></SPAN></DIV>
<UL dir=ltr>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>loadAggregation has a parameter constrainedColumnsBitKey,
and uses it to call Rolapstar.lookupOrCreateAggregation. That is no longer
appropriate: the aggregations in the grouping will each have a different bit
key. This is a symptom of a larger problem: the aggregations in the grouping
should be treated symmetrically, and there should be no 'primary' aggregation
to call 'load' against.</FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>'spec.generateSqlQuery(grouping)' : shouldn't the
grouping be part of the QuerySpec?</FONT></SPAN></DIV></LI></UL>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>Segment:</FONT></SPAN></DIV>
<UL dir=ltr>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>exec() and getObjects() should return Object[][] not
Object[]. Even better, return List<Object[]>.</FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>getObjects: Why does this throw QueryCanceledException?
There are many reasons why ResultSet.next() might
throw.</FONT></SPAN></DIV></LI></UL>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>DrillThroughQuerySpec:</FONT></SPAN></DIV>
<UL dir=ltr>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>should assert that groupingSet array is
empty</FONT></SPAN></DIV></LI></UL>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>CmdRunnerTest:</FONT></SPAN></DIV>
<UL dir=ltr>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>move tests out of CmdRunnerTest into BasicQueryTest or
similar. You're not really testing functionality in
CmdRunner.</FONT></SPAN></DIV></LI></UL>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>SqlQuery:</FONT></SPAN></DIV>
<UL dir=ltr>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>Rather than making ClauseList public, I suggest you
create a specific inner class GroupingSetClause. Some of the other helper
methods could be moved to this
class.</FONT></SPAN></DIV></FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>SqlQuery.groupingSets should accept StringBuilder rather
than return String.</FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>'grouping set' makes terminology confusing - not your
fault - but I suggest that you rename 'setGroupingSet(ClauseList[]
groupingSets)' to 'setGroupingSetList(List<ClauseList>
groupingSetList)</FONT></SPAN></DIV></LI>
<LI><SPAN class=692030817-08052007></SPAN><FONT face=Verdana><FONT
color=#000080><FONT size=2>G<SPAN class=692030817-08052007>rouping set names
are identifiers. They need to be quoted according to the
dialect.</SPAN></FONT></FONT></FONT></LI></UL>
<DIV dir=ltr><FONT face=Verdana><FONT color=#000080><FONT size=2><SPAN
class=692030817-08052007>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>General:</FONT></SPAN></DIV>
<DIV dir=ltr align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>
<UL>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2>use StringBuilder rather than StringBuffer for all new
code</FONT></SPAN></FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN class=692030817-08052007><FONT face=Verdana
color=#000080 size=2><SPAN class=692030817-08052007></SPAN>use foreach-style
'for' loops wherever possible</FONT></SPAN></DIV></LI>
<LI>
<DIV align=left><SPAN
class=692030817-08052007>javadoc</SPAN></FONT></SPAN></DIV></LI></UL></DIV></SPAN></FONT></FONT></FONT></DIV>
<DIV dir=ltr><FONT><FONT color=#000080><FONT size=2><SPAN
class=692030817-08052007></SPAN></FONT></FONT></FONT><SPAN
class=692030817-08052007></SPAN><FONT face=Verdana><FONT color=#000080><FONT
size=2>J<SPAN
class=692030817-08052007>ulian</SPAN></FONT></FONT></FONT><BR></DIV>
<BLOCKQUOTE
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000080 2px solid; MARGIN-RIGHT: 0px">
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left>
<HR tabIndex=-1>
</DIV>
<DIV class=OutlookMessageHeader lang=en-us dir=ltr align=left><FONT
face=Tahoma size=2><B>From:</B> mondrian-bounces@pentaho.org
[mailto:mondrian-bounces@pentaho.org] <B>On Behalf Of </B>Thiyagu
Palanisamy<BR><B>Sent:</B> Tuesday, May 08, 2007 7:30 AM<BR><B>To:</B>
mondrian@pentaho.org<BR><B>Subject:</B> Re: [Mondrian] All Members vs
Aggregate<BR></FONT><BR></DIV>
<DIV></DIV><BR><FONT face=sans-serif size=2>Hello,</FONT> <BR><BR><FONT
face=sans-serif size=2>For implementing Sql with Grouping Set we need to
change the way the following classes interact during loading of Cell request
data.</FONT> <BR><BR><FONT face=sans-serif size=2>1.
fastBatchingCellReader.loadAggregations() -> batch.loadAggregation()</FONT>
<BR><FONT face=sans-serif size=2>2. batch.loadAggregation() ->
aggregationManager.loadAggregation()</FONT> <BR><FONT face=sans-serif
size=2>3. aggregationManager.loadAggregation() -> aggregation.load()</FONT>
<BR><FONT face=sans-serif size=2>4. aggregation.load() ->
Segment.load()</FONT> <BR><BR><FONT face=sans-serif size=2>Using SQL with
Grouping Set function needs to be decided at the Batch level, once we identify
the batches that can be grouped then we need to tag them together and pass
this information down to Segment.load() to fire single SQL for multiple
batches.</FONT> <BR><BR><FONT face=sans-serif size=2>Segment.load() static
method needs information about all the segment instances which are going to be
queried together along with some additional information from Aggregation/Batch
class.</FONT> <BR><BR><FONT face=sans-serif size=2>So we need to put together
all the segment instances from multiple grouped batches and pass it to
Segment.load().</FONT> <BR><BR><FONT face=sans-serif size=2>We are thinking of
the following approach to achieve this,</FONT> <BR><BR><FONT face=sans-serif
size=2>Instead of aggregation.load() calling Segment.load() with its Segments
list, aggregation.load() need to return Segment list of each batch to the
FBCR(or class representing Grouped Batch).</FONT> <BR><FONT face=sans-serif
size=2>Then FBCR will group the segments of grouped batches and make a call to
Segment.load static method to fire single sql with grouping sets and populate
all those Segments.</FONT> <BR><BR><FONT face=sans-serif size=2>We are close
to finishing a working prototype based on the above approach.</FONT> <BR><FONT
face=sans-serif size=2>Hence would like to get your early feedback to ensure
that we are on right track.</FONT> <BR><BR><FONT face=sans-serif size=2>Code
posted at http://forums.pentaho.org/showthread.php?t=53644 is POC for logic of
grouping the Batch and extracting information from a grouping set sql to
multiple segments of multiple batches.</FONT> <BR><BR><FONT face=sans-serif
size=2>Thanks,</FONT> <BR><FONT face=sans-serif size=2>Thiyagu</FONT>
<BR><BR><FONT face=sans-serif size=2>
</FONT><BR></BLOCKQUOTE></BODY></HTML>