[Mondrian] change 10256 [was Re: change 9710: aggregating count-distinct over compound cells]
Thiyagu Palanisamy
tpalanis at thoughtworks.com
Fri Dec 7 10:39:41 EST 2007
<Rushan>(1) Grouping sets: these are already disabled when distinct count
aggregates are present. This change does not extend the usage of
grouping sets when building groups to load. Does anyone happen to
know why this is disabled? </Rushan>
>> Grouping sets are disabled if there is a distinct count because the DB
(Oracle) took almost the same time to execute a grouping set query with
distinct count as a set of queries without distinct count. Also, a query with
distinct count would require a different type of grouping, because of the
special treatment of distinct count (some DBs allow only one DC, some have x
no. restrictions).
Rushan Chen <rchen at lucidera.com>
Sent by: mondrian-bounces at pentaho.org
06/12/2007 10:20
Please respond to
Mondrian developer mailing list <mondrian at pentaho.org>
To
Mondrian developer mailing list <mondrian at pentaho.org>
cc
Subject
[Mondrian] change 10256 [was Re: change 9710: aggregating count-distinct
over compound cells]
I just checked in change list 10256, the improvement to distinct count
aggregate loading, as proposed here:
http://www.eigenbase.org/wiki/index.php/MondrianDistinctCountAggregateImprovement
A few notes besides what is outlined in the document:
(1) Grouping sets: these are already disabled when distinct count
aggregates are present. This change does not extend the usage of
grouping sets when building groups to load. Does anyone happen to
know why this is disabled? If grouping sets are enabled one day for
distinct count, change 10256 will allow that to be extended to queries
with "compound constraints", commonly expressed using the Agg function.
(2) Cache Flushing: the algorithm to derive an "overlapping" region is
not aware of the compound constraints, so cells might be flushed when
they do not need to be. For example, the region to flush may be
[Time].[1998] while the constraint limits the aggregate for a cell to
only look at values that are in {[1997].[Q1], [1997].[Q3]}. There can be
future improvement in this area.
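
To make the example in (2) concrete, here is a minimal sketch of the kind
of check a constraint-aware flush could perform. It is not the Mondrian
implementation: members are modelled as plain tuples of names, and the
function names members_overlap and needs_flush are hypothetical. Two
members are treated as overlapping when one is an ancestor of, or equal
to, the other.

```python
def members_overlap(a, b):
    """Return True if one member path is a prefix of (or equal to) the other."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


def needs_flush(flush_member, compound_constraint):
    """Return True if the flushed member overlaps any member of the constraint."""
    return any(members_overlap(flush_member, m) for m in compound_constraint)


# Hypothetical encoding of {[1997].[Q1], [1997].[Q3]} as tuples of names.
constraint = [("1997", "Q1"), ("1997", "Q3")]

print(needs_flush(("1998",), constraint))       # flushing [Time].[1998]
print(needs_flush(("1997",), constraint))       # flushing [Time].[1997]
print(needs_flush(("1997", "Q2"), constraint))  # flushing [Time].[1997].[Q2]
```

Under these assumptions only the second call reports an overlap, so a
cell restricted to 1997 Q1 and Q3 would survive a flush of 1998.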
I also added a new property to help with unit tests that expect SQL
patterns.
mondrian.test.WarnIfNoPatternForDialect
Sometimes a test that expects a SQL pattern does not have a pattern for
every dialect. Setting this property to a dialect name will print out a
warning if a test is missing a SQL pattern for that dialect. This way
users can be alerted if the SQL tests do not cover the dialects of
interest. By default it is set to NONE, which means no warning.
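
The behaviour described above can be sketched as follows. This only
illustrates the decision, it is not the code in the test suite: the
function name, the message text and the case-insensitive comparison are
all assumptions.

```python
def missing_pattern_warning(warn_dialect, test_name, pattern_dialects):
    """Return a warning string, or None when no warning applies.

    warn_dialect     -- value of mondrian.test.WarnIfNoPatternForDialect
    test_name        -- name of the test being checked (hypothetical)
    pattern_dialects -- dialects for which the test supplies a SQL pattern
    """
    wanted = warn_dialect.upper()
    if wanted == "NONE":
        return None  # default setting: never warn
    if wanted in {d.upper() for d in pattern_dialects}:
        return None  # the test covers the dialect of interest
    return "Warning: %s has no SQL pattern for dialect %s" % (test_name, wanted)


print(missing_pattern_warning("NONE", "testDistinctCount", ["derby"]))
print(missing_pattern_warning("oracle", "testDistinctCount", ["derby", "mysql"]))
```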
There's also a new ant target
ant test-list
which lets you print out what tests will be run, and their ordinals in
the running sequence. So if there's any error inside a particular suite,
it is easy to locate the offending test methods, after some "dot
counting" of the test output.
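
The "dot counting" can be automated with a small helper. This is a
sketch that assumes the JUnit text runner convention of printing "." as
each test starts and "E" or "F" after a test that errors or fails. Only
the progress characters should be passed in, not the trailing summary.
The test names below are hypothetical, since the exact output format of
"ant test-list" is not shown here.

```python
def failing_ordinals(progress):
    """Return 1-based ordinals of tests followed by an E or F marker."""
    ordinals = []
    dots = 0
    for ch in progress:
        if ch == ".":
            dots += 1  # a new test has started
        elif ch in "EF" and dots > 0:
            ordinals.append(dots)  # marker belongs to the most recent test
    return ordinals


# Hypothetical listing, in running order, as "ant test-list" might give it.
tests = ["testA", "testB", "testC", "testD", "testE", "testF"]
progress = "...E..F"

for ordinal in failing_ordinals(progress):
    print(ordinal, tests[ordinal - 1])
```

With this input the helper reports ordinals 3 and 5, that is testC and
testE in the hypothetical listing.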
Lastly, the change is tested fairly well on derby, mysql, oracle xe and
luciddb. However, there could still be misses in the SQL tests for other
DBs. If you use Mondrian primarily with a DB not listed, or with
non-default parameter settings, I would recommend running the regression
suite to be sure after syncing the latest code.
Rushan
Matt Campbell wrote:
> Rushan,
> These changes sound like something that could help us a lot, too. Do
> you have any guess about when you might be implementing the change?
>
> Thanks,
> Matt
>
> On Nov 25, 2007 6:55 PM, Julian Hyde <jhyde at pentaho.org> wrote:
>
> > Rushan Chen wrote:
> >
> > I have drafted a design doc based on the "CellContext" idea to improve
> > the performance of aggregate loading for cells with "compound"
> > constraints.
> >
> > http://www.eigenbase.org/wiki/index.php/MondrianDistinctCountAggregateImprovement
> >
> > This proposal requires pretty far-reaching code change so I did some
> > prototyping to make sure this idea would work. So far, despite the
> > sizable changes required, the basic functionalities (batch loading,
> > caching) are working with some careful extraction of code and
> > streamlining of interfaces. This hopefully will make aggregate
> > loading/caching more modular and using the new set of interfaces less
> > error prone.
> >
> > Since I have done just some prototyping, there could be design flaws
> > lurking still. I would really appreciate your input and/or your comments
> > on how to better test this improvement.
>
> Rushan,
>
> Thanks for the heads up and the detailed design document. I've read it
> through once, and can't find fault with it. It looks like it will work, and
> it's more elegant than what I have now.
>
> I will read it in more detail on my plane journey from London.
>
> Julian
>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org <mailto:Mondrian at pentaho.org>
> http://lists.pentaho.org/mailman/listinfo/mondrian
--
Rushan Chen
rchen at lucidera.com