[Mondrian] Infobright Distinct Count

Julian Hyde jhyde at pentaho.com
Tue Feb 8 12:45:52 EST 2011


The method that controls this is Dialect.allowsCompoundCountDistinct(). 

This method returns false by default, MySqlDialect overrides to return true,
and InfobrightDialect overrides that to return false.

I added InfobrightDialect in change 12402, fixing bug
http://jira.pentaho.com/browse/MONDRIAN-512. I must have made it return
false for a good reason. Looking at
DialectTest.testAllowsCompoundCountDistinct, it appears that older versions
of Infobright returned the following error:

  "The query includes syntax that is not supported by the Infobright
Optimizer. Either restructure the query with supported syntax, or enable the
MySQL Query Path in the brighthouse.ini file to execute the query with
reduced performance."

My guess is that you are running a later version of Infobright. What version
are you running?

If so, the fix would be for the method to return a different value depending
on the version of infobright. Run the full mondrian test suite through
infobright, and let me know the results. It will be interesting if other
DialectTest methods fail... each of those failures may indicate a feature of
infobright that we can take advantage of.

I have logged http://jira.pentaho.com/browse/MONDRIAN-889 to track this.

Julian

> -----Original Message-----
> From: mondrian-bounces at pentaho.org 
> [mailto:mondrian-bounces at pentaho.org] On Behalf Of Tom Barber
> Sent: Tuesday, February 08, 2011 4:10 AM
> To: mondrian at pentaho.org
> Subject: [Mondrian] Infobright Distinct Count
> 
> Hey Guys,
> 
> I'm working with Pedro on a project involving 
> Infobright(3.4.2). Pedro noticed that Mondrian is executing 
> the following:
> 
> select count(*) as `c0` from (select distinct 
> `dim_marketing_sources`.`sublevel2` as `c0`, 
> `dim_marketing_sources`.`sublevel1` as `c1`, 
> `dim_marketing_sources`.`mkt_channel` as `c2` from 
> `dim_marketing_sources` as `dim_marketing_sources`) as `init`;
> 
> which takes 18 seconds to execute where as,
> 
> select count(distinct `dim_marketing_sources`.`sublevel2` , 
> `dim_marketing_sources`.`sublevel1`, 
> `dim_marketing_sources`.`mkt_channel`) as `c` from 
> `dim_marketing_sources`;
> 
> takes less than 1.
> 
> Is there any technical reason why the top count is preferred 
> and if not, what needs to occur for the dialect to change? :)
> 
> Thanks
> 
> Tom
> 




More information about the Mondrian mailing list