<div dir="ltr">I&#39;ve actually been working on a branch on my weekends to address ranges.<div><br></div><div>It can be found here: <a href="https://github.com/pentaho/mondrian/tree/smr">https://github.com/pentaho/mondrian/tree/smr</a></div>


<div><br></div><div>These changes would help quite a bit. I&#39;m not sure about your particular query, though, since I believe it is a segment query, and I have not yet found a way to nativize ranges for those. You can grab the branch, build it, and let us know if it helps.</div>


<div><br></div><div>Luc</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Jul 21, 2014 at 3:51 PM, Julian Hyde <span dir="ltr">&lt;<a href="mailto:julianhyde@gmail.com" target="_blank">julianhyde@gmail.com</a>&gt;</span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word">That’s pretty yucky SQL. I sympathize.<div><br></div><div>Can you try refactoring the SQL into other queries that give the same answer? </div>


<div><br></div><div>1. Move <span style="color:rgb(153,51,0)">`dim_rul_band`.`band` = &#39;RUL 39-30&#39;</span> up as an AND clause, because it is common in both branches of the OR.</div><div><br>


</div><div>2. Try removing the OR, and dealing with Jun and Jul in one clause:</div><div><br></div><div><blockquote type="cite"><div bgcolor="#FFFFFF" text="#000000"><font color="#993300"><div class=""><tt>         (</tt><tt><br>


</tt><tt>            `dim_date`.`DAY_DATE`, `dim_rul_band`.`band`</tt><tt><br></tt><tt>         )</tt><tt><br></tt><tt>         in</tt><tt><br></tt><tt>         (</tt><tt><br></tt></div><tt>            (DATE &#39;2014-06-01&#39;, &#39;RUL 39-30&#39;),</tt><tt><br>


</tt></font></div></blockquote>...<div class=""><br><blockquote type="cite"><div bgcolor="#FFFFFF" text="#000000"><font color="#993300"><tt>            (DATE &#39;2014-07-08&#39;, &#39;RUL 39-30&#39;)</tt><tt><br></tt><tt>         )</tt><tt><br>


</tt></font></div></blockquote></div></div><div><br></div><div><br></div><div>3. Try <tt>dim_date.DAY_DATE BETWEEN DATE &#39;2014-06-01&#39; AND DATE &#39;2014-07-08&#39;</tt>. (Mondrian cannot currently make that optimization; see <a href="http://jira.pentaho.com/browse/MONDRIAN-1494" target="_blank">http://jira.pentaho.com/browse/MONDRIAN-1494</a>. It would still be interesting to see the performance difference.)</div>
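<div><br></div><div>A rough sketch of what suggestions 1 and 3 combined might look like, reusing the table and column names from the original query. It assumes that `dim_date`.`DAY_DATE` is a DATE column and that `dim_date` has one row per calendar day, so the range covers the same days as June plus July 1 to 8. It is untested against this schema.</div><div><br></div><div><font color="#993300"><tt>select</tt><br>
<tt>  `dim_site`.`SITE` as `c0`,</tt><br>
<tt>  count(distinct `fact_tetrapak_500k`.`dim_function_key`) as `m0`</tt><br>
<tt>from</tt><br>
<tt>  `dim_site` as `dim_site`,</tt><br>
<tt>  `fact_tetrapak_500k` as `fact_tetrapak_500k`,</tt><br>
<tt>  `dim_date` as `dim_date`,</tt><br>
<tt>  `dim_rul_band` as `dim_rul_band`</tt><br>
<tt>where `fact_tetrapak_500k`.`dim_site_key` = `dim_site`.`dim_site_key`</tt><br>
<tt> and `fact_tetrapak_500k`.`dateToProcess` = `dim_date`.`DATE_SK`</tt><br>
<tt> and `fact_tetrapak_500k`.`dim_rul_band_key` = `dim_rul_band`.`DIM_RUL_BAND_KEY`</tt><br>
<tt> -- suggestion 1: band predicate hoisted out of the OR</tt><br>
<tt> and `dim_rul_band`.`band` = &#39;RUL 39-30&#39;</tt><br>
<tt> -- suggestion 3: one contiguous range instead of month + list of days</tt><br>
<tt> and `dim_date`.`DAY_DATE` between DATE &#39;2014-06-01&#39; and DATE &#39;2014-07-08&#39;</tt><br>
<tt>group by `dim_site`.`SITE`</tt><br>
</font></div>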


<div><br></div><div>Post the performance numbers for the various queries and we will see which one we’d like to generate. (In an ideal world… making Mondrian generate those queries is another matter… I’m not promising anything…)</div>
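<div><br></div><div>Alongside the timings, MySQL&#39;s EXPLAIN output for each variant would show whether it still scans the whole table. A minimal sketch, assuming `DAY_DATE` is not already indexed; the index name `idx_dim_date_day_date` is made up for illustration, and the query is cut down to the fact and date tables:</div><div><br></div><div><font color="#993300"><tt>-- hypothetical index on the range column</tt><br>
<tt>create index `idx_dim_date_day_date` on `dim_date` (`DAY_DATE`);</tt><br>
<tt><br></tt>
<tt>-- EXPLAIN reports the access path MySQL plans to use for each table</tt><br>
<tt>explain</tt><br>
<tt>select count(distinct `fact_tetrapak_500k`.`dim_function_key`)</tt><br>
<tt>from `fact_tetrapak_500k`, `dim_date`</tt><br>
<tt>where `fact_tetrapak_500k`.`dateToProcess` = `dim_date`.`DATE_SK`</tt><br>
<tt>  and `dim_date`.`DAY_DATE` between DATE &#39;2014-06-01&#39; and DATE &#39;2014-07-08&#39;;</tt><br>
</font></div><div><br></div><div>In the output, a value of ALL in the type column for a table means a full scan of that table.</div>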


<div><br></div><div>Julian</div><div><br></div><div><br><div><div><div class="h5"><div>On Jul 21, 2014, at 6:50 AM, Ricardo Fradinho &lt;<a href="mailto:ricardo.fradinho@webdetails.pt" target="_blank">ricardo.fradinho@webdetails.pt</a>&gt; wrote:</div>


<br></div></div><blockquote type="cite"><div><div class="h5">

  
  <div bgcolor="#FFFFFF" text="#000000">

    Hi,<br>

    <br>

    I have a performance problem caused by the queries that Mondrian

    generates.<br>

    These queries always force full table scans because there&#39;s no

    indexing that can be done to help these queries.<br>

    <br>

    The problem happens when I select my time range as (2014-06-01 :

    2014-07-08) and Mondrian rewrites this as (2014-06) + (2014-07-01 :

    2014-07-08) which is translated into this nasty SQL:<br>

    <br>

    <tt>select</tt><tt><br>

    </tt><tt>  `dim_site`.`SITE` as `c0`,</tt><tt><br>

    </tt><tt>  count(distinct `fact_tetrapak_500k`.`dim_function_key`)

      as `m0`</tt><tt><br>

    </tt><tt>from</tt><tt><br>

    </tt><tt>  `dim_site` as `dim_site`,</tt><tt><br>

    </tt><tt>  `fact_tetrapak_500k` as `fact_tetrapak_500k`,</tt><tt><br>

    </tt><tt>  `dim_date` as `dim_date`,</tt><tt><br>

    </tt><tt>  `dim_rul_band` as `dim_rul_band`</tt><tt><br>

    </tt><tt>where `fact_tetrapak_500k`.`dim_site_key` =

      `dim_site`.`dim_site_key`</tt><tt><br>

    </tt><tt> and `fact_tetrapak_500k`.`dateToProcess` =

      `dim_date`.`DATE_SK`</tt><tt><br>

    </tt><tt> and `fact_tetrapak_500k`.`dim_rul_band_key` =

      `dim_rul_band`.`DIM_RUL_BAND_KEY`</tt><tt><br>

    </tt><tt> and</tt><tt><br>

    </tt><tt> (</tt><tt><br>

    </tt><font color="#993300"><tt>   (</tt><tt><br>

      </tt><tt>      `dim_date`.`MONTH_NAME` = &#39;June&#39;</tt><tt><br>

      </tt><tt>      and `dim_date`.`YEAR_NUMBER` = 2014</tt><tt><br>

      </tt><tt>      and `dim_rul_band`.`band` = &#39;RUL 39-30&#39;</tt><tt><br>

      </tt><tt>   )</tt></font><tt><br>

    </tt><tt>   or</tt><tt><br>

    </tt><tt>   (</tt><tt><br>

    </tt><font color="#993300"><tt>      (</tt><tt><br>

      </tt><tt>         (</tt><tt><br>

      </tt><tt>            `dim_date`.`DAY_DATE`, `dim_rul_band`.`band`</tt><tt><br>

      </tt><tt>         )</tt><tt><br>

      </tt><tt>         in</tt><tt><br>

      </tt><tt>         (</tt><tt><br>

      </tt><tt>            (DATE &#39;2014-07-01&#39;, &#39;RUL 39-30&#39;),</tt><tt><br>

      </tt><tt>            (DATE &#39;2014-07-02&#39;, &#39;RUL 39-30&#39;),</tt><tt><br>

      </tt><tt>            (DATE &#39;2014-07-03&#39;, &#39;RUL 39-30&#39;),</tt><tt><br>

      </tt><tt>            (DATE &#39;2014-07-04&#39;, &#39;RUL 39-30&#39;),</tt><tt><br>

      </tt><tt>            (DATE &#39;2014-07-05&#39;, &#39;RUL 39-30&#39;),</tt><tt><br>

      </tt><tt>            (DATE &#39;2014-07-06&#39;, &#39;RUL 39-30&#39;),</tt><tt><br>

      </tt><tt>            (DATE &#39;2014-07-07&#39;, &#39;RUL 39-30&#39;),</tt><tt><br>

      </tt><tt>            (DATE &#39;2014-07-08&#39;, &#39;RUL 39-30&#39;)</tt><tt><br>

      </tt><tt>         )</tt><tt><br>

      </tt><tt>      )</tt></font><tt><br>

    </tt><tt>   )</tt><tt><br>

    </tt><tt>)</tt><tt><br>

    </tt><tt>group by `dim_site`.`SITE`</tt><br>

    <br>

    There is no indexing (on MySQL) that can prevent a full table scan.<br>

    I implemented a workaround by changing my [Date] dimension to have

    only the Day level, which is fine for the time range selection we

    use. But this rules out the use of aggregation tables.<br>

    <br>

    I tried mondrian.rolap.aggregates.optimizePredicates=false

    (Boolean property that determines whether Mondrian optimizes

    predicates)<br>

    but as the docs say, &quot;[...] If false, Mondrian still optimizes

    queries that involve all members of a dimension&quot;.<br>

    <br>

    I also tried setting mondrian.rolap.EnableInMemoryRollup=false but

    that didn&#39;t help either.<br>

    <br>

    Do you know if there&#39;s any setting that can prevent this

    optimization?<br>

    <br>

    <big>Thanks,<br>

      Ricardo Fradinho.</big>

  </div></div></div>


_______________________________________________<br>Mondrian mailing list<br><a href="mailto:Mondrian@pentaho.org" target="_blank">Mondrian@pentaho.org</a><br><a href="http://lists.pentaho.org/mailman/listinfo/mondrian" target="_blank">http://lists.pentaho.org/mailman/listinfo/mondrian</a><br>


</blockquote></div><br></div></div><br>

<br></blockquote></div><br></div>