[Mondrian] Kylin Top N with mondrian

Jian Zhong zhongjian at apache.org
Sun Nov 13 08:29:43 EST 2016


I have fixed this issue from kylin side, Kylin can accept both SQL1 and
SQL2 sinc V1.5.4.1

thank you, every one.

On Thu, Sep 29, 2016 at 1:41 PM, Jian Zhong <zhongjian at apache.org> wrote:

> Thank you, Luc
> My point is before execute topN sql, Mondrian will run SQL1 first, kylin
> will throw exception since there's no data stored.
> Kylin only support the full SQL2, do you have any suggestion about how to
> create right schema to make it works in mondrian.
>
> Or the update should be on Kylin side?
>
> Best Regards
>
> On Wed, Sep 28, 2016 at 10:31 PM, Luc Boudreau <lucboudreau at gmail.com>
> wrote:
>
>> To implement Top, we use the JDBC method java.sql.Stetement.setMaxRows().
>> The LIMIT keyword is not standard ANSI SQL, so we do not use it.
>>
>> If Kylin returns more rows than expected, it must be because setMaxRows
>> is not implemented properly in the Kylin driver.
>>
>> On Wed, Sep 28, 2016 at 5:40 AM, Jian Zhong <zhongjian at apache.org> wrote:
>>
>>> Hi all,
>>>
>>> I'm trying to use Apache Kylin with Mondrian.
>>>
>>> everything works fine before I meet TOP_N in kylin.I did some research
>>> but cannot fix that.
>>>
>>> Hope someone give me a guide.
>>>
>>>
>>> Say we have table KYLIN_SALES, and dimension SELLER_ID, measure column
>>> PRICE (sum), And I put SELLER_ID as <Attribute> in <Dimension>. the full
>>> Query SQL should be like this "SELECT SUM(PRICE),SELLER_ID from KYLIN_SALES
>>> group by SELLER_ID "
>>> by default, Mondrian will generate sql like SQL1(attached below)  get
>>> all SELLER_ID info first. it's fine before meet TOP_N in Kylin
>>>
>>> If SELLER_ID is a extra highCardinality column, and user only want to
>>> know TOP 10 SUM(PRICE)  Sellers, Kylin will not store all distinct
>>> SELLER_ID in cuboid by default.
>>> So Kylin will not support SQL1 ,
>>> But full sql SQL2 will be supported.
>>>
>>> Is there any way to avoid MONDRIAN send SQL1 , but send SQL2 directly?
>>>
>>>
>>> SQL1:"SELECT SELLER_ID FROM KYLIN_SALES GROUP BY SELLER_ID"
>>> SQL2:"SELECT SUM(PRICE),SELLER_ID from KYLIN_SALES group by SELLER_ID
>>> order by SUM(PRICE) limit 10"
>>>
>>> About what's TopN in Kylin, see here
>>> http://kylin.apache.org/blog/2016/03/19/approximate-topn-measure/
>>>
>>>
>>> Thank you!
>>> Best Regards
>>>
>>>
>>> _______________________________________________
>>> Mondrian mailing list
>>> Mondrian at pentaho.org
>>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>>
>>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20161113/7f1d8e1e/attachment.html 


More information about the Mondrian mailing list