[Mondrian] Kylin Top N with mondrian

Jian Zhong zhongjian at apache.org
Wed Sep 28 05:40:26 EDT 2016


Hi all,

I'm trying to use Apache Kylin with Mondrian.

Everything worked fine until I ran into TOP_N in Kylin. I did some research but
could not fix the problem.

I hope someone can give me some guidance.


Say we have a table KYLIN_SALES with a dimension SELLER_ID and a measure column
PRICE (sum), and I put SELLER_ID as an <Attribute> in a <Dimension>. The full
SQL query should look like this: "SELECT SUM(PRICE),SELLER_ID from KYLIN_SALES
group by SELLER_ID".
By default, Mondrian generates SQL like SQL1 (given below) to fetch all
SELLER_ID values first. That is fine until TOP_N in Kylin comes into play.

If SELLER_ID is a very high-cardinality column and the user only wants to know
the top 10 sellers by SUM(PRICE), Kylin does not store all distinct SELLER_ID
values in the cuboid by default.
So Kylin will not support SQL1,
but the full query SQL2 will be supported.

Is there any way to keep Mondrian from sending SQL1 and have it send SQL2 directly?


SQL1: "SELECT SELLER_ID FROM KYLIN_SALES GROUP BY SELLER_ID"
SQL2: "SELECT SUM(PRICE),SELLER_ID from KYLIN_SALES group by SELLER_ID order
by SUM(PRICE) desc limit 10"
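
To make the difference between the two statements concrete, here is a minimal sketch that runs both against a toy KYLIN_SALES table. It uses Python's built-in sqlite3 module purely as a stand-in: it does not reproduce Kylin's TopN pre-aggregation or anything Mondrian does, and the sample rows are made up for illustration. The sketch orders by SUM(PRICE) DESC so that the highest totals come first.

```python
import sqlite3

# SQLite is only a stand-in here; it shows the shape of the two queries,
# not Kylin's TopN behaviour. The sample data below is invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE KYLIN_SALES (SELLER_ID INTEGER, PRICE REAL)")

# 20 sellers with 3 sales each; seller s sells at 10*s, 10*s + 1, 10*s + 2.
rows = [(s, float(10 * s + n)) for s in range(1, 21) for n in range(3)]
conn.executemany("INSERT INTO KYLIN_SALES VALUES (?, ?)", rows)

# SQL1: enumerates every distinct SELLER_ID (the query Kylin cannot serve
# when only a TopN measure is stored for this column).
SQL1 = "SELECT SELLER_ID FROM KYLIN_SALES GROUP BY SELLER_ID"

# SQL2: aggregate, sort and limit in a single statement.
SQL2 = (
    "SELECT SUM(PRICE), SELLER_ID FROM KYLIN_SALES "
    "GROUP BY SELLER_ID ORDER BY SUM(PRICE) DESC LIMIT 10"
)

all_sellers = conn.execute(SQL1).fetchall()
top_sellers = conn.execute(SQL2).fetchall()

print(len(all_sellers), "rows from SQL1")
print(len(top_sellers), "rows from SQL2")
```

SQL1 returns one row per distinct seller, so its result grows with the cardinality of SELLER_ID, while SQL2 never returns more than 10 rows.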

For background on TopN in Kylin, see:
http://kylin.apache.org/blog/2016/03/19/approximate-topn-measure/


Thank you!
Best Regards