[Mondrian] Except function performance

Hilario Fernandes hilario.fernandes at cortex-intelligence.com
Wed Jun 17 15:30:21 EDT 2015


I attached some stacktraces from the queries I described interrupted after
a few minutes.

On Wed, Jun 17, 2015 at 4:08 PM, Hilario Fernandes <
hilario.fernandes at cortex-intelligence.com> wrote:

> So, i think this reproduces the problem. The query is as follows:
>
> with
> member [Measures].[Total] as
> aggregate(
>     CrossJoin(
>         [product].currentmember,
>         Crossjoin(
>             [gender].[all gender],
>             Crossjoin(
>                 [filter],
>                 [Store Type].[all store types]
>             )
>         )
>     ),
>     measures.[unit sales]
> )
> set [filter] as
> cache(except(
>     [Store].[Store Name].Members,
>     {
>         [Store].[Canada].[BC].[Vancouver].[Store 19],
>         [Store].[Canada].[BC].[Victoria].[Store 20],
>         [Store].[Mexico].[DF].[Mexico City].[Store 9],
>         [Store].[Mexico].[DF].[San Andres].[Store 21],
>         [Store].[Mexico].[Guerrero].[Acapulco].[Store 1]
>     }
> ))
> select
> NON EMPTY Crossjoin(
>     [product].[product name].members,
>     Crossjoin(
>         [Gender].[Gender].members,
>         Crossjoin(
>             [filter],
>             [Store Type].[Store Type].members
>         )
>     )
> ) on 1,
> { measures.total } on 0
> from [Sales]
>
>
> This query fails to return any result and we did wait for quite a few
> minutes.
>
> Then when we changed the [filter] set to:
>
> set [filter] as
> cache({
>     [Store].[Mexico].[Jalisco].[Guadalajara].[Store 5],
>     [Store].[Mexico].[Veracruz].[Orizaba].[Store 10],
>     [Store].[Mexico].[Yucatan].[Merida].[Store 8],
>     [Store].[Mexico].[Zacatecas].[Camacho].[Store 4],
>     [Store].[Mexico].[Zacatecas].[Hidalgo].[Store 12],
>     [Store].[Mexico].[Zacatecas].[Hidalgo].[Store 18],
>     [Store].[USA].[CA].[Alameda].[HQ],
>     [Store].[USA].[CA].[Beverly Hills].[Store 6],
>     [Store].[USA].[CA].[Los Angeles].[Store 7],
>     [Store].[USA].[CA].[San Diego].[Store 24],
>     [Store].[USA].[CA].[San Francisco].[Store 14],
>     [Store].[USA].[OR].[Portland].[Store 11],
>     [Store].[USA].[OR].[Salem].[Store 13],
>     [Store].[USA].[WA].[Bellingham].[Store 2],
>     [Store].[USA].[WA].[Bremerton].[Store 3],
>     [Store].[USA].[WA].[Seattle].[Store 15],
>     [Store].[USA].[WA].[Spokane].[Store 16],
>     [Store].[USA].[WA].[Tacoma].[Store 17],
>     [Store].[USA].[WA].[Walla Walla].[Store 22],
>     [Store].[USA].[WA].[Yakima].[Store 23]
> })
>
> this still didn't return anything. So we went with:
>
> set [filter] as
> [Store].[store name].members
>
> And this returned a result after a while, and since the members were
> cached, the previous versions of the query returned also if executed again,
> but still taking a lot of time to finish.
>
>
>
> On Wed, Jun 17, 2015 at 3:35 PM, Julian Hyde <julianhyde at gmail.com> wrote:
>
>> Hilario,
>>
>> Can you generate a few stack traces (by sending "kill -QUIT <process
>> id>”) while it is running, then send the log? If we are hitting a
>> performance problem like a quadratic for-loop or a bad hash key, then it
>> should show up on the traces.
>>
>> Julian
>>
>> On Jun 17, 2015, at 5:27 AM, Matt Campbell <mcampbell at pentaho.com> wrote:
>>
>> The Cache function includes evaluation context in its key, and looking
>> back at the query I was thinking the Except() might be evaluated at each
>> projected intersection.  But actually- since it’s in a named set it should
>> be evaluated once in the context of the slicer.
>>
>> Hilario- if you could reproduce the poor performance with Foodmart that
>> could help us diagnose.
>>
>> *From:* mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org
>> <mondrian-bounces at pentaho.org>]*On Behalf Of *Luc Boudreau
>> *Sent:* Tuesday, June 16, 2015 4:30 PM
>> *To:* Mondrian developer mailing list
>> *Subject:* Re: [Mondrian] Except function performance
>>
>> The full docs about Cache() is here.
>>
>>
>> http://mondrian.pentaho.com/documentation/performance.php#Optimizing_Calculations_with_the_Expression_Cache
>>
>> Not sure why it wouldn't work. It's a trivial function implementation.
>>
>> On Tue, Jun 16, 2015 at 4:08 PM, Hilario Fernandes <
>> hilario.fernandes at cortex-intelligence.com> wrote:
>>
>> Unfortunately using the Cache() around the except in the FILTERED_D set
>> seems no make no difference... Any reason why it shouldn't?
>>
>>
>>
>> On Tue, Jun 16, 2015 at 1:47 PM, Brandon Jackson <usbrandon at gmail.com>
>> wrote:
>>
>> Luc!  Thanks for that golden nugget.
>>
>> Sent from my iPhone
>>
>>
>> On Jun 16, 2015, at 10:00 AM, Luc Boudreau <lucboudreau at gmail.com> wrote:
>>
>> If you know that a given mdx expression will always return the same
>> value, you can wrap it in a Cache() function.
>> On Jun 16, 2015 10:55, "Hilario Fernandes" <
>> hilario.fernandes at cortex-intelligence.com> wrote:
>>
>> The set returned by the Except() is small, 3 members. If we replace the
>> except with those 3 members, the performance is closer to expected.
>>
>> Is there some way we can avoid the except from being recalculated on each
>> iteration?
>>
>> On Mon, Jun 15, 2015 at 2:44 PM, Matt Campbell <mcampbell at pentaho.com>
>> wrote:
>>
>> NonEmptyCrossJoin and Crossjoin on a NON EMPTY axis are mapped to the
>> same native evaluator, so there should be no difference.
>>
>> How big is the set returned by Except?  That Aggregate() needs to be
>> evaluated for each tuple on the rows, which can be time consuming for
>> larger sets.
>> Just out of curiosity, if you replace the named set containing the Except
>> with the same enumerated members the except() would return, do you see much
>> difference?
>>
>> *From:* mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org
>> ] *On Behalf Of *Hilario Fernandes
>> *Sent:* Monday, June 15, 2015 1:36 PM
>> *To:* Mondrian developer mailing list
>>
>> *Subject:* Re: [Mondrian] Except function performance
>>
>> Thank you for your reply Matt!
>>
>> We are using the CrossJoin() function in our mdx, i just used the '*'
>> operator for readability. Will NonEmptyCrossjoin() be more efficient than
>> CrossJoin() considering that we are using NonEmpty on the axis?
>>
>> Unfortunately we already have the ExpandNonNative property set to true.
>>
>>
>>
>>
>> On Mon, Jun 15, 2015 at 2:26 PM, Matt Campbell <mcampbell at pentaho.com>
>> wrote:
>>
>> Except() is not handled natively.  You can try setting
>> “mondrian.native.ExpandNonNative=true”, which will evaluate the except
>> non-natively and attempt to include the resulting tuples in a native
>> context.  That property is enabled by default in Pentaho biserver, but
>> false by default in Mondrian.
>>
>> Another thing that caught my eye- the ‘*’ crossjoin operator in your
>> query is not mapped to native crossjoin (MONDRIAN-2284).  So that could
>> also be a factor.  Try replacing the crossjoins with nested
>> NonEmptyCrossJoin().
>>
>>
>>
>>
>> *From:* mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org
>> ] *On Behalf Of *Hilario Fernandes
>> *Sent:* Monday, June 15, 2015 12:41 PM
>> *To:* Mondrian mailing list
>> *Subject:* Re: [Mondrian] Except function performance
>>
>> Hello!
>>
>> I've sent this question a while ago without any replay, bringing it up
>> again.
>>
>> As I could see, using something like except in that crossjoin, it cannot
>> use a native evaluation and that slows things really a lot. This is as far
>> as i've gotten, maybe someone that knows how things work more in dept can
>> throw some comments in?
>>
>> Thanks
>>
>>
>> On Wed, Mar 4, 2015 at 2:15 PM, Hilario Fernandes <
>> hilario.fernandes at cortex-intelligence.com> wrote:
>>
>> Hi!
>>
>> I'm having somewhat of a performance problem when trying execute the
>> following query:
>>
>> WITH
>> MEMBER [Measures].[TotalA] AS
>> IIF((NOT IsEmpty([Measures].[M1])), Aggregate({[A].CurrentMember} *
>> {[B].[Total B]} * {[C].[Total C]} * {[FILTERED_D]} * {[E].[Total E]},
>> [Measures].[M1]), NULL)
>>
>> SET [FILTERED_D] AS
>> Except({[D].[D].Members},{[D].[D].[member1], [D].[D].[member2]})})
>>
>> SELECT
>> NON EMPTY ({[Measures].[M1], [Measures].[TotalA]}) ON COLUMNS,
>> NON EMPTY [A].[A].Members * [B].[B].Members * [C].[C].Members
>> * [FILTERED_D] * [E].[E].Members ON ROWS
>> FROM [C]
>>
>> The point of the TotalA measure is to return the total of the measure for
>> every A member, repeating it for the rest of the other dimension members on
>> the axis. The problem is this measure must respect the filters in that
>> axis, so the except is placed both on the rows and used on the aggregate.
>>
>> Thing is, when the except is used in the TotalA calculation its makes the
>> evaluation a lot slower. This only happens with except, if something like
>> Filter(members, measure >100) is used then there is no major difference.
>>
>> Can anyone shed some light on why this happens? And maybe some ideas to
>> work around it with some other way to achieve the same result.
>>
>> Thanks
>>
>>
>> --
>> Hilario Fernandes
>>
>>
>>
>>
>> --
>> Hilario Fernandes
>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>>
>>
>> --
>> Hilario Fernandes
>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>>
>>
>> --
>> Hilario Fernandes
>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>>
>>
>> --
>> Hilario Fernandes
>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>
>
> --
> Hilario Fernandes
>



-- 
Hilario Fernandes
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20150617/f9e8148e/attachment-0001.html 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: stacktrace-except-example.log
Type: text/x-log
Size: 121442 bytes
Desc: not available
Url : http://lists.pentaho.org/pipermail/mondrian/attachments/20150617/f9e8148e/attachment-0002.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: stacktrace-listmembers-example.log
Type: text/x-log
Size: 70179 bytes
Desc: not available
Url : http://lists.pentaho.org/pipermail/mondrian/attachments/20150617/f9e8148e/attachment-0003.bin 


More information about the Mondrian mailing list