[Mondrian] Queries being executed in serial in mondrian 3.4

Luc Boudreau lucboudreau at gmail.com
Fri Jun 8 11:53:28 EDT 2012


Ok. We've found the problem. Add all this info to a Jira case and we'll fix
it.

Luc

On Fri, Jun 8, 2012 at 11:52 AM, Pedro Vale <pedro.vale at webdetails.pt> wrote:

> I had a breakpoint on line 628 of RolapConnection (the one that calls
> executeInternal) and this is not reached in parallel. That's what you mean,
> right?
>
>
>  Pedro Vale                             pedro.vale at webdetails.pt
> WebDetails Consulting                    http://www.webdetails.pt
>
>
>
>
> On 2012/06/08, at 16:44, Luc Boudreau wrote:
>
> Ok. Waiting on Task.get() is normal and expected. The code hands execution
> off to another thread from that point on. Can you check whether the tasks
> get executed in parallel? Put a breakpoint in
> RolapConnection.executeInternal() and launch two queries. Both should be
> able to reach that point in parallel. If they can't, then we have found
> the problem.
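>
> If it helps, here is roughly what I mean by launching two queries in
> parallel. This is just a sketch against olap4j; the connect string,
> catalog path and MDX below are placeholders you would adapt to your setup:
>
> import java.sql.DriverManager;
>
> import org.olap4j.CellSet;
> import org.olap4j.OlapConnection;
> import org.olap4j.OlapStatement;
>
> public class ParallelQueryCheck {
>     // Placeholder connect string; point it at your own database and schema.
>     private static final String CONNECT =
>         "jdbc:mondrian:Jdbc=jdbc:mysql://localhost/foodmart;"
>         + "JdbcDrivers=com.mysql.jdbc.Driver;"
>         + "Catalog=/path/to/FoodMart.xml";
>
>     public static void main(String[] args) throws Exception {
>         Class.forName("mondrian.olap4j.MondrianOlap4jDriver");
>         Runnable query = new Runnable() {
>             public void run() {
>                 try {
>                     OlapConnection conn = DriverManager.getConnection(CONNECT)
>                         .unwrap(OlapConnection.class);
>                     OlapStatement stmt = conn.createStatement();
>                     // With a breakpoint in RolapConnection.executeInternal(),
>                     // both threads should stop there at the same time.
>                     CellSet cells = stmt.executeOlapQuery(
>                         "SELECT {[Measures].[Unit Sales]} ON COLUMNS FROM [Sales]");
>                     System.out.println(Thread.currentThread().getName() + " finished");
>                     conn.close();
>                 } catch (Exception e) {
>                     e.printStackTrace();
>                 }
>             }
>         };
>         new Thread(query, "query-1").start();
>         new Thread(query, "query-2").start();
>     }
> }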
>
> Luc
>
> On Fri, Jun 8, 2012 at 11:32 AM, Pedro Vale <pedro.vale at webdetails.pt> wrote:
>
>> Did not change that and confirmed that it was 10 (I was hoping to find a
>> hardcoded 1 somewhere :-) )
>>
>> The thread is waiting on task.get() (line 131 of RolapResultShepherd).
>>
>> Added a screenshot with the call stack to MONDRIAN-1161.
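>>
>> In other words, the user thread seems to hand the query off to the
>> shepherd's executor and then park on the returned Future; something like
>> this little sketch of the pattern (my own illustration, not Mondrian's
>> actual code):
>>
>> import java.util.concurrent.Callable;
>> import java.util.concurrent.ExecutorService;
>> import java.util.concurrent.Executors;
>> import java.util.concurrent.Future;
>>
>> public class ShepherdHandOffSketch {
>>     public static void main(String[] args) throws Exception {
>>         // A pool of worker threads, sized by mondrian.rolap.maxQueryThreads.
>>         ExecutorService executor = Executors.newFixedThreadPool(10);
>>
>>         // The user thread submits the query and then blocks on Future.get(),
>>         // which matches the call stack attached to MONDRIAN-1161.
>>         Future<String> task = executor.submit(new Callable<String>() {
>>             public String call() {
>>                 return "query result"; // stands in for the real execution
>>             }
>>         });
>>         System.out.println(task.get()); // <- the wait we are seeing
>>         executor.shutdown();
>>     }
>> }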
>>
>>
>> thanks,
>>
>>  Pedro Vale                             pedro.vale at webdetails.pt
>> WebDetails Consulting                    http://www.webdetails.pt
>>
>>
>>
>>
>> On 2012/06/08, at 16:18, Luc Boudreau wrote:
>>
>> Can you confirm that the user thread is waiting on the
>> RolapResultShepherd executor? Did you override the value of
>> "mondrian.rolap.maxQueryThreads"? It defaults to 10 (meaning that 10 user
>> queries can be processed simultaneously per Mondrian instance).
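>>
>> If you do want to raise it, it is a regular Mondrian property, so (as far
>> as I recall) an entry in mondrian.properties or a system property set
>> before Mondrian spins up its thread pool should do it. The value 20 below
>> is just an example:
>>
>> public class RaiseQueryThreads {
>>     public static void main(String[] args) {
>>         // Set this early, before the first Mondrian connection is created,
>>         // so the shepherd's pool is sized with the new value.
>>         System.setProperty("mondrian.rolap.maxQueryThreads", "20");
>>
>>         // ... open connections and run queries as usual from here on ...
>>     }
>> }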
>>
>> On Fri, Jun 8, 2012 at 11:15 AM, Pedro Vale <pedro.vale at webdetails.pt> wrote:
>>
>>> Hi, Luc.
>>>
>>> Here's the detail for the test case we have so far (tested against
>>> Mondrian trunk):
>>>
>>> Open an Analysis View - Create a connection over steel-wheels and issue a
>>> TopCount query. We have a breakpoint set in RolapNativeTopCount, so
>>> execution stops there.
>>>
>>> Open another Analysis View - Create a connection over SampleData (another
>>> cube, which should rule out cache sharing, I guess).
>>> Mondrian is called to render that first screen: execution goes into
>>> RolapConnection.execute and the shepherdExecution method is called, but
>>> executeInternal (line 628) is not called - only after execution is resumed
>>> from the previous breakpoint.
>>>
>>>
>>> hmmm.... actually, let me rephrase that last sentence. Apparently, if I
>>> wait long enough (about 4 minutes), the second thread will go into
>>> executeInternal due to a new request coming in from JPivot... weird...
>>>
>>> As Pedro said in another email, this does not happen in Mondrian 3.3.
>>>
>>> I'll file a JIRA with this info so you guys can analyze this. Let me
>>> know if you need more info.
>>>
>>> cheers,
>>>
>>>  Pedro Vale                             pedro.vale at webdetails.pt
>>> WebDetails Consulting                    http://www.webdetails.pt
>>>
>>>
>>>
>>>
>>> On 2012/06/08, at 15:13, Luc Boudreau wrote:
>>>
>>> My guess is that dashboard B waits on some cells that are fetched by
>>> dashboard A before rendering. As of Mondrian 3.4, we use a more efficient
>>> cell-sharing scheme; the cells are only loaded once, whereas before, two
>>> concurrent queries could end up executing the same request if both were
>>> issued before the cells were written to cache.
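>>>
>>> Conceptually it is the usual "compute once, share the Future" idea. A
>>> purely illustrative sketch (not the actual Mondrian code) of how the
>>> first query does the loading and any concurrent query for the same cells
>>> just waits on the same result:
>>>
>>> import java.util.concurrent.Callable;
>>> import java.util.concurrent.ConcurrentHashMap;
>>> import java.util.concurrent.ConcurrentMap;
>>> import java.util.concurrent.FutureTask;
>>>
>>> public class SharedCellLoadSketch {
>>>     // One Future per distinct cell request: the first caller loads the
>>>     // cells; concurrent callers for the same key wait on the same Future
>>>     // instead of issuing the same SQL again.
>>>     private final ConcurrentMap<String, FutureTask<Object>> pending =
>>>         new ConcurrentHashMap<String, FutureTask<Object>>();
>>>
>>>     public Object getCells(final String requestKey) throws Exception {
>>>         FutureTask<Object> task = new FutureTask<Object>(new Callable<Object>() {
>>>             public Object call() {
>>>                 return loadFromDatabase(requestKey); // only the winner runs this
>>>             }
>>>         });
>>>         FutureTask<Object> existing = pending.putIfAbsent(requestKey, task);
>>>         if (existing == null) {
>>>             task.run();        // we won the race: do the actual load
>>>             existing = task;
>>>         }
>>>         return existing.get(); // everyone else blocks here until it is done
>>>     }
>>>
>>>     private Object loadFromDatabase(String requestKey) {
>>>         return "cells for " + requestKey; // stand-in for the real SQL fetch
>>>     }
>>> }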
>>>
>>> Luc
>>>
>>> On Fri, Jun 8, 2012 at 9:31 AM, Pedro Alves <pmgalves at gmail.com> wrote:
>>>
>>>>
>>>> I don't think that's the case (as we actually increased that number a
>>>> lot), but there's no harm in trying.
>>>>
>>>>
>>>> However - we just tested adding a breakpoint in the TopCount FunDef,
>>>> stopping there, and then opening another analysis view - it was stuck
>>>> there until we pressed continue!!
>>>>
>>>>
>>>> I'll test this in 3.3.0 or whatever shipped with Pentaho 4.0.
>>>>
>>>>
>>>>
>>>> -pedro
>>>>
>>>>
>>>>
>>>> On 06/08/2012 02:27 PM, Brian Hagan wrote:
>>>> > Pedro,
>>>> >
>>>> > It does ring a bell. See http://jira.pentaho.com/browse/MONDRIAN-1080.
>>>> > Maybe you can try 3.4.3.
>>>> >
>>>> > - Brian
>>>> > On Jun 8, 2012, at 9:02 AM, Pedro Alves wrote:
>>>> >
>>>> >>
>>>> >>
>>>> >> Hey there.
>>>> >>
>>>> >>
>>>> >> I really don't know how to ask this. I'm not even sure there's an
>>>> >> issue, just a set of coincidences.
>>>> >>
>>>> >>
>>>> >> As most of you know, we develop a lot of dashboards that rely on MDX.
>>>> >> Lately, we've been getting a few complaints that the dashboards are
>>>> >> incredibly slow.
>>>> >>
>>>> >>
>>>> >> And we can't find anything in the Mondrian / SQL logs that explains
>>>> >> why they are slow.
>>>> >>
>>>> >>
>>>> >> Until we noticed something weird:
>>>> >>
>>>> >> 1. User A accesses a dashboard that is not in cache
>>>> >> 2. User B accesses a dashboard that should be in cache
>>>> >> 3. We see in the logs the queries for user A
>>>> >> 4. User B is stuck, no activity
>>>> >> 5. User A's queries end, dashboard rendered
>>>> >> 6. User B's dashboard is immediately rendered
>>>> >>
>>>> >>
>>>> >> I saw this for the first time two weeks ago with Mondrian over LucidDB.
>>>> >> I thought it had to do with LucidDB's connection pooling.
>>>> >>
>>>> >>
>>>> >> I saw this again *today* with Mondrian on MySQL.
>>>> >>
>>>> >>
>>>> >> Common factor: they were both on Pentaho 4.5 (with Mondrian 3.4.1).
>>>> >>
>>>> >>
>>>> >> I mentioned this to Jan Aertsen and he said "weird - someone mentioned
>>>> >> something very similar a few days ago".
>>>> >>
>>>> >>
>>>> >> I'm not saying this is a regression because I don't know what it is. I
>>>> >> know some work has been done on query execution. But it's definitely
>>>> >> something that was not there before (or all our dashboards would be
>>>> >> basically unusable).
>>>> >>
>>>> >>
>>>> >> Does this ring any bells for anyone? We'll try to see if we can somehow
>>>> >> reproduce this, but it's incredibly hard to reproduce concurrency
>>>> >> issues.... :(
>>>> >>
>>>> >>
>>>> >>
>>>> >> -pedro