[Mondrian] hanging application: segment cache manager and schema load?

Wed Mar 30 10:34:03 EDT 2016

This stack tells us that the Actor has received a notification of an
external event: a new segment must be indexed. The Actor is waiting for the
RolapSchemaPool monitor to be released so that it can grab a Star instance
and pin the segment.

What other threads are waiting on that same monitor? (0x00000005e23c2028)
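
If it helps to answer that from a saved dump, here is a rough sketch that
scans thread dump text for one monitor address and lists the threads waiting
to lock it and the threads that report having locked it. The class and method
names (MonitorWaiters, scan) are made up for illustration and are not part of
Mondrian; the sketch assumes the usual HotSpot dump layout, where each thread
starts with a line beginning with its quoted name and lock lines read
"- waiting to lock <addr>" or "- locked <addr>".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds threads related to one monitor in a thread dump. */
public class MonitorWaiters {

    /** Thread names found for one monitor address. */
    public static final class Result {
        public final List<String> waiting = new ArrayList<>();
        public final List<String> holding = new ArrayList<>();
    }

    // Assumption: a thread's section starts with a line beginning with its quoted name.
    private static final Pattern HEADER = Pattern.compile("^\"([^\"]+)\".*");

    /**
     * @param dump           full text of a thread dump (e.g. jstack output)
     * @param monitorAddress address as printed in the dump, e.g. "0x00000005e23c2028"
     */
    public static Result scan(String dump, String monitorAddress) {
        Result result = new Result();
        String waitingMarker = "- waiting to lock <" + monitorAddress + ">";
        String lockedMarker = "- locked <" + monitorAddress + ">";
        String current = null; // name of the thread whose stack we are reading

        for (String raw : dump.split("\\R")) {
            String line = raw.trim();
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                current = m.group(1);
            } else if (current != null && line.startsWith(waitingMarker)) {
                if (!result.waiting.contains(current)) {
                    result.waiting.add(current);
                }
            } else if (current != null && line.startsWith(lockedMarker)) {
                // A reentrant lock can print "- locked" in several frames; record once.
                if (!result.holding.contains(current)) {
                    result.holding.add(current);
                }
            }
        }
        return result;
    }
}
```

Calling MonitorWaiters.scan(dumpText, "0x00000005e23c2028") on the full dump
should put the ACTOR thread in "waiting". Whatever ends up in "holding" is the
stack worth looking at next. Long lines that were wrapped by a mail client
would need to be rejoined first, since the sketch matches one line at a time.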

On Wed, Mar 30, 2016 at 9:14 AM, Wright, Jeff <
jeff.s.wright at truvenhealth.com> wrote:

> We’ve seen our application hang a couple times. Looking at thread dumps,
> I’m suspicious of this excerpt:
>
> "mondrian.rolap.agg.SegmentCacheManager$ACTOR" daemon prio=10 tid=0x00007f1b34170000 nid=0xf25 waiting for monitor entry [0x00007f1ad965f000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at mondrian.rolap.RolapSchemaPool.getRolapSchemas(RolapSchemaPool.java:420)
>         - waiting to lock <0x00000005e23c2028> (a mondrian.rolap.RolapSchemaPool)
>         at mondrian.rolap.RolapSchema.getRolapSchemas(RolapSchema.java:930)
>         at mondrian.rolap.agg.SegmentCacheManager.getStar(SegmentCacheManager.java:1621)
>         at mondrian.rolap.agg.SegmentCacheManager$Handler.visit(SegmentCacheManager.java:661)
>         at mondrian.rolap.agg.SegmentCacheManager$ExternalSegmentCreatedEvent.acceptWithoutResponse(SegmentCacheManager.java:1222)
>         at mondrian.rolap.agg.SegmentCacheManager$Actor.run(SegmentCacheManager.java:1019)
>         at java.lang.Thread.run(Thread.java:724)
>
> I don’t fully understand SegmentCacheManager, but based on Julian’s 2012
> blog post I get the impression the Actor thread is supposed to run very
> quickly. If the Actor is blocked on schema loading, as the stack trace above
> suggests, that’s a big problem: we see schema loads take minutes.
>
> I also see that there were some code changes in August last year for
> MONDRIAN-2390, to make locking for schema load lower level. We don’t have
> that code.
>
> Btw we have a distributed cache.
>
> Does it sound like I’m on to a problem in our environment? Maybe even a
> general problem?
>
> --Jeff Wright