[Mondrian] Thread interrupts

Julian Hyde jhyde at pentaho.com
Thu Aug 30 16:30:22 EDT 2012


I've been looking into http://jira.pentaho.com/browse/MONDRIAN-1217 and its causes. I've set up a test case that executes one of three MDX statements repeatedly, and another thread that invokes cancel on the running statement. After running it for a while, the system load goes through the roof, with MySQL taking about 20 active threads, all presumably running SQL statements that we should have cancelled but did not.

I've been seeing failures when an actor thread tries to send a message to another actor (e.g. a SQL statement executor reporting that it has finished its statement). The sending thread is in interrupted state, and therefore ArrayBlockingQueue.put throws InterruptedException.

The calling code is not expecting this (it throws an internal error) and then I think internal state gets messed up.

In case you're not that familiar with threads and interrupts in Java, here's a refresher. Each thread has a boolean flag that says whether it is interrupted. You generally set it by calling interrupt() on the thread. The static method Thread.interrupted() clears the current thread's flag and returns the previous value; the instance method isInterrupted() reads the flag without clearing it. Key JDK methods that block, such as those that sleep, wait, or write to blocking queues, check that flag and throw InterruptedException. This article gives more details: http://www.ibm.com/developerworks/java/library/j-jtp05236/index.html
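To make the refresher concrete, here is a minimal sketch (not Mondrian code; the class and method names are made up for illustration). It sets the flag on the current thread and then calls ArrayBlockingQueue.put on a queue that still has room. The put fails anyway, because it checks the interrupt flag before doing anything else.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical demo class, not part of Mondrian.
class InterruptFlagDemo {
    /** Returns true if put() throws because the calling thread was already interrupted. */
    static boolean putThrowsWhenInterrupted() {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        Thread.currentThread().interrupt(); // set the flag on the current thread
        try {
            queue.put("statement finished"); // the queue has room, but the flag is checked first
            return false;
        } catch (InterruptedException e) {
            return true; // throwing InterruptedException also clears the flag
        } finally {
            Thread.interrupted(); // make sure the flag is clear for the caller
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();
        System.out.println(Thread.currentThread().isInterrupted()); // true (does not clear)
        System.out.println(Thread.interrupted());                   // true (clears the flag)
        System.out.println(Thread.interrupted());                   // false
        System.out.println(putThrowsWhenInterrupted());             // true
    }
}
```

This is the same shape of failure as the actor case: the message send itself is harmless, but it is the first blocking call made after the flag was set.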

It took me a long time to find out why threads were in interrupted state. (None of our code calls Thread.interrupt, except when handling an InterruptedException... but that doesn't explain where the first InterruptedException comes from.) Turns out that we call Future.cancel(true) when we want to cancel a task. The "true" flag causes the thread running the task to enter interrupted state. But the Mondrian code isn't diligent about checking whether the thread is interrupted, so the task fails unexpectedly the next time it makes a blocking call.
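A small sketch of that effect, using a plain single-thread executor rather than Mondrian's own executors (all names here are hypothetical). The task never looks at the interrupt flag while it works; it only records the flag's value at the end, so you can see what Future.cancel left behind.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, not part of Mondrian.
class CancelInterruptDemo {
    /** Cancels a running task and reports whether its thread was left interrupted. */
    static boolean workerSawInterrupt(boolean mayInterruptIfRunning) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch finished = new CountDownLatch(1);
        AtomicBoolean stop = new AtomicBoolean(false);
        AtomicBoolean sawInterrupt = new AtomicBoolean(false);
        try {
            Future<?> future = executor.submit(() -> {
                started.countDown();
                while (!stop.get()) {
                    Thread.yield(); // stand-in for work that never checks the flag
                }
                sawInterrupt.set(Thread.currentThread().isInterrupted());
                finished.countDown();
            });
            started.await();                      // the task is now running
            future.cancel(mayInterruptIfRunning); // true => the worker thread is interrupted
            stop.set(true);                       // let the task run to its end
            finished.await();
            return sawInterrupt.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(workerSawInterrupt(true));  // true
        System.out.println(workerSawInterrupt(false)); // false
    }
}
```

With cancel(true) the worker is left with its flag set, so its next blocking call would throw; with cancel(false) the flag stays clear.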

I think it is too big a task to check for interrupted state throughout the relevant Mondrian code. I think the code should just call Future.isCancelled() at appropriate intervals. And we should pass "false" whenever we call Future.cancel(boolean mayInterruptIfRunning).
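Roughly what I have in mind, as a sketch under assumptions rather than a patch: the task polls Future.isCancelled() between units of work, and the canceller passes "false". The class name, the mailbox queue and the message text are invented for the example; the real code would check at whatever points make sense in the statement executor.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.FutureTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of Mondrian.
class CooperativeCancelDemo {
    /** Runs a task, cancels it without interrupting, and returns the task's final message. */
    static String runAndCancel() {
        BlockingQueue<String> mailbox = new ArrayBlockingQueue<>(1);
        CountDownLatch started = new CountDownLatch(1);
        AtomicReference<FutureTask<Void>> self = new AtomicReference<>();

        FutureTask<Void> task = new FutureTask<Void>(() -> {
            started.countDown();
            while (!self.get().isCancelled()) {
                Thread.yield(); // stand-in for one unit of work between checks
            }
            // The thread was never interrupted, so this send cannot throw because of the cancel.
            mailbox.put("stopped: interrupted=" + Thread.currentThread().isInterrupted());
            return null;
        });
        self.set(task); // published before the task is handed to the executor

        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            executor.execute(task);
            started.await();
            task.cancel(false); // mayInterruptIfRunning = false
            return mailbox.poll(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndCancel()); // stopped: interrupted=false
    }
}
```

The trade-off is that a cancel only takes effect at the next isCancelled() check, so a task stuck inside a long blocking call will not notice until that call returns.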

Thoughts?

Julian



Julian Hyde
jhyde at pentaho.com


