[Mondrian] Mondrian Query Timeout

Joe Barnett thejoe at gmail.com
Mon Aug 8 17:04:19 EDT 2011


OK, I've had a little bit of time to play with a trunk snapshot (up to
Perforce changelist 14501 right now).  It seems like a configurable
ThreadFactory is, in fact, what we need, so I've attached a quick
patch to allow this.  Let me know if you'd rather it be in JIRA, or if
we need more discussion first (here or in JIRA) instead.
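For context, the kind of factory we'd plug in looks roughly like the
sketch below: it captures our app's routing ThreadLocal on the user
thread and re-establishes it on the threads Mondrian spawns.  (The
TENANT ThreadLocal and the "acme" value are illustrative stand-ins for
our routing context, not anything from the actual patch.)

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

public class ContextThreadFactory implements ThreadFactory {
    // Stand-in for our app's routing ThreadLocal.
    static final ThreadLocal<String> TENANT = new ThreadLocal<String>();

    // Captured once, on the user thread that constructs the factory.
    private final String capturedTenant = TENANT.get();

    public Thread newThread(final Runnable r) {
        return new Thread(new Runnable() {
            public void run() {
                TENANT.set(capturedTenant);   // re-establish routing context
                try {
                    r.run();
                } finally {
                    TENANT.remove();          // clear when the worker exits
                }
            }
        });
    }

    /** Demo: a worker thread sees the user thread's context. */
    static String runWithContext() throws Exception {
        TENANT.set("acme");
        ExecutorService exec =
            Executors.newSingleThreadExecutor(new ContextThreadFactory());
        try {
            return exec.submit(new Callable<String>() {
                public String call() {
                    return TENANT.get();      // runs on the worker thread
                }
            }).get();
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runWithContext()); // prints "acme"
    }
}
```

Wiring it in would then just be a matter of handing the factory to
whatever hook the patch exposes for execution-thread creation.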

A couple of questions I've come across not directly related to the
shepherd thread:

1) We're still using mondrian.olap.* rather than the olap4j API, and
are starting to get some deprecation warnings with the 3.3 code.  Fine
for now, but in looking into moving over to olap4j, I don't see an
obvious replacement for m.o.DriverManager#getConnection(PropertyList,
CatalogLocator, DataSource).  We basically want to get an
OlapConnection and pass it an existing, configured JDBC DataSource to
use for DB access.

2) What's the best way to get the QueryTiming information after a
result has been returned?  Right now, I'm casting the
mondrian.olap.Result to a RolapResult and calling
RolapResult#getExecution().getQueryTiming(), but it seems like there
might be a more elegant way of accessing it that doesn't require a
cast?

Thanks,
-Joe

On Fri, Jul 22, 2011 at 12:10 PM, Joe Barnett <thejoe at gmail.com> wrote:
> On Thu, Jul 14, 2011 at 7:56 AM, Luc Boudreau <lucboudreau at gmail.com> wrote:
>>
>>> Do you have a sense of the differences in response time between
>>> cancel()ing the request without this extra thread vs. with it?  Seems
>>> like if the code is able to poll often enough
>>
>> Our experience has shown that there are cases where the cancellation cannot
>> occur as expected. When Mondrian is issuing a long-running SQL query, there
>> is no way to make the MDX request time out correctly. As we want to avoid
>> doing a hard kill of threads (we'd rather have the JDBC resources freed
>> properly in the background, so as not to hose the RDBMS and cause other
>> symptoms), we needed to split the user thread from the execution thread.
>
> OK, I guess that JDBC's Statement#cancel() is not implemented reliably
> enough across drivers/DBs to make this threading worthwhile.
>
>> Julian mentioned Execution.peek(). He should have mentioned Locus.peek()
>> instead. Using this static accessor, your UDF will be capable of accessing
>> the Execution instance of your particular thread by doing
>> "Locus.peek().execution". Let's work together to figure out which
>> information the Execution state should contain.
>>
>
> I'll take a look in the next few weeks; I need to do some testing of
> our app against the latest 3.3 code anyway, to check for performance
> regressions, etc.  Since it's not only our UDF, but also our JDBC
> datasource implementation, that needs to read from these
> ThreadLocals, I'm not sure Locus.peek().execution will work (our
> datasource impl has no knowledge of mondrian code, as it's used
> throughout our codebase, and should stay that way).  However, I'd
> guess that as long as we can configure our own ThreadFactory for the
> execution thread creation, we can set up the needed context there.
> Again, will have more detailed/specific feedback in the next few weeks
> once I carve out some time to do a quick 3.3 experimental port.  Also
> possible I'll see a quick and easy way to use (or modify)
> Locus.peek().execution to fit our needs; will keep the list updated.
>
> Thanks,
> -Joe
>
>> Luc
>>
>> On Wed, Jul 13, 2011 at 6:35 PM, Julian Hyde <jhyde at pentaho.com> wrote:
>>>
>>> Take a look at Execution.peek(). I want to reduce the use of thread-locals
>>> in mondrian because they make it more difficult to have a parallel
>>> execution
>>> model. As much state as possible should be stored in Execution. However, I
>>> know I can't eradicate thread-locals overnight.
>>>
>>> Execution.peek() will allow you to retrieve the current Execution object.
>>> Under the covers it uses -- you guessed it -- a thread-local. (So you
>>> don't
>>> have to.)
>>>
>>> Use Execution.peek() when your thread starts work, and store the Execution
>>> object in its local state.
>>>
>>> Execution isn't a supported API yet, but we can evolve it into one once
>>> we've proved the approach is viable.
>>>
>>> Julian
>>>
>>>
>>> > -----Original Message-----
>>> > From: mondrian-bounces at pentaho.org
>>> > [mailto:mondrian-bounces at pentaho.org] On Behalf Of Joe Barnett
>>> > Sent: Wednesday, July 13, 2011 3:25 PM
>>> > To: Mondrian developer mailing list
>>> > Subject: Re: [Mondrian] Mondrian Query Timeout
>>> >
>>> > One potential concern I have with this is that we have a UDF that
>>> > essentially relies upon ThreadLocals that are set up by our
>>> > application to control routing to the appropriate database.  In fact,
>>> > the DataSources we pass to mondrian rely on those ThreadLocals as
>>> > well, so it's not actually UDF-specific.  It sounds like this approach
>>> > would have mondrian run in a separate thread context, and hence not
>>> > have the appropriate ThreadLocal values set.
>>> >
>>> > Are you planning on adding any hooks to the thread fork where we might
>>> > be able to set up the thread appropriately?  We'd also probably have
>>> > to adjust some things (connection pools, etc) to expect an additional
>>> > thread per thread-that-calls-into-mondrian, but would need to study
>>> > the exact execution patterns a little more closely to see if that's
>>> > really necessary.
>>> >
>>> > Do you have a sense of the differences in response time between
>>> > cancel()ing the request without this extra thread vs. with it?  Seems
>>> > like if the code is able to poll often enough (and
>>> > interrupt()/cancel() any underlying IO), the difference should be
>>> > negligible?  But I'm not sure exactly where all the
>>> > cancel-checking-hooks are.
>>> >
>>> > Other thoughts?
>>> >
>>> > -Joe
>>> >
>>> > On Tue, Jul 12, 2011 at 7:35 AM, Luc Boudreau
>>> > <lucboudreau at gmail.com> wrote:
>>> > >
>>> > > Good. That's exactly what I had in mind.
>>> > >
>>> > > As for (b), we are using backport-util-concurrent, so I think
>>> > > we're good. I'll make sure to test thoroughly.
>>> > >
>>> > > Luc
>>> > >
>>> > >
>>> > >
>>> > > On Mon, Jul 11, 2011 at 6:21 PM, Julian Hyde
>>> > <jhyde at pentaho.com> wrote:
>>> > >>
>>> > >> However we solve this, we need a data structure that records
>>> > >> all SQL statements that are running in the service of a
>>> > >> particular MDX statement. The Execution class that I added on
>>> > >> Friday (see http://p4web.eigenbase.org/@md=d&c=6PU@/14436?ac=10 )
>>> > >> is perfect for this.
>>> > >>
>>> > >> I couldn't tell whether you were suggesting this, but just in
>>> > >> case, let me state for the record: I think that killing Java
>>> > >> threads is a bad idea. It is very likely to make a mess. I
>>> > >> think that Java code under mondrian's control should continue
>>> > >> to poll the 'canceled' flag. (Which on Friday moved from Query
>>> > >> to Execution.)
>>> > >>
>>> > >> For SQL statements, mondrian should definitely call cancel. If
>>> > >> the JDBC driver of the database concerned is well-implemented,
>>> > >> it will do something sensible with the cancel request. (Probably
>>> > >> three things: cancel the synchronous call fairly quickly, throw
>>> > >> a SQLException indicating that the statement is canceled, and
>>> > >> send a message to the deeper parts of the infrastructure to
>>> > >> cancel asynchronous processing, more slowly but more
>>> > >> exhaustively.)
>>> > >>
>>> > >> Since Mondrian is itself a JDBC server (by virtue of olap4j
>>> > >> being an extension to JDBC), we should do something similar.
>>> > >> It's worth asking: what should
>>> > >> MondrianOlap4jStatement.executeOlapQuery() do when canceled? I
>>> > >> expect that Execution.cancel will be polled every ~millisecond,
>>> > >> so canceling the statement should come back fairly quickly. We
>>> > >> could make it come back quicker still if executeOlapQuery were
>>> > >> making a call to another thread. Then we could interrupt that
>>> > >> call without damaging either of the threads.
>>> > >>
>>> > >> Let me be clear that this is a "nice to have". And there are some
>>> > >> implementation details I haven't solved:
>>> > >>
>>> > >> (a) I am speculating that there is a 'cancelable future' in
>>> > >> JDK 1.6 off-the-shelf. I mean: create a Future, have another
>>> > >> thread (from an ExecutorService presumably) do the work, and
>>> > >> allow any 3rd thread to cancel that call. Is this available in
>>> > >> the JDK?
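[Aside, re (a): java.util.concurrent has had this off the shelf since
JDK 1.5 -- ExecutorService.submit() returns a Future that any third
thread holding the reference can cancel; backport-util-concurrent
mirrors the same API for older JDKs. A minimal, self-contained sketch:]

```java
import java.util.concurrent.*;

// One thread does the work, a separate thread cancels it via the
// shared Future, and the waiter observes a CancellationException.
public class CancelableFutureDemo {

    static String runAndCancel() throws Exception {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        final Future<?> future = exec.submit(new Runnable() {
            public void run() {
                try {
                    Thread.sleep(60000);      // stand-in for long work
                } catch (InterruptedException e) {
                    // interrupted by cancel(true): clean up and return
                }
            }
        });

        // Any third thread may cancel; it only needs the Future.
        Thread canceller = new Thread(new Runnable() {
            public void run() {
                future.cancel(true);          // true => interrupt worker
            }
        });
        canceller.start();
        canceller.join();

        try {
            future.get(5, TimeUnit.SECONDS);
            return "completed";
        } catch (CancellationException e) {
            return "cancelled";               // cancel won the race
        } finally {
            exec.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAndCancel()); // prints "cancelled"
    }
}
```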
>>> > >>
>>> > >> (b) If the calls you need are not available in JDK 1.4 or JDK
>>> > >> 1.5, it's OK if they have inferior behavior. But it still needs
>>> > >> to build & run in those JDKs.
>>> > >>
>>> > >> (c) MondrianOlap4jStatement.executeOlapQuery is not
>>> > >> necessarily the right place in the architecture to make the
>>> > >> asynchronous 'slice'. Find the ideal place, maybe a little ways
>>> > >> below it. But note that Locus.peek() etc. use a thread-local.
>>> > >>
>>> > >> Julian
>>> > >>
>>> > >> ________________________________
>>> > >> From: Luc Boudreau [mailto:lucboudreau at gmail.com]
>>> > >> Sent: Monday, July 11, 2011 8:57 AM
>>> > >> To: Julian Hyde
>>> > >> Cc: Mondrian developer mailing list
>>> > >> Subject: Mondrian Query Timeout
>>> > >>
>>> > >> Julian,
>>> > >>
>>> > >>
>>> > >>
>>> > >> Following your commit on the statements refactoring, I have
>>> > >> started to think about how we would now enforce a proper query
>>> > >> timeout (MONDRIAN-415).
>>> > >>
>>> > >>
>>> > >>
>>> > >> What I had in mind is a simple system where, when creating
>>> > >> RolapResult objects (in RolapConnection.execute()), we fork the
>>> > >> execution to a second thread (let’s call it the execution
>>> > >> thread). The first thread (let’s call it the user thread) will
>>> > >> wait a predetermined amount of time (the actual timeout), and
>>> > >> if the execution thread has not completed, the user thread will
>>> > >> perform the following:
>>> > >>
>>> > >> 1. Stop all SQL statements currently associated with this Execution.
>>> > >> 2. Do house cleaning.
>>> > >> 3. Bake a cake. (optional)
>>> > >> 4. Throw back a timeout exception.
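[Aside: this fork-and-wait flow maps fairly directly onto
java.util.concurrent. A rough sketch -- class and method names here
are mine for illustration, not Mondrian's:]

```java
import java.util.concurrent.*;

// The user thread waits up to the query timeout; on expiry it cancels
// the execution thread and surfaces a timeout to the caller, leaving
// resource cleanup to finish in the background.
public class QueryTimeoutSketch {

    static String execute(Callable<String> query, long timeoutMillis)
            throws Exception {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        Future<String> result = exec.submit(query);  // execution thread
        try {
            // User thread blocks here for at most the query timeout.
            return result.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Stop SQL statements / do house cleaning here.
            result.cancel(true);
            throw new RuntimeException("Query timed out", e);
        } finally {
            exec.shutdown();  // lets in-flight cleanup finish gracefully
        }
    }

    public static void main(String[] args) throws Exception {
        // A fast query completes normally.
        System.out.println(execute(new Callable<String>() {
            public String call() {
                return "result";
            }
        }, 1000));

        // A slow query trips the timeout.
        try {
            execute(new Callable<String>() {
                public String call() throws Exception {
                    Thread.sleep(60000);
                    return "never";
                }
            }, 50);
        } catch (RuntimeException e) {
            System.out.println("timed out");
        }
    }
}
```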
>>> > >>
>>> > >> This has several advantages. First off, we could remove all
>>> > >> checks for query timeout in the evaluator and centralize
>>> > >> everything in a single place. Second, as it stands now, there
>>> > >> is no way to do a “hard stop” of any SQL statements running for
>>> > >> a given Execution, as we sit in the same thread. As long as the
>>> > >> SQL Statement has not returned, we have no way to check for a
>>> > >> timeout. As of now, the best we can do is to set an arbitrary
>>> > >> JDBC timeout, but since there is no telling how many SQL
>>> > >> statements will be issued, nor their priority, we cannot
>>> > >> determine the proper JDBC timeout value to apply. By having the
>>> > >> user thread oversee the execution process, it becomes much
>>> > >> simpler to manage and clean up all the resources used by a
>>> > >> query execution.
>>> > >>
>>> > >> Should I try this approach and submit a first POC?
>>> > >
>>> > >
>>> > > _______________________________________________
>>> > > Mondrian mailing list
>>> > > Mondrian at pentaho.org
>>> > > http://lists.pentaho.org/mailman/listinfo/mondrian
>>> > >
>>> > >
>>> > _______________________________________________
>>> > Mondrian mailing list
>>> > Mondrian at pentaho.org
>>> > http://lists.pentaho.org/mailman/listinfo/mondrian
>>> >
>>>
>>> _______________________________________________
>>> Mondrian mailing list
>>> Mondrian at pentaho.org
>>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>> _______________________________________________
>> Mondrian mailing list
>> Mondrian at pentaho.org
>> http://lists.pentaho.org/mailman/listinfo/mondrian
>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: configurableThreadFactory.diff
Type: text/x-patch
Size: 3054 bytes
Desc: not available
Url : http://lists.pentaho.org/pipermail/mondrian/attachments/20110808/414d034e/attachment-0001.bin 

