[Mondrian] Re: VirtualCubeTest.testCalculatedMemberAcrossCubes failing on SMP

Pappyn Bart Bart.Pappyn at vandewiele.com
Wed Jan 24 03:18:08 EST 2007


Hierarchy cache is already controllable with the data source listener plug-in.
 
Indeed, there are a lot of entry points, especially for functionality that is not part of executing an MDX
query.  XMLA is the easiest one, because its behavior is controlled by mondrian itself, but JPivot
calls mondrian's native API directly.
 
RolapConnection already stores a JDBC connection, so it is better to use that one than to invent a new
mechanism.  And since a transaction should live as long as a RolapConnection, that should be the right path
to take.  For now, the aggregate loaders cannot use this connection, but patching that is easy.  The query object that
contains the connection is already available in FastBatchingCellReader.loadAggregates(), so it is only a
matter of passing it further down.
 
At the moment, RolapUtil.executeQuery allows only one query to be executed at a time, which
is a bottleneck in multi-user applications.  It was required, though, because all threads were sharing the same JDBC connections.
 
For functions not related to MDX queries, such as XMLA discover requests and native mondrian calls to the hierarchy cache, there is a bigger
problem: there is no single entry point, and I cannot yet see the whole picture.  It is not clear how long
transactions should live, or how to make sure the hierarchy cache respects the data integrity of the MDX query and stays in sync.
 
There is also a problem with the aggregate statistics on size and volume.  After my upcoming changes, mondrian will be
able to flush the cache, but the statistics will not be reloaded, so mondrian might make the wrong decisions.
 
Another thought: at the moment, all threads executing queries are also filling the cache.  Query logic and data loading
are tightly interwoven.  This makes things difficult and results in the hierarchy cache and the aggregate cache
living their own lives and not being in sync.  I think in the future it might be better to have the threads that execute queries ask a central cache manager
for the data they require.  The query threads would just wait until the cache becomes available; the central cache manager would be in control of loading
all the data (both aggregates and hierarchy data) and of checking the data source for modifications.  The central cache manager could have
multiple worker threads to load aggregates and hierarchy data, so that it scales better on SMP hardware and in multi-user applications.
This way, most of the problems would disappear and there would be much better control of data integrity.  It would also be much easier to control
database transactions and JDBC connections.  Central cache management would also open the door to cold start support and so on...
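
A minimal sketch of the central cache manager idea described above, written in modern Java rather than anything available to mondrian at the time. The class name CentralCacheManager, its methods, and the loader function are all hypothetical and are not part of mondrian's API. Query threads call get() and block until the data is available; a fixed pool of worker threads performs the actual loading, and concurrent requests for the same key share a single load.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical central cache manager; illustrative only, not a mondrian class. */
final class CentralCacheManager<K, V> {
    // One future per key, so concurrent requests for the same key share one load.
    private final ConcurrentHashMap<K, CompletableFuture<V>> cache = new ConcurrentHashMap<>();
    private final ExecutorService workers;
    private final Function<K, V> loader; // assumed to wrap the SQL that loads the data

    CentralCacheManager(int workerThreads, Function<K, V> loader) {
        this.workers = Executors.newFixedThreadPool(workerThreads);
        this.loader = loader;
    }

    /** Called by query threads; blocks until a worker thread has loaded the value. */
    V get(K key) {
        CompletableFuture<V> future = cache.computeIfAbsent(
            key, k -> CompletableFuture.supplyAsync(() -> loader.apply(k), workers));
        // Note: a failed load stays cached in this sketch; real code would evict it.
        return future.join();
    }

    /** Called when the data source reports a modification; the next get() reloads. */
    void flush(K key) {
        cache.remove(key);
    }

    void shutdown() {
        workers.shutdown();
    }
}
```

Because only the manager touches the data source, decisions about JDBC connections and transaction boundaries would live in one place (the loader and the worker pool) instead of in every query thread.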
 
For Julian: could this become a roadmap topic?
 
Bart


________________________________

From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On Behalf Of michael bienstein
Sent: Tuesday, 23 January 2007 17:16
To: Mondrian developer mailing list
Subject: Re: Re: Re: [Mondrian] Re: VirtualCubeTest.testCalculatedMemberAcrossCubes failing on SMP


My comments in your text.


----- Original Message ----
From: Pappyn Bart <Bart.Pappyn at vandewiele.com>
To: Mondrian developer mailing list <mondrian at pentaho.org>
Sent: Tuesday, 23 January 2007, 16:13:42
Subject: RE: Re: Re: [Mondrian] Re: VirtualCubeTest.testCalculatedMemberAcrossCubes failing on SMP


Michael,
 
I have not yet looked at the exact details of transactions and connections.  You might be right that it is OK to use
a transaction per query.  When the time comes, I will take a look.
 
Do note that it is the responsibility of the plug-in to detect changes, not of mondrian itself.  The plug-in can share
the transaction of the query [Good], detect changes made before the transaction started, and flush the affected cache. [Looking forward to seeing how]
 
First I will finish my current changes to RolapStar.  If everything is working fine, I might take a look at connections
and transactions.
 
But there is one big problem at the moment: most connections to the database are made with the default connection
of the star and not with the JDBC connection of the RolapConnection.  [That's why I proposed the QueryContext idea.  They should all go through the one Connection, and we should be able to plug in the creation of that Connection via different objects so that the transaction characteristics can be modified.]
 
Another problem is the hierarchy cache.  In my case, hierarchy data is changing all the time, especially because
I have real-time data in the properties of a dimension.  But because the reading of the hierarchies is not in sync
with an MDX query connection, there could be problems.  For example, JPivot can ask for hierarchy data without an
actual MDX query being executed.  Not sure how to solve this one.  [I'm not sure either.  There are 2 problems: the first is consistency between SQL results in the same MDX or XMLA Members query, and the second is making the hierarchy cache more controllable, like you've done with the cell cache.  The first is handled by QueryContext if we can put this around all the entry points in Mondrian.  The second I don't know.  I tried to put something like QueryContext into Mondrian in my sandbox in the summer and got bogged down with RolapConnection's and RolapCube's constructors calling each other, as well as with hunting down all the entry points.  Julian told me to hold off until we get OLAP4J up to scratch so that we have an API to go with.  That's taking longer than hoped for.]  Maybe this cache needs to live on its own and
use a transaction per SQL query?
 
Bart
 
________________________________

From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On Behalf Of michael bienstein
Sent: Tuesday, 23 January 2007 15:55
To: Mondrian developer mailing list
Subject: Re: Re: [Mondrian] Re: VirtualCubeTest.testCalculatedMemberAcrossCubes failing on SMP


Bart,
 
Transactions on an OLTP database should be short-lived.  You say "When changes are detected, then the transaction
for - this thread only - should be stopped and a new one should be taken".  You can't detect changes in a transaction.
That's the Isolation property of ACID transactions.  You *have to* use a separate transaction to detect changes.
 
Now how you keep cached values from one transaction to simplify the requirements of a second transaction is a complex question.  ORM tools like Hibernate do this.  I assumed that at least in a first pass we wouldn't.  I thought of the following: 
1) MDX Query begins - obtain a transaction context for the query.
2) MDX Query wants to pull data from a RolapStar or AggStar or just a table.
2a) Is this table a hot-updated table or a relatively stable data store?  Stable data stores can be cached globally, but hot-updated tables use only a transaction-local cache.
2b) If hot updatable - do we have the data in the transaction local cache?  If yes, we're done.  Otherwise go to 3.
2c) If stable table, does the global cache have the information?  If yes then we're done.  Otherwise go to 3.
3) Obtain the open connection in the current DB transaction if there is one.  If not, make one, associate it with the query context/transaction context for future requests.
4) Query the data we want and place it into the cache (global or transaction local depending on the nature of the source of the data).
5) Keep going using this system until all data is found.
6) At end of query, dispose the transaction context - this will simply throw the transaction-local cache in the bin.
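
The steps above can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the QueryContext class, its fetch() method, the hotUpdated flag, and the sqlLoader function are hypothetical names, not existing mondrian code, and a boolean stands in for the lazily opened JDBC connection of step 3.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical per-query transaction context; illustrative only, not a mondrian class. */
final class QueryContext implements AutoCloseable {
    private final Map<String, Object> globalCache;                  // shared; stable tables only
    private final Map<String, Object> localCache = new HashMap<>(); // discarded at end of query
    private boolean connectionOpen; // stands in for a lazily opened JDBC connection (step 3)
    private int sqlLoadCount;

    QueryContext(Map<String, Object> globalCache) {
        this.globalCache = globalCache;
    }

    /** Steps 2 to 4: consult the appropriate cache, otherwise load via SQL and cache the result. */
    Object fetch(String key, boolean hotUpdated, Function<String, Object> sqlLoader) {
        // Step 2a: hot-updated tables use the transaction-local cache, stable ones the global cache.
        Map<String, Object> cache = hotUpdated ? localCache : globalCache;
        Object cached = cache.get(key); // steps 2b and 2c
        if (cached != null) {
            return cached;
        }
        connectionOpen = true;                // step 3: reuse or open the query's connection
        Object loaded = sqlLoader.apply(key); // step 4: run the query
        sqlLoadCount++;
        cache.put(key, loaded);
        return loaded;
    }

    int sqlLoadCount() {
        return sqlLoadCount;
    }

    boolean isConnectionOpen() {
        return connectionOpen;
    }

    /** Step 6: throw away the transaction-local cache and release the connection. */
    @Override
    public void close() {
        localCache.clear();
        connectionOpen = false;
    }
}
```

In this sketch, data from a hot-updated table is read at most once per query and never outlives it, while data from stable tables is shared between queries through the global map passed to the constructor.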
 
Michael
 
 


_______________________________________________
Mondrian mailing list
Mondrian at pentaho.org
http://lists.pentaho.org/mailman/listinfo/mondrian

