[Mondrian] Re: Mondrian: Question about your change 8710

Rushan Chen rchen at lucidera.com
Thu Sep 27 21:01:32 EDT 2007


I made the change in changelist 9929.

The cache needs to be in RolapSchema because a RolapStar is not shared across 
cubes (they are on different fact tables). Also, I used a column hash key as 
the lookup key rather than the Column object itself, because there can be 
different copies of that object, one for each cube.
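
For illustration, this is roughly the kind of key I mean (a sketch only; the 
names are hypothetical, not the code I checked in):

// Sketch only: the idea is a lookup key that is stable across the
// per-cube copies of the same physical column. The table/expression
// parameters are made up; the real Column class may expose them differently.
final class ColumnKeys {
    private ColumnKeys() {}

    static String columnKey(String tableName, String columnExpression) {
        // e.g. "time_by_day.the_year"
        return tableName + "." + columnExpression;
    }
}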

However, I overlooked the issue of concurrent access to this cache. I am 
considering the ConcurrentHashMap class (a JDK 1.5 feature). Is it compatible 
with the JDK setting we commonly use? The implementation also seems to exist 
in backport-util-concurrent.jar. Does that mean usage of this class is 
backward compatible?
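
For concreteness, here is a minimal JDK 1.5-style sketch of the cache I have 
in mind, using java.util.concurrent.ConcurrentHashMap; the class and method 
names are illustrative, not the actual Mondrian code:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a (column key -> cardinality) cache held by the RolapSchema.
// ConcurrentHashMap allows concurrent reads and writes without external
// synchronization.
public class CardinalityCache {
    private final Map<String, Integer> cardinalities =
        new ConcurrentHashMap<String, Integer>();

    /** Returns the cached cardinality for a column key, or null if the
     *  cardinality query has not been run yet. */
    public Integer getCardinality(String columnKey) {
        return cardinalities.get(columnKey);
    }

    /** Records the result of a
     *  "select count(distinct levelColumn) from dimensionTable" query. */
    public void putCardinality(String columnKey, int cardinality) {
        cardinalities.put(columnKey, cardinality);
    }
}

Modulo the generics (also a 1.5 feature), I believe the same thing could be 
compiled against the ConcurrentHashMap in backport-util-concurrent (it lives 
in a different package there), or fall back to Collections.synchronizedMap if 
we need to stay on an older JDK.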

Rushan

Julian Hyde wrote:
> I agree that we need a cache. The ideal place for this cache would be at the
> RolapStar (relational) level rather than RolapSchema (dimensional). Or at
> least, the keys of the cache would be RolapStar.Column objects. I'm not so
> fussed where the cache lives.
>
>   
>> -----Original Message-----
>> From: Rushan Chen [mailto:rchen at lucidera.com] 
>> Sent: Thursday, September 27, 2007 1:19 PM
>> To: Julian Hyde
>> Cc: mondrian at pentaho.org
>> Subject: Re: Mondrian: Question about your change 8710
>>
>> Currently there's actually no "cardinality" cache in the form of
>> (column, cardinality) pairs for columns referenced in a relational
>> constraint. What we have instead stores the cardinality result in the
>> in-memory structure representing the column. If there are multiple
>> copies of this representation for the same column, we cannot take
>> advantage of a previously calculated cardinality result.
>>
>> An example of how this might happen: when an MDX query references a
>> virtual cube and selects measures from different base cubes, the
>> aggregate loads are against different fact tables but with the same
>> column constraints. Because each cube has its own set of column
>> representations, the cardinality queries are repeated. In this case,
>> only the first load should issue the cardinality queries, and the
>> subsequent loads should be able to reuse the result. Similarly, if a
>> user were to issue two MDX queries against different cubes that use
>> some shared dimensions, the second query should be able to see the
>> cached cardinality from the first.
>>
>> I am going to check in a fix to add the (column, cardinality) cache as
>> you expected. It is stored in the RolapSchema so it can be shared
>> across different cubes.
>>
>> Rushan
>>
>> Julian Hyde wrote:
>>> The cache is relational not dimensional - it works in terms of (column,
>>> value) pairs rather than members - so it made sense to switch to a
>>> relational constraint. I think we use dimensional constraints
>>> occasionally, but I want to move away from that.
>>>
>>> I had no idea that we were losing the caching. We should reinstate
>>> that. Can you log a bug for that?
>>>
>>> It starts me thinking - yet again - that we should have a performance
>>> regression test. In this case, it would help to have a test which
>>> records every single SQL statement executed for a particular query. We
>>> couldn't maintain too many such tests, but a small number would help to
>>> raise flags when something fundamental has changed.
>>>
>>> Cc:ing the other developers in case they have some ideas.
>>>
>>> Julian
>>>
>>>> -----Original Message-----
>>>> From: Rushan Chen [mailto:rchen at lucidera.com] 
>>>> Sent: Wednesday, September 26, 2007 10:51 AM
>>>> To: julianhyde at speakeasy.net
>>>> Subject: Mondrian: Question about your change 8710
>>>>
>>>> Hi Julian,
>>>>
>>>> Recently I have been doing some comparison of the generated SQL
>>>> between the two mondrian versions we use here at LucidEra. The two
>>>> versions are based off the branches //open/lu/mondrian (call it
>>>> Release A) and //open/lu/release/mondrian/countzero (Release B).
>>>>
>>>> One of the differences I noticed is the increased number of times
>>>> this query is issued to optimize column constraint pushdown (into
>>>> SQL) during aggregate loading:
>>>>
>>>> select count(distinct levelColumn) from dimensionTable;
>>>>
>>>> Previously, in Release A, with the same connection there would be
>>>> just one such query per dimension column; in Release B, these queries
>>>> are repeated even for the same column. One of the reasons the result
>>>> is no longer cached is that the constraint type has changed from
>>>> MemberColumnConstraint to ValueColumnConstraint, which does not have
>>>> an associated RolapLevel to cache the result into. This change was
>>>> made in changelist 8710, to the file RolapLevel.java (revision #48;
>>>> search for "new MemberColumnConstraint"). The code currently has a
>>>> permanent "false" condition, so MemberColumnConstraint will never be
>>>> used.
>>>>
>>>> Can you recall the reason for this change? I am looking to make
>>>> caching work again for column cardinality queries, and I would like
>>>> to understand what problems these past changes were trying to solve,
>>>> so as not to break anything.
>>>>
>>>> Thanks,
>>>>
>>>> Rushan




