[Mondrian] Re: Mondrian: Question about your change 8710
Rushan Chen
rchen at lucidera.com
Thu Sep 27 21:01:32 EDT 2007
I made the change in changelist 9929.
The cache needs to be in RolapSchema because RolapStar objects are not
shared across cubes (on different fact tables). Also, I used a column hash
key as the lookup key rather than the Column object itself because there
could be different copies of it, one for each cube.
However, I overlooked the issue of concurrent access to this cache. I am
considering the ConcurrentHashMap class (a JDK 1.5 feature). Is this
compatible with the commonly used JDK setting? The implementation seems
to exist in backport-util-concurrent.jar. Does that mean usage of
this class is backward compatible?
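
As a rough illustration of the kind of cache being discussed (a minimal sketch only, not the code in changelist 9929): the class name ColumnCardinalityCache, its methods, and the string form of the column key are all hypothetical. It assumes the JDK 1.5 java.util.concurrent classes are available and uses only calls that exist in 1.5.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical schema-level cardinality cache (illustrative names only).
 * Entries are keyed by a string identifying the physical column, so that
 * separate per-cube column objects for the same column share one entry.
 */
class ColumnCardinalityCache {
    private final ConcurrentMap<String, Integer> cache =
        new ConcurrentHashMap<String, Integer>();

    /** Returns the cached cardinality, or null if none has been stored. */
    Integer get(String columnKey) {
        return cache.get(columnKey);
    }

    /**
     * Stores the cardinality unless an entry already exists (for example,
     * because another thread stored one first). Returns the value that is
     * cached after the call.
     */
    int putIfAbsent(String columnKey, int cardinality) {
        Integer previous = cache.putIfAbsent(columnKey, Integer.valueOf(cardinality));
        return previous != null ? previous.intValue() : cardinality;
    }
}
```

With a map like this, two threads that miss the cache at the same time may both run the cardinality query, but only one result is kept and both callers see the same value afterwards.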
Rushan
Julian Hyde wrote:
> I agree that we need a cache. The ideal place for this cache would be at the
> RolapStar (relational) level rather than RolapSchema (dimensional). Or at
> least, the keys of the cache would be RolapStar.Column objects. I'm not so
> fussed where the cache lives.
>
>
>> -----Original Message-----
>> From: Rushan Chen [mailto:rchen at lucidera.com]
>> Sent: Thursday, September 27, 2007 1:19 PM
>> To: Julian Hyde
>> Cc: mondrian at pentaho.org
>> Subject: Re: Mondrian: Question about your change 8710
>>
>> Currently there's actually no "cardinality" cache in the form of
>> (column, cardinality) pairs for columns referenced in a relational
>> constraint. Instead, we store the cardinality result in the
>> in-memory structure representing the column. If there are multiple
>> copies of this representation for the same column, we cannot take
>> advantage of a previously calculated cardinality result.
>>
>> An example of how this might happen is when an MDX query references a
>> virtual cube and selects measures from different base cubes: the aggregate
>> loads are against different fact tables but with the same column
>> constraints. Because each cube has its own set of column representations,
>> the cardinality queries are repeated. In this case, only the first load
>> should issue the cardinality queries and the subsequent loads should be
>> able to reuse that result. Similarly, if a user were to issue two MDX
>> queries against different cubes but using some shared dimensions, the
>> second query should be able to see the cached cardinality from the first.
>>
>> I am going to check in a fix to add the (column, cardinality) cache like
>> you expected. This is stored in the RolapSchema so it can be shared
>> across different cubes.
>>
>> Rushan
>>
>> Julian Hyde wrote:
>>
>>> The cache is relational not dimensional - it works in terms of (column,
>>> value) pairs rather than members - so it made sense to switch to a
>>> relational constraint. I think we use dimensional constraints occasionally,
>>> but I want to move away from that.
>>>
>>> I had no idea that we were losing the caching. We should reinstate that. Can
>>> you log a bug for that?
>>>
>>> It starts me thinking - yet again - that we should have a performance
>>> regression test. In this case, it would help to have a test which records
>>> every single SQL statement executed for a particular query. We couldn't
>>> maintain too many such tests, but a small number would help to raise flags
>>> when something fundamental has changed.
>>>
>>> Cc:ing the other developers in case they have some ideas.
>>>
>>> Julian
>>>
>>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Rushan Chen [mailto:rchen at lucidera.com]
>>>> Sent: Wednesday, September 26, 2007 10:51 AM
>>>> To: julianhyde at speakeasy.net
>>>> Subject: Mondrian: Question about your change 8710
>>>>
>>>> Hi Julian,
>>>>
>>>> Recently I have been comparing the SQL generated by two
>>>> Mondrian versions we use here at LucidEra. The two versions are based
>>>> off the branches //open/lu/mondrian (call it Release A) and
>>>> //open/lu/release/mondrian/countzero (Release B).
>>>>
>>>> One of the differences I noticed is the increased number of times this
>>>> query is issued to optimize column constraint pushdown (into SQL) during
>>>> aggregate loading:
>>>>
>>>> select count(distinct levelColumn) from dimensionTable;
>>>>
>>>> Previously in Release A, with the same connection, there would be just
>>>> one such query per dimension column; in Release B, these queries are
>>>> repeated even for the same column. One of the reasons that the result is
>>>> no longer cached is that the constraint type has changed from
>>>> MemberColumnConstraint to ValueColumnConstraint, which does not have an
>>>> associated RolapLevel to cache the result into. This change was made in
>>>> changelist 8710, to file RolapLevel.java (revision # 48, search for "new
>>>> MemberColumnConstraint"). The code currently has a permanent "false"
>>>> condition so MemberColumnConstraint will never be used.
>>>>
>>>> Can you recall the reason for this change? I am looking to make caching
>>>> work again for column cardinality queries, and would like to understand
>>>> what problems these past changes were trying to solve so as not to
>>>> break anything.
>>>>
>>>> Thanks,
>>>>
>>>> Rushan
>>>>