[Mondrian] RE: VisualTotals + Distinct-count measure gives wrong results

Sergey Mazin sergey.mazin at skype.net
Thu Feb 4 14:16:55 EST 2010


Hi Julian,

 

Here is fix for #1 and #3. It is not the best way to do it, but it works.

 

Implementation for tuples is still missing. L

 

Best regards,

Sergey

 

 

From: Sergey Mazin [mailto:sergey.mazin at skype.net] 
Sent: Monday, January 25, 2010 1:32 PM
To: 'jhyde at pentaho.com'; 'Mondrian developer mailing list'
Subject: RE: VisualTotals + Distinct-count measure gives wrong results

 

As it turned out there are 3 problems with VisualTotals at the moment:

1.       RolapAggregationManager.makeCompoundGroup method doesn't check
VisualTotalMembers for constraints. I think I have a fix for that.

2.       DisplayInfo for VisualTotalMembers is sometimes 0. In Excel it
means that it doesn't allow to drilldown to next level (Excel thinks there
are no any children for that level). I think I will have a fix for that also
when issue #3 is fixed.

3.       VisualTotals don't check constraints of children of their children.

Let's say we have query:

 

WITH SET [XL_Row_Dim_0] AS 
'VisualTotals(Distinct(Hierarchize({ 
Ascendants([Store].[All Stores].[USA].[OR]), 
Descendants([Store].[All Stores].[USA].[OR]), 
Ascendants([Store].[All Stores].[USA].[CA]), 
Descendants([Store].[All Stores].[USA].[CA])})))' 
SELECT NON EMPTY Hierarchize(Intersect({DrilldownLevel({ 
[Store].[All Stores] 
})}, [XL_Row_Dim_0])) ON COLUMNS 
FROM [HR] WHERE ([Measures].[Number of Employees])

 

It should return:

Axis #0: 
{[Measures].[Number of Employees]} 
Axis #1: 
{[Store].[All Stores]} 
{[Store].[All Stores].[USA]} 
Row #0: 329 
Row #0: 329

 

But it returns:

Axis #0: 
{[Measures].[Number of Employees]} 
Axis #1: 
{[Store].[All Stores]} 
{[Store].[All Stores].[USA]} 
Row #0: 616
Row #0: 329

 

SQL queries look like:

 

For {[Store].[All Stores].[USA]}

7438 [http-8080-1] DEBUG mondrian.sql  - 21: Segment.load: executing sql
[select "time_by_day"."the_year" as "c0", count(distinct
"salary"."employee_id") as "m0" from "time_by_day" as "time_by_day",
"salary" as "salary", "store" as "store", "employee" as "employee" where
"salary"."pay_date" = "time_by_day"."the_date" and "time_by_day"."the_year"
= 1997 and "salary"."employee_id" = "employee"."employee_id" and
"employee"."store_id" = "store"."store_id" and "store"."store_state" in
('CA', 'OR') group by "time_by_day"."the_year"]

 

For {[Store].[All Stores]}

7453 [http-8080-1] DEBUG mondrian.sql  - 22: Segment.load: executing sql
[select "time_by_day"."the_year" as "c0", count(distinct
"salary"."employee_id") as "m0" from "time_by_day" as "time_by_day",
"salary" as "salary", "store" as "store", "employee" as "employee" where
"salary"."pay_date" = "time_by_day"."the_date" and "time_by_day"."the_year"
= 1997 and "salary"."employee_id" = "employee"."employee_id" and
"employee"."store_id" = "store"."store_id" and "store"."store_country" =
'USA' group by "time_by_day"."the_year"]

 

For {[Store].[All Stores]} level it doesn't add {[Store].[All Stores].[USA]}
constraints.

It is very important for as to get it fixed, we are using Excel 2007 as main
tool to access cubes and it sends queries in that format.

 

Regards,

Sergey Mazin

 

From: Julian Hyde [mailto:jhyde at pentaho.com] 
Sent: Friday, January 22, 2010 9:31 AM
To: 'Sergey Mazin'; 'Mondrian developer mailing list'
Subject: RE: VisualTotals + Distinct-count measure gives wrong results

 

I'd put a breakpoint in the expression inside the visual totals function. It
should be aggregate( < set of members >).

 

Distinct-count measures are very difficult to aggregate. You have to go back
to the underlying database. We aggregate distinct-count measures correctly
in other places (e.g. if you create a compound slicer) but it's tricky.

 


  _____  


From: Sergey Mazin [mailto:sergey.mazin at skype.net] 
Sent: Thursday, January 21, 2010 8:17 AM
To: 'Mondrian developer mailing list'
Cc: jhyde at pentaho.com
Subject: VisualTotals + Distinct-count measure gives wrong results

Hi Julian,

 

http://jira.pentaho.com/browse/MONDRIAN-682

 

[Measures].[Number of Employees] - is distinct count measure. 

Example: 

WITH SET [XL_Row_Dim_0] AS 
'VisualTotals(Distinct(Hierarchize({ 
Ascendants([Store].[All Stores].[USA].[OR]), 
Descendants([Store].[All Stores].[USA].[OR]), 
Ascendants([Store].[All Stores].[USA].[CA]), 
Descendants([Store].[All Stores].[USA].[CA])})))' 
SELECT NON EMPTY Hierarchize(Intersect({DrilldownLevel({ 
[Store].[All Stores] 
})}, [XL_Row_Dim_0])) ON COLUMNS 
FROM [HR] WHERE ([Measures].[Number of Employees]) 

Query should return: 

Axis #0: 
{[Measures].[Number of Employees]} 
Axis #1: 
{[Store].[All Stores]} 
{[Store].[All Stores].[USA]} 
Row #0: 329 
Row #0: 329 

But returns empty set. 

The same query with NOT distinct measure works.

 

In distinct query cells are not being evaluated properly and no queries is
sent to database.

Could you please advise what is the best place to start investigating.

 

Best regards,

Sergey

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20100204/528d062d/attachment.html 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: RolapAggregationManager.java
Type: text/java
Size: 28805 bytes
Desc: not available
Url : http://lists.pentaho.org/pipermail/mondrian/attachments/20100204/528d062d/attachment.bin 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: IntersectFunDef.java
Type: text/java
Size: 14650 bytes
Desc: not available
Url : http://lists.pentaho.org/pipermail/mondrian/attachments/20100204/528d062d/attachment-0001.bin 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: testcases.txt
Url: http://lists.pentaho.org/pipermail/mondrian/attachments/20100204/528d062d/attachment.txt 


More information about the Mondrian mailing list