[Mondrian] Re: Aggregate Tables Not Being Used

Nikolas Everett nik9000 at gmail.com
Wed Jan 14 11:05:58 EST 2009


OK, I've finally got mondrian to use aggregate tables.

I was setting
mondrian.rolap.aggregates.Use=true
and
mondrian.rolap.aggregates.Read=true
in the mondrian.properties file that came with
mondrian-3.0.4.11371-derby.zip.  I think that file is in the wrong place.
At least, when I move the file from
WEB-INF/
to
WEB-INF/classes
the aggregate tables as picked up.

Please have a look at your demo application.

I'm still curious if there is some standard tool to do an explain on an MDX
query to tell me what SQL the query will produce.

Thanks

--Nik

On Tue, Jan 13, 2009 at 10:15 AM, Nikolas Everett <nik9000 at gmail.com> wrote:

> I'm experimenting with Mondrian to see if it will fit our reporting needs.
> I've set up a small fact table backed into Postgres 8.3 with 241116 rows.
> Thats about 1 day's worth of data for this fact table.  I'm definitely going
> to need aggregate tables to make this sensible.  I won't have a problem
> making the tables and have a great server to host the production system, but
> I need to be sure that I don't hit the actual fact table when I don't want
> to.  Is there some utility that I can use to plan my queries to see what
> tables they'll hit?
>
> Also, I'm having trouble getting aggregate tables to work at all.  I'd be
> super appreciative if someone could help me with that.
>
> This is an abbreviated cube:
>
>>   <Cube name="Signals">
>>     <Table name="signalreportdata">
>>       <AggName name="signalreportdata_hour">
>>         <AggFactCount column="fact_count"/>
>>         <AggMeasure name="[Measures].[Count]" column="count"/>
>>         <AggForeignKey factColumn="hour_id" aggColumn="hour_id"/>
>>       </AggName>
>>     </Table>
>>     <DimensionUsage name="Request" source="Request"
>> foreignKey="request_id"/>
>>     <Dimension name="Response">
>>       <Hierarchy hasAll="true" allMemberName="All Responses">
>>         <Level name="Response" column="response" uniqueMambers="false"/>
>>       </Hierarchy>
>>     </Dimension>
>>     <DimensionUsage name="OutPeer" source="Peer" foreignKey="outpeer_id"
>> usagePrefix="Out"/>
>>     <DimensionUsage name="InPeer" source="Peer" foreignKey="inpeer_id"
>> usagePrefix="In"/>
>>     <DimensionUsage name="Timestamp" source="Hour" foreignKey="hour_id"/>
>>     <DimensionUsage name="File" source="File" foreignKey="file_id"/>
>>     <Measure name="Count" column="count" aggregator="sum"
>> datatype="Integer"/>
>>   </Cube>
>>
>
> The fact table look like:
>
>> CREATE TABLE signalreportdata
>> (
>>   id bigint NOT NULL,
>>   "version" bigint NOT NULL,
>>   count bigint NOT NULL,
>>   response integer NOT NULL,
>>   file_id bigint NOT NULL,
>>   hour_id bigint NOT NULL,
>>   inpeer_id bigint NOT NULL,
>>   outpeer_id bigint NOT NULL,
>>   request_id bigint NOT NULL,
>>   CONSTRAINT signalreportdata_pkey PRIMARY KEY (id),
>>   CONSTRAINT fkfe865be64c75fae7 FOREIGN KEY (hour_id)
>>       REFERENCES "hour" (id) MATCH SIMPLE
>>       ON UPDATE NO ACTION ON DELETE NO ACTION,
>>   CONSTRAINT fkfe865be6548c3162 FOREIGN KEY (inpeer_id)
>>       REFERENCES peer (id) MATCH SIMPLE
>>       ON UPDATE NO ACTION ON DELETE NO ACTION,
>>   CONSTRAINT fkfe865be6936dccef FOREIGN KEY (file_id)
>>       REFERENCES signalfile (id) MATCH SIMPLE
>>       ON UPDATE NO ACTION ON DELETE NO ACTION,
>>   CONSTRAINT fkfe865be69f63f48d FOREIGN KEY (request_id)
>>       REFERENCES request (id) MATCH SIMPLE
>>       ON UPDATE NO ACTION ON DELETE NO ACTION,
>>   CONSTRAINT fkfe865be6a2e07f19 FOREIGN KEY (outpeer_id)
>>       REFERENCES peer (id) MATCH SIMPLE
>>       ON UPDATE NO ACTION ON DELETE NO ACTION
>> )
>> WITH (OIDS=FALSE);
>>
>
> The aggregate table looks like:
>
>> CREATE TABLE signalreportdata_hour
>> (
>>   hour_id bigint,
>>   count numeric
>> )
>> WITH (OIDS=FALSE);
>>
>
> I brought this online by dropping the mondrian-embedded war into jetty and
> removing FoodMart and adding my schema.  I also hacked up the JSPs to point
> to my schema and dropped the postgres jdbc driver in the WEB-INF/lib
> directory.
>
> I've covinced mondrian to be super verbose about its logging.  I think this
> is the relevant bit:
>
>> 443  [1473444918 at qtp0-7] DEBUG mondrian.rolap.aggmatcher.ExplicitRules  -
>>
>> ExplicitRules.Group:
>>   name=signalreportdata
>>   TableDefs: [
>>     ExplicitRules.NameTableDef:
>>       id=0
>>       ignoreCase=true
>>       Levels: [
>>       ]
>>       Measures: [
>>         Measure:
>>           name=[Measures].[Count]
>>           column=count
>>       ]
>>       name=signalreportdata_hour
>>   ]
>>
>>
>> 446  [1473444918 at qtp0-7] DEBUG mondrian.rolap.aggmatcher.AggTableManager
>> -
>> RolapStar:
>>   Table:
>>     alias=signalreportdata
>>     relation=signalreportdata
>>     Columns:
>>       Response (1): "signalreportdata"."response"
>>       Count (7): sum("signalreportdata"."count")
>>     Table:
>>       alias=request
>>       relation=request
>>       Columns:
>>         Request (0): "request"."name"
>>       Condition:
>>         left="signalreportdata"."request_id"
>>         right="request"."id"
>>     Table:
>>       alias=peer
>>       relation=peer
>>       Columns:
>>         Hostname (2): "peer"."hostname"
>>       Condition:
>>         left="signalreportdata"."outpeer_id"
>>         right="peer"."id"
>>     Table:
>>       alias=peer_1
>>       relation=peer
>>       Columns:
>>         Hostname (3): "peer_1"."hostname"
>>       Condition:
>>         left="signalreportdata"."inpeer_id"
>>         right="peer_1"."id"
>>     Table:
>>       alias=hour
>>       relation=hour
>>       Columns:
>>         Timestamp (4): "hour"."timestamp"
>>       Condition:
>>         left="signalreportdata"."hour_id"
>>         right="hour"."id"
>>     Table:
>>       alias=signalfile_view
>>       relation=signalfile_view
>>       Columns:
>>         Server (5): "signalfile_view"."server"
>>         File (6): "signalfile_view"."filename"
>>       Condition:
>>         left="signalreportdata"."file_id"
>>         right="signalfile_view"."id"
>>
>>
>> 446  [1473444918 at qtp0-7] DEBUG mondrian.rolap.RolapSchema  - Pool.get:
>> create schema
>> "file:/home/nik/Desktop/jetty/jetty-6.1.14/webapps/mondrian/WEB-INF/queries/Signals.xml"
>> 451  [1473444918 at qtp0-7] DEBUG mondrian.rolap.RolapConnection  -
>>
>> select
>>   [Measures].[Count] on columns,
>>   {([Timestamp].[All Timestamps], [InPeer].[All InPeers], [OutPeer].[All
>> OutPeers], [Request].[All Requests], [Response].[All Responses], [File].[All
>> Files])} ON rows
>> from Signals
>>
>>
>
> The query above gets evaluated run as
>
> select sum("signalreportdata"."count") as "m0" from "signalreportdata" as
> "signalreportdata"
>
> when it probably ought to have been
> select sum("signalreportdata_hour"."count") as "m0" from
> "signalreportdata_hour" as "signalreportdata_hour"
>
> Its not like there is a big speed difference now, but there will be when my
> primary fact table has over one hundred million rows.
>
> Thanks so much your help, good list.
>
> Nik Everett
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20090114/9653a8a1/attachment.html 


More information about the Mondrian mailing list