[Mondrian] Re: TPCH-H for Mondrian

John V. Sichi jsichi at gmail.com
Sat Dec 22 15:02:39 EST 2007


How about a mondrian/benchmark/tpch-star directory for checking in 
frameworks and vendor-specific examples?  Once that's set up, I'll add 
LucidDB.

Since there are strict rules about reporting TPC-H results of any kind, 
we should make it clear via disclaimers and naming that this is an 
unofficial multidimensional adaptation and not the real TPC-H benchmark.

BTW, there's already an //open/dev/thirdparty/tpch.tar.gz with the DBGEN 
code, so if you want, you can just point there in the setup instructions 
instead of duplicating it.  We use it for generating data to be used for 
the SQL-level testing in //open/dev/luciddb/test/sql/tpch.  At the 10GB 
scale, if you have trouble with DBGEN and files larger than 4GB, the 
workaround we use is to generate smaller chunks and then cat them into 
the full-size file.
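
For anyone who would rather script the concatenation step than run cat by hand, here is a minimal Python sketch. The function name and the chunk file names are hypothetical, and it assumes the chunks were generated in order and that each one ends with a newline, so appending them byte for byte gives the same rows as a single full-size file.

```python
import shutil
from pathlib import Path


def concat_chunks(chunk_paths, out_path, buffer_size=1024 * 1024):
    """Append each chunk file, in the order given, to out_path.

    Data is streamed in buffer_size blocks, so no chunk has to fit
    in memory. The output file is overwritten if it already exists.
    """
    with open(out_path, "wb") as out:
        for chunk in chunk_paths:
            with open(chunk, "rb") as src:
                shutil.copyfileobj(src, out, buffer_size)
    return Path(out_path)


# Hypothetical usage: the chunk names below are placeholders, not
# names guaranteed by DBGEN.
# concat_chunks(
#     ["lineitem.tbl.1", "lineitem.tbl.2", "lineitem.tbl.3"],
#     "lineitem.tbl",
# )
```

Passing the chunk list explicitly, instead of globbing, keeps the order under your control; a lexical sort would put a chunk numbered 10 ahead of chunk 2.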

JVS

Sherman Wood wrote:
> Good paper, thanks.
> 
> I have done a lot of the schema optimizations he talks about, but am not
> going to change DBGEN, as I think it is important to stay close to the
> data values in the test.
> 
> Views certainly help with databases that support them well. Looks like we
> will be doing tests with Oracle, MySQL (different storage engines?),
> Postgres and Ingres, so the database tuning specifics will be different.
> 
> 
> Sherman
> 
> -----Original Message-----
> From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org]
> On Behalf Of John V. Sichi
> Sent: Saturday, December 22, 2007 6:16 PM
> To: Mondrian developer mailing list
> Subject: [Mondrian] Re: TPCH-H for Mondrian
> 
> This paper describes something similar to what you're talking about (a 
> version of TPC-H modified to be more amenable to cubing).  Views can 
> also help for getting things into a form Mondrian likes (rather than 
> trying to express all table->cube mapping at the Mondrian schema level).
> 
> http://www.cs.umb.edu/~poneil/StarSchemaB.PDF
> 
> JVS
> 
> Sherman Wood wrote:
>> Yeah, we talked about doing a scalability test using TPC-H a while ago, 
>> and now I am getting some help with that from a partner who has a large 
>> environment - machines, disk, database.
>>
>>  
>>
>> I am just developing the schema now. I had a few issues with the 
>> snowflakey-ness, not the least of which is that it is a transactional 
>> model not optimized in the way that you would like for Mondrian. I also
>> had a few issues with Mondrian dealing with snowflakes - it was not clear
>> how to do joins of joins, getting the aliases right, etc.
>>
>>  
>>
>> With the schema almost together, I am working on simulating the TPC-H 
>> test queries in MDX, most of which are easy.
>>
>>  
>>
>> I could contribute this back, but how best to do it? It would make sense
>> to have an example TPC-H database in Mondrian in the same way as we have
>> Foodmart today, and a test suite against that.
>>
>>  
>>
>> I am also looking at using a few simple transforms of the TPC-H database
>> that make it more suitable for Mondrian - less snowflakey, making the 
>> lineitem table a real fact table by adding keys on it and adding a few 
>> indexes. These additions could also be included with a separate schema.
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
> 



