[Mondrian] Changing the mondrian development process to prevent performance slippages

John V. Sichi jsichi at gmail.com
Sun Jun 8 15:46:59 CDT 2008


Julian Hyde wrote:
> Since LucidEra and Thomson/Thoughtworks are the two largest groups besides
> Pentaho who have an interest in developing mondrian, I would like those two
> groups in particular to step up with suggestions and offers of help. Pentaho
> can provide resources to run the process and publish results, but can only
> offer limited leadership.

Below is a little script which will sync a particular version of 
Mondrian, build it, run a query via cmdrunner using a particular 
properties file, and then grep out the execution time.  We could start 
with that as a baby step of automated single-user no-cache regression 
detection.

If Pentaho can set up exec+publishing automation for something like this 
with a few test queries, plus a way for contributors to add new ones, 
LucidEra can submit a lot of coverage queries.  Publication could 
include per-query and total-time line graphs with change number as the x 
axis.

Since a script doesn't depend on any code changes, it could easily be 
used for historical analysis as well (write a higher-level script which 
pulls all old change numbers from Perforce and collects timings for each 
of them, or at least for the queries compatible with that version).  
That's one of the reasons 
the script below copies from a clean client to a dirty temp workspace; 
that way the pull from Perforce can be incremental for each change 
number, and we don't have to worry about pollution across changes. 
Think binary search in eigenchange space for automatically finding the 
point of introduction of a regression...
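
To make the binary-search idea concrete, here is a minimal sketch, not 
something that exists yet.  The function name find_first_bad and the 
check command are hypothetical.  It assumes the change numbers are passed 
in ascending order, that the first one is known to be fast and the last 
one known to be slow, and that the check command takes a change number 
and exits 0 when that change is still fast.

```shell
#!/bin/bash

# Hypothetical helper: find the first change number at which a check fails.
# Usage: find_first_bad CHECK_CMD CHANGE...
# Assumptions: changes are in ascending order, the first is known good,
# the last is known bad, and CHECK_CMD exits 0 for a good (fast) change.
find_first_bad() {
    local check=$1
    shift
    local changes=("$@")
    local lo=0
    local hi=$(( ${#changes[@]} - 1 ))
    local mid

    # Invariant: changes[lo] is good and changes[hi] is bad.
    while (( hi - lo > 1 )); do
        mid=$(( (lo + hi) / 2 ))
        if "$check" "${changes[mid]}"; then
            lo=$mid    # still fast here, so the regression came later
        else
            hi=$mid    # already slow here, so look at or before this change
        fi
    done

    echo "${changes[hi]}"
}
```

The check command would wrap the sync, build, and query steps for one 
change number and compare the measured time against some threshold; that 
wrapper is left out here because the threshold and the timing format are 
not settled.  The list of change numbers could probably be taken from 
the second field of each line of "p4 changes //open/mondrian/...", 
sorted numerically.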

We could begin with a single configuration and then add more to the 
mix (as with megatest), and build up some performance analytics on top 
of that (breakdowns by feature, contributor, etc.).

A suggestion from Stephan Zuercher:  instrument Mondrian enough so that 
logical counters such as number of expression evaluations and number of 
SQL queries issued can also be included in any reporting/alerting. 
These are a lot less noisy than real-time execution metrics.  (This 
wouldn't be compatible with historical analysis earlier than the 
instrumentation's point of introduction, but once it's in place, we can 
use it when we flash back to any point after that.)

JVS

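#!/bin/bash

# Abort on the first failing command and echo each line as it is read.
set -e
set -v

# Clean Perforce client, scratch workspace, and the change number to measure.
cleanpath=/apps/jvs/open/mondrian
workpath=/apps/jvs/open/work
changeno=11154
propsfile=/apps/jvs/open/slamit/local.properties
queryfile=/apps/jvs/open/slamit/query.mdx

# Sync the clean client to the requested change.
p4 sync "//open/mondrian/...@${changeno}"

# Build in a throwaway copy so the clean client never sees build output.
rm -rf "${workpath}"
cp -R "${cleanpath}" "${workpath}"
cd "${workpath}"
ant clean
ant
ant jar
ant cmdrunner

# Run the query through cmdrunner and keep only the timing lines.
bin/run.sh -t -p "${propsfile}" -f "${queryfile}" > query.out 2>&1
grep "time\[" query.out > time.txt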


