[Mondrian] Native Dimension Filter

Julian Hyde jhyde at pentaho.com
Fri Apr 17 17:48:22 EDT 2009


In my estimation:

*	80% of the time, regexps are so simple that the SQL regexp and Java
regexp will be the same (provided that the DBMS implements regexps).
*	80% of the time, a particular Mondrian application will only ever
run on a single database.

Combine those two probabilities and there is a compelling case to just send
the regexp to the DBMS unchanged.
 
Obviously it's 'unclean'. You're welcome to add a property so that people
can disable this 'unclean' behavior if they wish.
 
And you're welcome to add a method to the Dialect to translate from Java
regexp format to the DBMS' native regexp format. For example,
 
    String Dialect.createRegexpSql(String valueExpr, String regexp)
 
On Oracle
 
    createRegexp("emp.name", "bar.*")
 
might return
 
    "emp.name LIKE 'bar%'".
 
and
 
    createRegexp("emp.name", "bar(baz)?")
 
would return null because Oracle cannot deal with that kind of regexp.
 
But let's do the simple case, which covers 96% of possibilities, first.
 
Julian



  _____  

From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On
Behalf Of Rushan
Sent: Friday, April 17, 2009 2:22 PM
To: Mondrian developer mailing list
Cc: rchen at lucidera.com
Subject: Re: [Mondrian] Native Dimension Filter


SQL-2003 regexp rules are certainly hard to understand, and I am not sure
how many vendors are supporting it fully. There should be many Java programs
that query a DB using regexp patterns; however I was not able to find an
open source translator between SQL regexp and Java version.

As for LucidDB, the plugin is a good idea- saves the headache of LucidDB
having to translate SQL regexp back to the Java one!

Rushan


On Fri, Apr 17, 2009 at 2:01 PM, John V. Sichi <jsichi at gmail.com> wrote:


Rushan wrote:


Note that the only dimension filter considered at this point is In/Not In.
There are other candidates such as Matches/Not Matches filter. I punted on
this one because DBs have different levels of SQL regexp support. Some are
buggy such as LucidDB.



For MATCHES, another way to do it would be to allow vendors to plug
something in via the dialect, since I doubt very many DBMS products are ever
going to implement the bizarre SQL:2003 mishmash convention of regular
expressions and LIKE patterns ('%' for '.*', '_' for '.').

So, for example, we could require the DBMS to be capable of processing a
regular expression directly (instead of a SQL:2003 pattern).  In the case of
LucidDB, that can be done with a UDF wrapping java.util.regex as described
here:

http://pub.eigenbase.org/wiki/LucidDbUdfJavaHowto

Other DB's may have native regex-processing capabilities.  A sticking point
might be that there are a number of different conventions for regex patterns
across languages/libraries.

JVS

_______________________________________________
Mondrian mailing list
Mondrian at pentaho.org
http://lists.pentaho.org/mailman/listinfo/mondrian



-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20090417/0775c898/attachment.html 


More information about the Mondrian mailing list