[Mondrian] Native Dimension Filter

Julian Hyde jhyde at pentaho.com
Fri Apr 17 17:48:22 EDT 2009

In my estimation:

*	80% of the time, regexps are so simple that the SQL regexp and Java
regexp will be the same (provided that the DBMS implements regexps).
*	80% of the time, a particular Mondrian application will only ever
run on a single database.

Combine those two probabilities and there is a compelling case to just send
the regexp to the DBMS unchanged.
Obviously it's 'unclean'. You're welcome to add a property so that people
can disable this 'unclean' behavior if they wish.
And you're welcome to add a method to the Dialect to translate from Java
regexp format to the DBMS' native regexp format. For example,
    String Dialect.createRegexpSql(String valueExpr, String regexp)
On Oracle
    createRegexp("emp.name", "bar.*")
might return
    "emp.name LIKE 'bar%'".
    createRegexp("emp.name", "bar(baz)?")
would return null because Oracle cannot deal with that kind of regexp.
But let's do the simple case, which covers 96% of possibilities, first.


From: mondrian-bounces at pentaho.org [mailto:mondrian-bounces at pentaho.org] On
Behalf Of Rushan
Sent: Friday, April 17, 2009 2:22 PM
To: Mondrian developer mailing list
Cc: rchen at lucidera.com
Subject: Re: [Mondrian] Native Dimension Filter

SQL-2003 regexp rules are certainly hard to understand, and I am not sure
how many vendors are supporting it fully. There should be many Java programs
that query a DB using regexp patterns; however I was not able to find an
open source translator between SQL regexp and Java version.

As for LucidDB, the plugin is a good idea- saves the headache of LucidDB
having to translate SQL regexp back to the Java one!


On Fri, Apr 17, 2009 at 2:01 PM, John V. Sichi <jsichi at gmail.com> wrote:

Rushan wrote:

Note that the only dimension filter considered at this point is In/Not In.
There are other candidates such as Matches/Not Matches filter. I punted on
this one because DBs have different levels of SQL regexp support. Some are
buggy such as LucidDB.

For MATCHES, another way to do it would be to allow vendors to plug
something in via the dialect, since I doubt very many DBMS products are ever
going to implement the bizarre SQL:2003 mishmash convention of regular
expressions and LIKE patterns ('%' for '.*', '_' for '.').

So, for example, we could require the DBMS to be capable of processing a
regular expression directly (instead of a SQL:2003 pattern).  In the case of
LucidDB, that can be done with a UDF wrapping java.util.regex as described


Other DB's may have native regex-processing capabilities.  A sticking point
might be that there are a number of different conventions for regex patterns
across languages/libraries.


Mondrian mailing list
Mondrian at pentaho.org

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20090417/0775c898/attachment.html 

More information about the Mondrian mailing list