[Mondrian] Efficient JSON representation for "XMLA" responses

Michele Rossi michele.rossi at gmail.com
Wed Sep 19 04:36:31 EDT 2012


hi,
we are also concerned about the poor performances of XMLA, as you say below
it requires too many queries to obtain metadata and cell data and the
representation itself is extremely inefficient.

We have a real customer use case of a 20000 rows pivots which is rendered
in Excel via the SimbaO2X plugin.
Such pivot results in about 268Mb of XML exchanged between client and
server.
What's interesting is that the informative content of such xml is very
limited and it can be compressed to 1.69Mb (0.6% of the original).

Right at this moment I don't have the time to look at how you're using JSON
in the new olap4j-xmlaserver but judging from the example provided below it
looks like you're transferring xmla-like content, only serialised in JSON.
I am thinking that something like that would only be useful if you control
client and server (unit tests?) or if you manage to create a new standard.
Or perhaps to create a new olap4j transport driver which could use whatever
format we want to simply delegate calls over to a remote olap4j driver (we
have done that here).
Hope I am not missing something obvious here.

What we've done in my company while building our own olap4j driver was to
transfer metadata using a custom XSD schema which mirrors closely the
Olap4j entities.
This way we can obtain all the data required to deep-populate
a org.olap4j.metadata.Schema in one call.
We have used XSD+JAXB with XML serialization for simplicity but we are
planning to stop using XML all together and use JSON to represent data
instead.

While our olap4j drivers implements all ResultSet based metadata calls
such org.olap4j.OlapDatabaseMetaData.getCubes(String, String, String) we
are not actively using them preferring instead deep metadata objects such
as org.olap4j.metadata.Schema obtained as described above.

And that was my two cents, hope it helps.

thanks,
Michele


On 19 September 2012 00:46, Julian Hyde <jhyde at pentaho.com> wrote:

> You're right about round trips. I believe I've already addressed that,
> with the <Deep>true</Deep> option. Search for "testMDCubesDeepJson" in that
> file and you'll see what I mean.
>
> We can rip out some of the header elements that are all about SOAP and XML
> namespaces. "row" can be replaced with something else ("CUBES" seems more
> appropriate than "cube", if you look at the DeepJson example). The format
> needs also needs to be able to return an exception, with standard XMLA
> response codes.
>
> That said, people will put up with some crap in a protocol if it gives
> them what they need (and reasonably efficiently). So I wouldn't bend over
> backwards to make it absolutely the sexiest json ever.
>
> Take a look at the examples in test file and propose something concrete.
>
> Julian
>
>
> On Sep 18, 2012, at 4:29 PM, Paul Stoellberger <p.stoellberger at gmail.com>
> wrote:
>
> One of my biggest concerns over xmla-like communication is the overall
> overhead in a) number http calls b) processing the results
>
> With this row (array) based results we will have to build in logic into
> the client to be able to understand the logic behind catalogs, schemas,
> cubes and how they correlate and turn those rows into objects instead of
> returning objects directly, as we do in saiku today. In order to work with
> this data I need those objects, and this just doesn't feel right to me.
>
> Furthermore this will have to be processed by the client. Now I know this
> is probably not something very expensive to do but I would rather keep the
> necessary logic and processing on the client side to an absolute minimum.
> Doing the same transformations on server side code will always be faster
> and therefore deliver a better user experience to the user. The less the
> browser has to work, the better.
>
> But my major point of concern is the number of server calls needed to
> retrieve the necessary metadata information.
> I admit I don't know the details about all xmla calls but what I can tell
> by looking at olap4j's xmla driver there are a lot of round trips needed to
> retrieve all the information necessary. I have plans to understand those
> calls better in order to improve the xmla driver. Maybe I'm wrong, so feel
> free to convince me otherwise.
>
> Regarding standards, are there any other OLAP servers that return metadata
> in a xmla like manner?
> What was good for xml, doesn't have to be necessarily the best for todays
> "ways of doing things".
> People were able to consume xmla (like Andy / Roland proved successfully)
> directly before, however we got great feedback from web developers who
> confirmed how easy it is to work with our (metadata) API.
> When I look at public API's like github's I see rich JSON objects and not
> row based resultsets.
> What are your arguments to keeping the structure completely the same but
> just switching from xml to json?
>
> Metadata calls are pretty straightforward and there is not much you can do
> differently, so I am happy to adapt to a better standard (although having 1
> server return the data this way doesn't make it a standard).
>
> However, the best API and standards don't make good software. My biggest
> concern is to make things work.
> If it means I can do stuff more efficient (less http calls, less
> processing needed) I prefer that over a "clean" API.
>
> But maybe in order to move forward:
> Maybe there is a way how we can turn those xmla like results into
> something nice thats still mappable to xmla results.
> E.g. the JSON you pasted below:
> If "row" becomes "cube" and this cube element (can) contain already
> hierarchy and/or level information in the same result, but still keep the
> xmla like result style that would make me already happy.
>
> Andy, Roland ... you two have probably the best insights on this topic so
> it would be great if you could share your opinion with us as well!
>
> -Paul
>
>
>
> On Sep 19, 2012, at 12:44 AM, Julian Hyde wrote:
>
> Paul,
>
> On IRC yesterday you said there was a more efficient JSON representation
> of XMLA responses than the one implemented in olap4j-xmlaserver (formerly
> Mondrian's XMLA server).
>
> There are some examples in XmlaBasicTest [ see
> https://raw.github.com/pentaho/mondrian/master/testsrc/main/mondrian/xmla/XmlaBasicTest.ref.xml and
> search for "Json" ]. One is
>
> "cxmla:DiscoverResponse": {
>   "xmlns:cxmla": "urn:schemas-microsoft-com:xml-analysis",
>   "cxmla:return": {
>     "root": {
>       "xmlns": "urn:schemas-microsoft-com:xml-analysis:rowset",
>       "xmlns:xsi": "http://www.w3.org/2001/XMLSchema-instance",
>       "xmlns:xsd": "http://www.w3.org/2001/XMLSchema",
>       "xmlns:EX": "urn:schemas-microsoft-com:xml-analysis:exception",
>       "row": [
>         {
>           "CATALOG_NAME": "FoodMart",
>           "SCHEMA_NAME": "FoodMart",
>           "CUBE_NAME": "HR",
>           "CUBE_TYPE": "CUBE",
>           "LAST_SCHEMA_UPDATE": "xxxx-xx-xxTxx:xx:xx",
>           "IS_DRILLTHROUGH_ENABLED": true,
>           "IS_WRITE_ENABLED": false,
>           "IS_LINKABLE": false,
>           "IS_SQL_ENABLED": false,
>           "CUBE_CAPTION": "HR",
>           "DESCRIPTION": "FoodMart Schema - HR Cube"
>         },
>         {
>           "CATALOG_NAME": "FoodMart",
>           "SCHEMA_NAME": "FoodMart",
>           "CUBE_NAME": "Sales",
>           "CUBE_TYPE": "CUBE",
>           "LAST_SCHEMA_UPDATE": "xxxx-xx-xxTxx:xx:xx",
>           "IS_DRILLTHROUGH_ENABLED": true,
>           "IS_WRITE_ENABLED": false,
>           "IS_LINKABLE": false,
>           "IS_SQL_ENABLED": false,
>           "CUBE_CAPTION": "Sales",
>           "DESCRIPTION": "FoodMart Schema - Sales Cube"
>         },
>
>                 ...
>
> and there's another that returns nested metadata (cubes, dimensions,
> hierarchies, etc.) and another that returns a cell set.
>
> I'm open to making this more concise and "json-like" as long as there is a
> clear mapping from the XMLA spec. (If we depart from the spec entirely
> we're back in the wild west, where clients can only rely on their own
> server.) If you have some ideas please propose them.
>
> Julian
>
>
>
> Julian Hyde
> jhyde at pentaho.com
>
>
>
>
>
>
> _______________________________________________
> Mondrian mailing list
> Mondrian at pentaho.org
> http://lists.pentaho.org/mailman/listinfo/mondrian
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.pentaho.org/pipermail/mondrian/attachments/20120919/a5b3c9cd/attachment.html 


More information about the Mondrian mailing list