BI: Multidimensional vs In-Memory (MDX vs DAX)
Traditional “multidimensional OLAP BI” is disk-based: mass data is sliced, diced and pre-aggregated into data cubes, with a dimension for each category the numerical measures are split by. “In-Memory BI”, by contrast, models and compresses the data in RAM and applies aggregations and calculations on demand [1].
It’s newer and more agile, but is it better?
Let’s look at the overall differences between the two approaches – with a slight Microsoft focus for ease of comparison (MDX vs DAX, as we will see) – without getting unnecessarily technical. That should help us decide whether one is better than the other and likely to replace it. Note that this is a longer than usual post!
It might seem like a quibble between disk-based and in-memory, as though it’s just a question of data storage – it’s not. The two architectures are very different and require separate software and programming methodologies. Neither design is wrong, but both have their strengths and weaknesses. Most importantly, once chosen, there’s a significant investment of development and usage time which is not easily shifted between them, so it’s important to choose wisely at the beginning, as well as understand future developments.
Products and Vendors
There will always be cross-overs and merging of functionality in BI tools, and it’s often difficult to separate the database from the BI tooling, but here is a current split.
| Traditional Multidimensional OLAP (MDX) | In-Memory BI (DAX) |
| --- | --- |
| Microsoft SSAS – Multidimensional | Microsoft SSAS – Tabular |
| SAP Business Objects | QlikView / QlikSense |
| Oracle OLAP Option | |
| Mondrian OLAP Server | |
Traditional multidimensional OLAP BI we will call MDX OLAP BI, where MDX is a standard SQL-like language for OLAP, named after MultiDimensional eXpressions [2]. In-Memory BI can be called exactly that – although there are many different types of such product, they all share the fact that their data models are held in RAM. Microsoft call their In-Memory BI system “Tabular”, as opposed to the traditional OLAP “Multidimensional”, and both come under the overall BI Semantic Model (BISM) of SQL Server Analysis Services (SSAS).
When you create a new SSAS service for a Data Warehouse/Mart (DW/DM), you need to decide which of the two models it will be – “Multidimensional” traditional OLAP or “Tabular” In-Memory [3].
Source databases are usually Online Transactional Processing (OLTP) systems, which handle high-speed transaction entry and have a normalised structure split across many sub-tables, with data-integrity-enforcing relationships so that records can be combined. This is efficient for storage and prevents data integrity violations, but it means more processing is required to produce meaningful records for business reporting. The data is also being updated constantly, so it is better to feed these combined records into a reporting database in the format required.
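As a toy sketch of that combining step (all table and column names here are invented for illustration), de-normalising related OLTP rows into the flat records a reporting database expects might look like:

```python
# Hypothetical normalised OLTP tables: facts hold only IDs,
# which reference lookup sub-tables.
customers = {1: "Acme Ltd", 2: "Globex"}
products = {10: ("Widget", 2.50), 20: ("Gadget", 4.00)}
orders = [  # (customer_id, product_id, quantity)
    (1, 10, 100),
    (2, 20, 40),
]

# De-normalise into flat reporting records: every field is spelled
# out in full (values repeated), so no joins are needed at query time.
flat = [
    {
        "customer": customers[cust_id],
        "product": products[prod_id][0],
        "revenue": products[prod_id][1] * qty,
    }
    for cust_id, prod_id, qty in orders
]
print(flat[0])  # {'customer': 'Acme Ltd', 'product': 'Widget', 'revenue': 250.0}
```

A real ETL pipeline would do this with SQL joins or a dedicated tool, but the principle is the same: stitch IDs back into complete, reportable rows once, up front.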
As the various OLTPs will serve different functions (sales, marketing, manufacturing, inventory, payroll etc), some of them can be combined into Data Marts (eg sales and production, so that profits can be calculated from both revenue and costs). Combining everything into one huge Data Warehouse is usually too big an undertaking, though it can be done, especially for smaller organisations.
There are two main approaches to building Data Warehouses and Marts – the Bill Inmon and Ralph Kimball methods. Inmon advocates using the OLTPs to build a single Data Warehouse and having Marts feed off that, which then go on to provide the OLAP cubes [5]. The Kimball model builds Marts straight out of the OLTPs, making a central Data Warehouse unnecessary [6].
The Kimball approach is the one generally adopted by Microsoft, for both their Multidimensional and Tabular In-Memory models [9].
BI should avoid using OLTPs directly and instead report from a Data Warehouse or Mart, not least because DW/DM records are de-normalised: they contain all the column/field data in full (even though items are effectively repeated), rather than IDs and codes that need to be stitched together. This choice of DW/DM still doesn’t really determine the choice between MDX OLAP BI and In-Memory BI, but it helps us understand what goes into them.
Continuing from the development of DW/DMs to BI, let us consider the choice of MDX OLAP or In-Memory BI system.
An MDX OLAP BI system is by nature a Data Warehouse, or at least a Mart, but an In-Memory BI system doesn’t need to be – its data sources can be split, varied and of any nature, from small files and Excel spreadsheets to full-blown DWs. This can be considered beneficial functionality, but the disparate make-up can also be considered a negative.
All organisations and businesses need confidence in reported numbers, and one way to help that is to ensure continuity and stability of data sources. Multiple Data Marts feeding up to (or from) a single Data Warehouse help here, so traditional OLAP systems are strong. However, many new and changeable reports are often needed to gain insights or carry out previously unplanned analysis. This is what the self-service BI revolution is supposed to serve, and what in-memory BI systems are perfect for, especially as analysts can get on with their work with less dependency on IT departments.
However, the danger of self-service creating multiple silos shouldn’t be underestimated – those calculations should be cross-checked and verified against wider data and systems. It seems In-Memory BI is the winner here, with the caveat that it should be integrated with a more centralised Data Mart, as in MDX OLAP BI.
Before the millennium, IT systems tended to be designed under Waterfall project management, where the whole structure is specified from the beginning. MDX OLAP BI systems are more likely to be implemented this way. Since then, agile development has become popular, where systems are rolled out in stages and re-engineered in light of feedback from the business. In-Memory BI systems are more at home here.
There are advantages to both methods – formalising the system in advance so it remains integral and correct, or changing parts to capture real needs and requirements.
Risk is often attributed more to the formal up-front specification, because if it turns out to be wrong (not meeting real business requirements) it is very difficult to turn around, and so becomes an expensive failure. That is why agile has become so popular. However, agile can also fail spectacularly and expensively, when it goes off-course chasing some new requirement given too much weight, and the overall design is compromised, rendering the whole project a failure.
If In-Memory BI is better because it’s more agile, it should also be carefully developed to ensure more traditional formal correctness.
Although older technologies such as MDX OLAP BI may become stale, they have had time to mature, consolidate and be tuned to resolve the many issues which crop up in the field. Such issues are addressed by version upgrades, helped along by observations in community forums and product feedback.
New methods of data analysis present new opportunities, but these must be weighed against potentially immature implementations. Thanks to their longer history of maturity, MDX OLAP BI systems have the overall advantage here.
Disk-based access to pre-processed OLAP cubes offers almost unlimited capacity, into the petabytes, and different types of disk, from standard hard drives to fast RAID and even faster Solid-State Drives (SSDs), provide big gains, although the weakest link in the chain (ie the slowest disk, or the part of the network joining them to the CPU) will end up as a bottleneck. However, the very fact of pre-processing cubes, and only needing to read their results rather than re-process them each time, means big gains are a given for MDX OLAP BI.
On the other hand, mass storage is good, but all those table records mix different datatypes, sparse (null) values and variable-length text, so storage may not be optimal. In-Memory BI systems (whatever the underlying database, which could be any source) don’t use such crude tables – they generally use “columnstores”, where each column field is treated independently and all its data can be specifically compressed [7]. That is where in-memory processing gets its speed and its ability to fit everything within RAM.
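To make the columnstore idea concrete, here is a minimal sketch (sample data invented; real engines such as SSAS Tabular’s VertiPaq use far more sophisticated schemes) of two compressions that work well on a single column: dictionary encoding, then run-length encoding of the resulting codes:

```python
from itertools import groupby

# A single column of values, stored independently of other columns.
region = ["UK", "UK", "UK", "FR", "FR", "UK", "UK"]

# Dictionary encoding: store each distinct value once,
# then replace values with small integer codes.
dictionary = sorted(set(region))               # ['FR', 'UK']
codes = [dictionary.index(v) for v in region]  # [1, 1, 1, 0, 0, 1, 1]

# Run-length encoding on the codes: (code, run_length) pairs.
# Repetitive columns collapse dramatically.
rle = [(code, len(list(run))) for code, run in groupby(codes)]
print(rle)  # [(1, 3), (0, 2), (1, 2)]
```

Because every entry in a column has the same type and columns tend to repeat values, compression ratios are far better than for row-oriented tables, which is what lets whole models fit in RAM.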
Criticising In-Memory BI for being limited by RAM is a bit like saying R or Python are limited by performance – somewhat disingenuous, because in production systems and analytic models, millions of base records are usually pre-aggregated into records more usable by the model. Base records can remain in the OLTP or DW/DMs, so by the time data hits the BI system the amount of memory required is not so important, even for big data sources.
In-Memory BI is therefore faster with new data, but not as fast as querying data that MDX OLAP BI has already pre-aggregated. For that reason, there is no clear winner here.
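The trade-off can be sketched with a toy example (sample data invented): an MDX-OLAP-style cube pays the aggregation cost once at processing time, so queries become cheap lookups, while an in-memory-style system keeps compressed detail and aggregates on demand:

```python
from collections import defaultdict

# Fact rows: (year, region, sales).
facts = [(2019, "UK", 100), (2019, "FR", 80),
         (2020, "UK", 120), (2020, "FR", 90)]

# Cube-style: pre-aggregate once at processing time into cells,
# including roll-ups to the "All" member of each dimension.
cube = defaultdict(int)
for year, region, sales in facts:
    cube[(year, region)] += sales   # leaf cells
    cube[(year, "All")] += sales    # roll-up per year
    cube[("All", region)] += sales  # roll-up per region

# Query time is a cheap cell lookup, not a scan.
print(cube[(2020, "All")])  # 210

# In-memory-style: keep the detail and aggregate on demand.
# Flexible for ad-hoc questions, but each query re-scans the data.
on_demand = sum(s for y, r, s in facts if y == 2020)
print(on_demand)  # 210
```

Both answers match, of course; the difference is when the work is done – up front for the cube (hence stale until re-processed, but instant to query), or per query for the in-memory model (hence fresher, but paid for on every request).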
In-memory BI systems tend to focus more on analytic and visualisation capabilities rather than the aggregated data slicing and hierarchical navigation that traditional OLAP products provide. Depending on the business and requirement, the balance of importance can vary. Financial analysis is often more about slicing and dicing aggregations, whereas scientific insight exploration often benefits from more diverse methods of charting. Operational business analysis may entail a combination of both.
MDX OLAP BI usually has better aggregated slice-and-dice navigation, but In-Memory BI almost certainly has a greater wealth of additional visualisations.
Any article is likely to be biased, so we’ll look at trends.
One way we can gauge current and historical interest is by analysing internet searches. Google search has become ubiquitous enough to use as a standard, although any search engine may have a bias in its users. Choosing comparative terms is awkward though, so we will restrict it to Microsoft technology, as conveniently they provide both traditional OLAP using the standard SQL-like language MDX, and in-memory BI using their proprietary language DAX. If anything, MDX will be overstated compared to DAX, because other vendors also use MDX (but not DAX); however the results, also filtered to the UK only to avoid language- or country-specific complications, are revealing [3].
Figure 1: Google Trends – “MDX” compared to “DAX” in UK Business & Industrial
Figure 2: Google Trends – “MDX” compared to “DAX” in UK Computers & Electronics
We can see that from 2004, MDX was the more popular search term, although DAX often spiked past it. From 2014, DAX caught up and then overtook MDX, especially under Business & Industrial. Under Computers & Electronics, DAX (itself a highly technical term) has become more prevalent than MDX. Also note the spike in DAX Business & Industrial interest during the Covid-19 summer of 2020.
We have seen that the major difference is that traditional multidimensional OLAP systems (MDX OLAP BI) process pre-aggregated data into disk-based cubes, whereas In-Memory BI systems compress data models into RAM and perform ad-hoc queries on them. Speed turned out to be inconclusive as a decider, because pre-aggregation by nature speeds up later analysis, even though compressed in-memory data is faster on the fly.
Newer technology like In-Memory BI is usually assumed to be better, but older systems like MDX OLAP BI have had more bugs fixed and issues resolved, so superiority is again inconclusive. Similarly, the older Waterfall design methodology is usually applied to MDX OLAP BI Data Marts, and although newer, more agile development is favoured with In-Memory BI, there is good and bad in both.
Having a single source of truth in a Data Warehouse or Mart is ideal, but this must be offset against the ability to conduct ad-hoc analyses by different parts of the business, so In-Memory BI creeps ahead.
Even though MDX OLAP BI has a pedigree of slice-and-dice hierarchical navigation, the many additional visualisation components which ship with In-Memory BI (whichever the vendor) are likely to put it ahead.
Google Trends has also shown that DAX is now searched for more than traditional MDX – another reason to go with In-Memory BI systems.
Rob Collie, previously a BI lead on the Excel team at Microsoft, said: “Microsoft doesn’t like to say publicly that SSAS Tabular supersedes and replaces SSAS Multidimensional… but trust us, that is precisely what is going on. Even the longstanding ‘celebrities’ of SSAS Multidimensional, such as Chris Webb, Marco Russo and Alberto Ferrari, now use Tabular for most of their projects” [1]. After weighing up the comparisons here, I am inclined to agree, and believe this extends to all In-Memory BI systems, which are nimbler and more geared towards analytics than the older traditional Multidimensional BI, even though both still have their place.
In-Memory BI will probably never entirely replace MDX OLAP BI, although as the two borrow from each other’s techniques, In-Memory BI will become ever more prevalent, as it is already doing.
[1] Collie, Rob and Singh, Avichal (2016, 2nd Edition). PowerPivot and Power BI. Florida, USA: Holy Macro Books.
[2] Connolly, Thomas M and Begg, Carolyn E (2015, 6th Edition). Database Systems: A Practical Approach to Design, Implementation, and Management. England: Pearson Education Limited.
[3] Google Trends (2019). http://trends.google.com. Results re-charted on 3/10/2020.
[4] Harinath, S; Carroll, M; Meenakshisundaram, S; Zare, R and Guang-Yeu Lee, D (2009, 1st Edition). Professional Microsoft SQL Server Analysis Services 2008 with MDX. Indiana, USA: Wiley Publishing Inc.
[5] Inmon, Bill (2005, 4th Edition). Building the Data Warehouse. Indiana, USA: Wiley Publishing Inc.
[6] Kimball, Ralph (2013, 3rd Edition). The Data Warehouse Toolkit. Indiana, USA: Wiley Publishing Inc.
[7] Larson, Brian (2017, 4th Edition). Delivering Business Intelligence with Microsoft SQL Server 2016. USA: McGraw Hill Education.
[8] Linoff, Gordon S and Berry, Michael J A (2011, 3rd Edition). Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management. Indiana, USA: Wiley Publishing Inc.
[9] Mundy, Joy; Thornthwaite, Warren; and Kimball, Ralph (2006, 1st Edition). The Microsoft Data Warehouse Toolkit. Indiana, USA: Wiley Publishing Inc.
[10] Russo, Marco and Ferrari, Alberto (2015, 1st Edition). The Definitive Guide to DAX. Washington, USA: Microsoft Press.
[11] Varga, Stacia (2017, 1st Edition). Developing SQL Data Models. USA: Pearson Education Limited.
[12] Webb, Chris; Ferrari, Alberto and Russo, Marco (2009, 1st Edition). Expert Cube Development with Microsoft SQL Server 2008 Analysis Services. Birmingham, UK: Packt Publishing Ltd.