As you know already, I’m a big fan of using Set Analysis in my QlikView applications, especially for Point In Time Reporting. However, there might be some specific situations in which Set Analysis just won’t cut it and you need an alternative.
I found myself in that situation this month in a project where the QlikView application was required to have Sales Data to the highest detail (resulting in a Fact Table of about 300 million rows for two years worth of data) and several analyses of the Pint-In-Time kind were to be made. So, I started doing the usual stuff: built the Data Model, prepared the calendar with some fields to optimize the use of Set expressions, and started creating the objects. To my surprise, when I created the PITR analysis objects, I noticed a poor performance, a very poor performance indeed. I say “to my surprise” because I was counting on having a high performance app, despite having that amount of data in memory, based on two things:
- The Server I was working on had 512 GB of RAM, and 4 octo core processors. (What else would you need with a server like that, right?)
- The Data Model was not that complex, I had already worked with similar models and required analyses, and had not had any problems with my set expressions.
But there you go, not everything goes as first planned in terms of performance, and I started making some adjustments: I tried summarizing the data, simplifying the Data Model a little bit, and even considered having the pre-built fields for MTD, LYMTD, etc. This last option was not the best actually and I was just not satisfied with settling for that because it would mean the user would not be able to “play” and analyze or make comparisons based on different months.
After several days trying to figure it out, I was presented with an option I discarded instinctively: using Flags in the calendar in order to use expressions like Sum(Sales * FlagMTD). The reason I first doubted using this approach was very obvious to me: How would an expression that had to be evaluated on a row-by-row basis would be better than Set Analysis? Plus, that is an old-fashion method, used when there was no Set Analysis. I was wrong, not in the second part of my reasoning, but in the first.
Turns out that on the first test, Flags performed a lot better than Set Analysis: From around 16 seconds in a single and basic object using Set expressions, to less than 1 second using Flags on the same object. The Dashboard Sheet, which was taking nearly 50 seconds to calculate using Set expressions, took less than 5 using Flags. Sometimes it took less than 1, depending on the selections I made.
I felt relief for being able to deliver a high performing app, and at the same time I was taught a lesson: Try it before you trash it.
So, how does it work? If this approach is new to you, I will advice you to see the following explanation from the “associative” perspective, since I think it would be easier to understand it that way. The following diagram illustrates the concept.