Reducing Time to Process Feeders

bplaia
Posts: 23
Joined: Fri Jun 21, 2013 5:10 pm
OLAP Product: TM1
Version: 9.5.2
Excel Version: 2010

Reducing Time to Process Feeders

Post by bplaia »

Hi all,

I'm currently working on a relatively massive TM1 system for a client (404 GB in memory, 355 GB on disk in the Data directory), using Persistent Feeders so that the instance can come back up in about an hour after our weekly restarts or in the event of a crash.

The caveat with Persistent Feeders, as you all know, is that they must be recalculated every once in a while. When we need to perform this recalculation, the system is unavailable for roughly 24 hours while the feeders are recalculated.

One of my colleagues had an interesting idea: reduce the number of intersections in our cubes by removing unused elements from our SKU dimension (the dimension contains roughly 165,000 elements, but only about a third of those contain data anywhere in the system) before beginning the feeder recalculation. Since TM1 would then be evaluating only a fraction of the normal intersections in the cube while generating feeders, the processing time should drop. Once the system has come back up, we would run our normal dimension build processes to add the missing SKUs back into the dimension, so there would be no net effect on the data and we would have banked the speedup in processing feeders.

Seeing as this is a rather unorthodox approach, I was curious what everybody's thoughts are on the feasibility of this method.

Thanks,
Brian
lotsaram
MVP
Posts: 3652
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Reducing Time to Process Feeders

Post by lotsaram »

Your suggestion doesn't make a whole lot of sense to me. Feeders are explicitly a leaf-level concept, so SKUs with no data won't cause any feeding.

Unless what is being fed is "Total SKU", in which case a whole lot of overfeeding is going on, which points to some dubious design. Sometimes it might be necessary to overfeed, for example in an allocation model. But there are almost always ways to limit the amount of overfeeding, for example with hierarchies of SKUs relevant to a particular market or customer, and feeding these rather than Total.
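
To illustrate with made-up cube and element names (I don't know your model), the difference is between feeding the grand total and feeding a targeted rollup:

# feeds every one of the ~165,000 leaf SKUs under the grand total
['Qty Sold'] => DB('Allocation', !Year, !Period, 'Total SKU', 'Allocated Qty');

# feeds only the leaf SKUs consolidated under a market-specific rollup
['Qty Sold'] => DB('Allocation', !Year, !Period, 'SKU - Market EU', 'Allocated Qty');

Feeding a consolidated element feeds all of its leaf descendants, so the narrower the consolidation on the right-hand side, the fewer feeder flags get created.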

It is also possible, given your description, that you have unnecessary cube areas being fed due to inadequate limiting on the left-hand side of feeder statements. For example, are versions and time periods triggering feeders which aren't needed? For historical data which is static, are there rules in place? Could this data be exported, the rule for static data removed, and the data reimported?
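
As a sketch (element names invented for the example), tightening the trigger side keeps static slices from generating feeder flags at all:

# fires for every version and period, including locked actuals
['Price'] => ['Revenue'];

# fires only where values can actually change
['Working Forecast', 'Price'] => ['Revenue'];

The right-hand side is unchanged; only the set of cells allowed to trigger the feeder shrinks.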

Deleting SKUs and then adding them back seems to me like a Hail Mary. There are many other things I would try first, the first of which would be an analysis of the rules and feeders by an experienced consultant (without wanting to seem like I'm touting ;)).
Please place all requests for help in a public thread. I will not answer PMs requesting assistance.
tomok
MVP
Posts: 2831
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA

Re: Reducing Time to Process Feeders

Post by tomok »

If these intersections are indeed empty AND not being fed, then removing them will likely not help much. TM1 does a very good job with sparsity, so these intersections are not being evaluated to begin with. Sure, there is a little overhead here, so not having them versus having them with sparsity is not exactly the same, but I doubt you'll see much improvement. I would really question a TM1 design that has so much in rule-driven calculations that it takes 24 hours to process feeders. :|
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/
bplaia
Posts: 23
Joined: Fri Jun 21, 2013 5:10 pm
OLAP Product: TM1
Version: 9.5.2
Excel Version: 2010

Re: Reducing Time to Process Feeders

Post by bplaia »

lotsaram wrote:Unless what is being fed is "Total SKU", in which case a whole lot of overfeeding is going on, which points to some dubious design. Sometimes it might be necessary to overfeed, for example in an allocation model. But there are almost always ways to limit the amount of overfeeding, for example with hierarchies of SKUs relevant to a particular market or customer, and feeding these rather than Total.
You are absolutely correct that inadequate limiting of the left-hand side is causing inappropriate areas of the cube to be fed. This is one of the first things I began working on to reduce the overfeeding.

Additionally, this cube feeds a completely rule-derived version of the cube minus the SKU dimension. As such, the feeders on the source cube were written with "Total SKU" on the left-hand side. Removing "Total SKU" from the left-hand side, as well as limiting the left-hand side to apply only to the appropriate sections of the cube, reduced the time to process by roughly 35%; however, that still leaves anywhere from 12 to 18 hours to recalculate feeders. The real killer is basically feeding a copy of the cube.
bplaia
Posts: 23
Joined: Fri Jun 21, 2013 5:10 pm
OLAP Product: TM1
Version: 9.5.2
Excel Version: 2010

Re: Reducing Time to Process Feeders

Post by bplaia »

tomok wrote:If these intersections are indeed empty AND not being fed, then removing them will likely not help much. TM1 does a very good job with sparsity, so these intersections are not being evaluated to begin with. Sure, there is a little overhead here, so not having them versus having them with sparsity is not exactly the same, but I doubt you'll see much improvement. I would really question a TM1 design that has so much in rule-driven calculations that it takes 24 hours to process feeders. :|
I figured as much; I just wasn't able to articulate how well TM1 handles sparsity. Unfortunately, I have inherited this design and am doing my best to rework the feeders, but with a system this size it becomes difficult to perform regression testing.
lotsaram
MVP
Posts: 3652
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Reducing Time to Process Feeders

Post by lotsaram »

bplaia wrote:the feeders on the source cube were written with "Total SKU" on the left-hand side.
Well, I'm glad you at least got rid of that because, no, that's simply not how you do it. If you're not filtering on a dimension, simply leave that dimension out of the left-hand side.
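
In other words (hypothetical names), instead of naming the consolidation in the area:

['Total SKU', 'Local Amount'] => DB('Virtual Cube', !Year, !Period, !Measure);

omit the SKU dimension from the area entirely, so the feeder fires for whichever SKUs actually hold data:

['Local Amount'] => DB('Virtual Cube', !Year, !Period, !Measure);
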
bplaia wrote:this cube feeds a completely rule-derived version of the cube minus the SKU dimension. ... the real killer is basically feeding a copy of the cube.
How big is this cube, that it has a virtual copy of itself? Again, sorry, but the design is simply ludicrous, and not in a good way. A cube with this volume of data has a source system. User-driven forecast input, if any, will be a small fraction of the data volume, and any rule-derived values in another cube should be limited to user-driven input data that can be changed in real time by users. Anything loaded from a source system should be loaded simultaneously to the summary cube via CellIncrementN. There is no reason for this to be rule-calculated and every reason for it not to be.
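
A minimal sketch of the Data tab of such a load process (cube, dimension and variable names are all hypothetical):

# write the detail record to the main cube
CellPutN(nAmount, 'Sales Detail', vYear, vPeriod, vSKU, vMeasure);

# accumulate the same record into the SKU-less summary cube
CellIncrementN(nAmount, 'Sales Summary', vYear, vPeriod, vMeasure);

Because CellIncrementN adds to whatever is already in the target cell, the summary cube ends up holding the aggregate across all SKUs with no rules and no feeders involved.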
Please place all requests for help in a public thread. I will not answer PMs requesting assistance.
bplaia
Posts: 23
Joined: Fri Jun 21, 2013 5:10 pm
OLAP Product: TM1
Version: 9.5.2
Excel Version: 2010

Re: Reducing Time to Process Feeders

Post by bplaia »

lotsaram wrote:How big is this cube, that it has a virtual copy of itself? Again, sorry, but the design is simply ludicrous, and not in a good way. A cube with this volume of data has a source system. User-driven forecast input, if any, will be a small fraction of the data volume, and any rule-derived values in another cube should be limited to user-driven input data that can be changed in real time by users. Anything loaded from a source system should be loaded simultaneously to the summary cube via CellIncrementN. There is no reason for this to be rule-calculated and every reason for it not to be.
Here is a current snapshot of the }StatsByCube for the source cube (I warn you, it's not pretty):
Capture.PNG
We are looking into alternative designs, potentially a TI-driven solution rather than a rules-driven solution, but the client has a strong desire for real-time calculations, so rules were the obvious choice when the system was designed a long time ago.
qml
MVP
Posts: 1094
Joined: Mon Feb 01, 2010 1:01 pm
OLAP Product: TM1 / Planning Analytics
Version: 2.0.9 and all previous
Excel Version: 2007 - 2016
Location: London, UK, Europe

Re: Reducing Time to Process Feeders

Post by qml »

bplaia wrote:We are looking into alternative designs, potentially a TI-driven solution rather than a rules-driven solution, but the client has a strong desire for real-time calculations, so rules were the obvious choice when the system was designed a long time ago.
Your stats do look like there is some massive overfeeding going on. Maybe in your case you could try a feederless solution? Lotsaram described it in this thread. It does require a bit of thought and preparation, but it can work very well if you're able to get rid of all N-level rules, replace them with C-level rules and faux-feed via natural consolidations.
Kamil Arendt
bplaia
Posts: 23
Joined: Fri Jun 21, 2013 5:10 pm
OLAP Product: TM1
Version: 9.5.2
Excel Version: 2010

Re: Reducing Time to Process Feeders

Post by bplaia »

qml wrote:Your stats do look like there is some massive overfeeding going on. Maybe in your case you could try a feederless solution? Lotsaram described it in this thread. It does require a bit of thought and preparation, but it can work very well if you're able to get rid of all N-level rules, replace them with C-level rules and faux-feed via natural consolidations.
Once the virtual cube feeders are removed, the feeders for the rest of the cube are mainly for FX translations. In the past we had other cubes designed to use feederless rules to perform the FX translations; however, we saw massive performance degradation when attempting to consolidate across multiple dimensions (i.e., feederless rules work fine when viewing the calculated data at leaf level, but once we go up to Total Company and Total Product levels, the ConsolidateChildren functions cause a serious performance hit). This may have been a byproduct of another instance of massive overfeeding in said cube; however, I'm wary of using ConsolidateChildren on large hierarchies.
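
For anyone following along, a rough sketch of the pattern I mean (all names invented):

# N-level FX translation with no feeder written for it
['USD'] = N: ['Local'] * DB('FX Rates', !Period, 'USD');

# force consolidations to visit every child, because unfed rule-calculated
# values would otherwise be skipped by the sparse consolidation algorithm
['USD'] = C: ConsolidateChildren('Company', 'Product');

The ConsolidateChildren call is what hurts: it makes TM1 evaluate every child cell instead of only the populated ones, which is exactly where large hierarchies bite.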
qml
MVP
Posts: 1094
Joined: Mon Feb 01, 2010 1:01 pm
OLAP Product: TM1 / Planning Analytics
Version: 2.0.9 and all previous
Excel Version: 2007 - 2016
Location: London, UK, Europe

Re: Reducing Time to Process Feeders

Post by qml »

bplaia wrote:In the past we had other cubes designed to use feederless rules to perform the FX translations; however, we saw massive performance degradation when attempting to consolidate across multiple dimensions (i.e., feederless rules work fine when viewing the calculated data at leaf level, but once we go up to Total Company and Total Product levels, the ConsolidateChildren functions cause a serious performance hit). This may have been a byproduct of another instance of massive overfeeding in said cube; however, I'm wary of using ConsolidateChildren on large hierarchies.
It's hard to talk about specifics without knowing anything about the model, but ConsolidateChildren will kill your performance, no doubt about it. However, it is generally possible to have a feederless design with C-level rules and not use ConsolidateChildren. I have used this technique in some really large cubes and it scaled very well and provided good performance.
Kamil Arendt
mvaspal
Community Contributor
Posts: 341
Joined: Wed Nov 03, 2010 9:16 pm
OLAP Product: tm1
Version: 10 2 2 - 2.0.5
Excel Version: From 2007 to 2013
Location: Earth

Re: Reducing Time to Process Feeders

Post by mvaspal »

qml wrote:Your stats do look like there is some massive overfeeding going on
How exactly do you see that from the stats table? They have circa 36x as many fed cells as numeric cells. So what if there are 36 currencies calculated from local currency with rules? :D

I understand from the earlier posts that it may well be overfeeding, and maybe TI or implicit C: feeders would help, but I'm not sure how you can see from the screenshot alone that it's overfeeding. Am I missing something obvious?
qml
MVP
Posts: 1094
Joined: Mon Feb 01, 2010 1:01 pm
OLAP Product: TM1 / Planning Analytics
Version: 2.0.9 and all previous
Excel Version: 2007 - 2016
Location: London, UK, Europe

Re: Reducing Time to Process Feeders

Post by qml »

mvaspal wrote:How exactly do you see that from the stats table? They have circa 36x as many fed cells as numeric cells. So what if there are 36 currencies calculated from local currency with rules? :D

I understand from the earlier posts that it may well be overfeeding, and maybe TI or implicit C: feeders would help, but I'm not sure how you can see from the screenshot alone that it's overfeeding. Am I missing something obvious?
No, of course I can't be sure there is actual overfeeding. But when I see a 36-to-1 ratio of fed cells to populated cells, to me it's definitely a red flag. Sure, there are cases where that feeder ratio will be perfectly justified, but honestly, if I had a model as feeder-heavy as this one, I would redesign it.
Kamil Arendt
lotsaram
MVP
Posts: 3652
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Reducing Time to Process Feeders

Post by lotsaram »

mvaspal wrote:How exactly do you see that from the stats table? They have circa 36x as many fed cells as numeric cells. So what if there are 36 currencies calculated from local currency with rules? :D
The rule of thumb we use to identify "suspected overfeeding" is 20x.
Whether something really is overfed or not depends on the model. Sometimes, especially for allocation-type models, there is no alternative to overfeeding because there's no way to know ahead of time where the drivers will be. Although 20x is the rule of thumb to identify overfeeding, normally I wouldn't be too bothered by anything much under 100x. Except that in this case I would be. We're talking about a cube of 190 GB in memory, of which 180 GB is feeder flags. I can't imagine how performance could be anything other than barfworthy. For a data model of this size you have very different optimization parameters and design constraints than a cube of 100 MB. And one of the most significant constraints is that, over very large datasets, the order-of-magnitude difference between rules and pure data aggregation adds up to unacceptable performance in a rule-heavy model (although MTQ is changing that).
Please place all requests for help in a public thread. I will not answer PMs requesting assistance.
bplaia
Posts: 23
Joined: Fri Jun 21, 2013 5:10 pm
OLAP Product: TM1
Version: 9.5.2
Excel Version: 2010

Re: Reducing Time to Process Feeders

Post by bplaia »

lotsaram wrote:We're talking about a cube of 190 GB in memory, of which 180 GB is feeder flags. I can't imagine how performance could be anything other than barfworthy. For a data model of this size you have very different optimization parameters and design constraints than a cube of 100 MB. And one of the most significant constraints is that, over very large datasets, the order-of-magnitude difference between rules and pure data aggregation adds up to unacceptable performance in a rule-heavy model (although MTQ is changing that).
The performance in the cube itself is not terrible. I'd estimate that somewhere in the area of 120 GB of those feeder flags is actually feeding the virtual cube, so the stats are misleading as to how much memory is being consumed by feeders inside the source cube versus feeders to the virtual cube. Switching the design to use TI instead of rules to populate the "virtual" cube may be the way to go for a dataset this large.
bplaia
Posts: 23
Joined: Fri Jun 21, 2013 5:10 pm
OLAP Product: TM1
Version: 9.5.2
Excel Version: 2010

Re: Reducing Time to Process Feeders

Post by bplaia »

qml wrote: Fri Oct 14, 2016 3:12 pm However, it is generally possible to have a feederless design with C-level rules and not use ConsolidateChildren. I have used this technique in some really large cubes and it scaled very well and provided good performance.
qml, I'm curious to know more about this. Were you able to design the rules so that a natural consolidation was simulated, or does the calculation occur at each level, which would be different from a naturally consolidated result? If it's the latter, I'm afraid that would be a deal-breaker.

Thanks,
Brian
tomok
MVP
Posts: 2831
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA

Re: Reducing Time to Process Feeders

Post by tomok »

bplaia wrote: Wed Dec 27, 2017 5:50 pm
qml wrote: Fri Oct 14, 2016 3:12 pm However, it is generally possible to have a feederless design with C-level rules and not use ConsolidateChildren. I have used this technique in some really large cubes and it scaled very well and provided good performance.
qml, I'm curious to know more about this. Were you able to design the rules so that a natural consolidation was simulated, or does the calculation occur at each level, which would be different from a naturally consolidated result? If it's the latter, I'm afraid that would be a deal-breaker.

Thanks,
Brian
What he's talking about is creating a C element to hold the result of the calculation instead of doing it in an N element. If you then make the elements that drive the calculation children of this new C element, you don't have to write a separate feeder. For example, let's say you want to add an element for Gross Margin %. Instead of adding it as an N element, you add it as a (C)onsolidated element and make Sales and Gross Margin $ children of it. You then write a rule that overrides the natural consolidation of that new element, like this:

['Gross Margin %'] = C: ['Gross Margin $'] / ['Sales'];

This does not need a feeder because, as a consolidation of Gross Margin $ and Sales, Gross Margin % is automatically treated as populated (i.e., it won't be zero-suppressed) whenever either of its children holds data.
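
In outline, the measures dimension then contains something like this (hypothetical hierarchy):

Gross Margin % (C)   <- the C: rule above overrides the natural sum
    Sales (N)
    Gross Margin $ (N)

The children exist purely to "feed" the parent through the natural consolidation; the C: rule then replaces the meaningless Sales + Gross Margin $ sum with the ratio you actually want.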
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/