Direct version of dimension updates

Post Reply
User avatar
Steve Rowe
Site Admin
Posts: 1783
Joined: Wed May 14, 2008 4:25 pm
OLAP Product: TM1
Version: 10.2.2., PAW
Excel Version: Nearly all of them

Direct version of dimension updates

Post by Steve Rowe » Wed Apr 18, 2018 2:50 pm

Hi,

The benefits of the direct version of dimension updates are well known, i.e. you only need to process the datasource once.

Has anyone encountered any disbenefits? i.e. increased lockiness or slower processing than doing it twice?

Just curious and haven't the time to set up so clear tests.

Cheers,

User avatar
PavoGa
Community Contributor
Posts: 206
Joined: Thu Apr 18, 2013 6:59 pm
OLAP Product: TM1
Version: 10.2.2 FP7, PA2.0
Excel Version: 2013
Location: Cleveland, Tennessee

Re: Direct version of dimension updates

Post by PavoGa » Wed Apr 18, 2018 3:42 pm

While I cannot recall where exactly, some documentation stated that using the DIRECT functions does not allow TM1 to store the dimension in an "optimized" manner. Whatever that means, I do not know and have not explored it further. I use the DIRECT functions, but if making a significant number of changes (which is entirely a judgement call on my part using SWAG, but generally more than just a few elements in the dimension), I use the METADATA tab to do it.

But no "Ah Hah!" moments one way or the other. As of yet.
Ty
Cleveland, TN

User avatar
paulsimon
MVP
Posts: 603
Joined: Sat Sep 03, 2011 11:10 pm
OLAP Product: TM1
Version: 10.1.1 and 10.2.2
Excel Version: 2013

Re: Direct version of dimension updates

Post by paulsimon » Wed Apr 18, 2018 7:51 pm

Hi Steve

I haven't noticed any performance issues in using them. However, I have never left things go that long without a conventional update.

At one client they had a cube with two large dimensions of over 100,000 elements. There were data loads running several times a day and any new elements needed to be inserted into the right place in the hierarchy. I can't remember the details of how I did it but I think I split the loads so that rows with existing elements went straight into the cube from the SQL source, any rows that had new elements went to a SQL table. If there were any new elements that needed to be added then the Epilog ran another process which matched the SQL table of new elements against the SQL View that gave all the upper hierarchy elements and then these got inserted and linked to the hierarchy above using dimension direct statements. A second process had to be called rather than just doing a Direct insert on the Data Tab, as the requirement was that the elements needed to be linked in to the hierarchy and there was not enough information on the data load source to do that.

Although it was a 24x5 operation there was a quiet time of an hour during which we ran conventional dimension updates. This was effectively the same process as run from the Epilog but just without the join to the table of new elements and a switch to select the indirect version of the Dimension insert statements instead of the Direct version.

My understanding is that the Direct functions don't compress the dimension properly which can potentially slow things down. From what I have heard this affects the structure both in memory and on disk and can potentially mean that navigating the dimension can require more hops than in a conventionally structured dimension. However, as soon as you run a conventional dim update then this compresses and optimally structures the dimension. The other alternative is to run the DimensionUpdateDirect statement.

So basically I think that you are unlikely to get a performance problem so long as you can run the DimensionUpdateDirect or a conventional Dim Update at least once a day. I have heard others leave it a week, eg in a 24x5 operation with no possible downtime, they leave it until the weekend. That seems to be OK.

Regards

Paul Simon

User avatar
Steve Rowe
Site Admin
Posts: 1783
Joined: Wed May 14, 2008 4:25 pm
OLAP Product: TM1
Version: 10.2.2., PAW
Excel Version: Nearly all of them

Re: Direct version of dimension updates

Post by Steve Rowe » Thu Apr 19, 2018 8:39 am

Thanks for the input.

So if you were looping over a SQL source and doing direct inserts it would make sense to do a DimensionUpdateDirect in the Epilog of the master TI or the data load TI after ever read.

User avatar
qml
MVP
Posts: 1057
Joined: Mon Feb 01, 2010 1:01 pm
OLAP Product: TM1 / Planning Analytics
Version: 2.0.4 and all previous
Excel Version: 2007 - 2016
Location: London, UK, Europe

Re: Direct version of dimension updates

Post by qml » Thu Apr 19, 2018 11:39 am

DimensionUpdateDirect is very important. If you don't use it to recompile the dim after using any of the Direct insert functions then I find that:
a) rules will not be attached to the new elements;
b) the first use of DIMIX will put an IX lock on the dimension - something you don't want to have if you are e.g. running multithreaded data loads and expect DIMIX to only result in an R lock.

Other than that it is worth mentioning that Direct metadata functions are still just as locking as their non-direct counterparts. It is not their purpose to remove metadata-related locking.
Kamil Arendt

User avatar
paulsimon
MVP
Posts: 603
Joined: Sat Sep 03, 2011 11:10 pm
OLAP Product: TM1
Version: 10.1.1 and 10.2.2
Excel Version: 2013

Re: Direct version of dimension updates

Post by paulsimon » Thu Apr 19, 2018 9:05 pm

Hi Steve

I have probably not gone into it as much as Kamil, but I seemed to get away without putting the DimensionUpdateDirect statement in the Epilog. We had rules for currency conversion so I think we would have noticed if rules were not working correctly. I left the DimenionUpdateDirect to the overnight bulk update. My reading of the DimensionUpdateDirect statement is that if its role is to compress the dimension then it must surely be doing the same level of locking as the whole dimension update that happens at the end of the MetaData tab with the normal DimensionElementInsert statements.Any MetaData statement is going to cause some degree of locking but the duration is shorter with a DimenionElementInsertDirect than a conventional DimensionELementInsert. For a start you have one less pass over the data. Perhaps the fact that I did not get this issue was due to the approach of writing all rows with new elements to a separate table. At the very least, this approach cut any locking to the minimum as the vast majority of the records had existing elements for all columns in the row and were handled by the data load process. It was only the small subset of records that had new elements that needed to be processed by the separate process.

Regards

Paul Simon

Post Reply