952 PI - management of multithreaded processes

David Usherwood
Site Admin
Posts: 1453
Joined: Wed May 28, 2008 9:09 am

952 PI - management of multithreaded processes

Post by David Usherwood »

Forumers may recall the humungous model I have which takes days to process.
I've been liaising with IBM on using PI to parallel process the calculations, which led to 952 FP1 HF6, which significantly improves process time (reduced by a factor of 200). I'm now exploring how to manage such a set of parallel processes (PPs).
We know that turning logging on or off, or running SaveDataAll, can't be done in PPs because you get locks. So I have added monitoring code to the main TIs which sets a status cell to 1 at the start and back to 0 at the end, then written a TI which runs every 5 minutes and checks the total of those cells, aiming to reactivate logging and do an SDA when the jobs are all complete. BUT... as far as I can see, the monitoring TI does not see the 1s in the cube. I've looked at the relevant cube myself, and added asciioutput code (with unique timestamped filenames) to dump out the contents, and the flags don't show.
Ideas welcome.
If I can't get past this I suppose I'll write flags to the filesystem, but surely TM1 is a better place to do this - especially with PI? Is this a multiuser database or not? Or what have I missed?
:shock:
rmackenzie
MVP
Posts: 733
Joined: Wed May 14, 2008 11:06 pm

Re: 952 PI - management of multithreaded processes

Post by rmackenzie »

Why doesn't the last job know it is the last job to run? If it did then you would be able to execute the finalisation logic from just that process eliminating the current issue you are facing.
Robin Mackenzie
qml
MVP
Posts: 1094
Joined: Mon Feb 01, 2010 1:01 pm
OLAP Product: TM1 / Planning Analytics
Version: 2.0.9 and all previous
Excel Version: 2007 - 2016
Location: London, UK, Europe

Re: 952 PI - management of multithreaded processes

Post by qml »

My understanding of Parallel Interaction is that each parallel writer (in this case - process) will get its own copy of data to work on and change, but this dataset will not be available to others until a commit is performed, which is at the end of a process. Only then are the changes merged with the main dataset and made 'official' and available to other writers and readers.

This means that writing 1s at the beginning of your process and replacing them with 0s at its end does not have any real effect on the active data, because the 1s never get committed. The only thread that can see these 1s is the one writing them. Ergo, with PI switched on you will need to find a different design for semaphores in your processes (file system flags should work just fine, or you can try global variables). This behaviour I believe is the reason why the TI function synchronized() has been introduced in version 10.1.
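To make the commit-at-end behaviour concrete, here is a toy model in Python. It is only a sketch of the semantics described above, not how TM1 implements PI; the class names `ToyCube` and `ToyTransaction` and the cell name `job1_running` are invented for illustration.

```python
# Toy model of "private copy until commit". NOT TM1's actual implementation.
class ToyCube:
    def __init__(self):
        self.committed = {}          # the data every other reader sees

    def begin(self):
        return ToyTransaction(self)


class ToyTransaction:
    def __init__(self, cube):
        self.cube = cube
        self.private = {}            # this writer's uncommitted changes

    def put(self, cell, value):
        self.private[cell] = value

    def get(self, cell):
        # A writer sees its own pending changes before the committed data.
        return self.private.get(cell, self.cube.committed.get(cell, 0))

    def commit(self):
        # Only the final value of each cell reaches the shared data.
        self.cube.committed.update(self.private)


cube = ToyCube()
job = cube.begin()
job.put("job1_running", 1)                                   # flag set at start
seen_by_job = job.get("job1_running")                        # writer sees its own 1
seen_by_monitor_during = cube.committed.get("job1_running", 0)
job.put("job1_running", 0)                                   # flag cleared at end
job.commit()
seen_by_monitor_after = cube.committed.get("job1_running", 0)
```

In this model the monitor reads 0 both while the job is running and after it commits, which matches what David reported: the 1 only ever exists in the writer's private copy.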
Last edited by qml on Wed Mar 14, 2012 10:39 pm, edited 1 time in total.
Kamil Arendt
qml
MVP
Posts: 1094
Joined: Mon Feb 01, 2010 1:01 pm
OLAP Product: TM1 / Planning Analytics
Version: 2.0.9 and all previous
Excel Version: 2007 - 2016
Location: London, UK, Europe

Re: 952 PI - management of multithreaded processes

Post by qml »

rmackenzie wrote:Why doesn't the last job know it is the last job to run?
Probably because this is a parallelised environment in which the order of execution does not matter, and thanks to this all the TIs can run at the same time, increasing performance by orders of magnitude?
Kamil Arendt
lotsaram
MVP
Posts: 3652
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: 952 PI - management of multithreaded processes

Post by lotsaram »

qml wrote:My understanding of Parallel Interaction is that each parallel writer (in this case - process) will get its own copy of data to work on and change, but this dataset will not be available to others until a commit is performed, which is at the end of a process. Only then are the changes merged with the main dataset and made 'official' and available to other writers and readers.

This means that writing 1s at the beginning of your process and replacing them with 0s at its end does not have any real effect on the active data, because the 1s never get committed. The only thread that can see these 1s is the one writing them. Ergo, with PI switched on you will need to find a different design for semaphores in your processes (file system flags should work just fine, or you can try global variables). This behaviour I believe is the reason why the TI function synchronized() has been introduced in version 10.1.
Pretty much my understanding.

If you go to file system flags it should work, but you would then also need to encapsulate the writing out to file, either in an ExecuteProcess or in an ExecuteCommand with some script. If it is in the main part of any long-running TI then the process will hold a lock on the file until it finishes its commit.
rmackenzie
MVP
Posts: 733
Joined: Wed May 14, 2008 11:06 pm

Re: 952 PI - management of multithreaded processes

Post by rmackenzie »

qml wrote:
rmackenzie wrote:Why doesn't the last job know it is the last job to run?
Probably because this is a parallelised environment in which the order of execution does not matter, and thanks to this all the TIs can run at the same time, increasing performance by orders of magnitude?
If there are n jobs then n-1 can run parallelised and still realise the vast majority of performance gains. The remaining job could run last 'knowing' it is the last and call the post-processing jobs like turning on logging and doing the SaveData. Possibility?
Robin Mackenzie
qml
MVP
Posts: 1094
Joined: Mon Feb 01, 2010 1:01 pm
OLAP Product: TM1 / Planning Analytics
Version: 2.0.9 and all previous
Excel Version: 2007 - 2016
Location: London, UK, Europe

Re: 952 PI - management of multithreaded processes

Post by qml »

rmackenzie wrote:The remaining job could run last 'knowing' it is the last
That is the whole point: how does it 'know' it's the last one? Or, in other words, how does it 'know' that all others have finished?
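For comparison, in a general-purpose language the usual answer to "who is last?" is a shared counter protected by a lock, as in the Python sketch below. This is an illustration of the concept only, not something TI offers: it relies on exactly the kind of immediately visible shared state that PI withholds until commit. The names `CompletionCounter` and `worker` are invented.

```python
import threading


class CompletionCounter:
    """Counts down as jobs finish; exactly one caller is told it was last."""

    def __init__(self, total_jobs):
        self._remaining = total_jobs
        self._lock = threading.Lock()

    def job_done(self):
        # The lock makes decrement-and-test a single atomic step.
        with self._lock:
            self._remaining -= 1
            return self._remaining == 0


finalised_by = []


def worker(job_id, counter):
    # ... the real calculation would happen here ...
    if counter.job_done():
        # Only the last finisher reaches this point (e.g. re-enable logging, save).
        finalised_by.append(job_id)


counter = CompletionCounter(total_jobs=4)
threads = [threading.Thread(target=worker, args=(i, counter)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Which job ends up last varies from run to run, but exactly one of them runs the finalisation step.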
Kamil Arendt
David Usherwood
Site Admin
Posts: 1453
Joined: Wed May 28, 2008 9:09 am

Re: 952 PI - management of multithreaded processes

Post by David Usherwood »

@qml:
The 'private copy' approach you mention seems to me spot on. I was at Black Belt in Waltham in July 2010 (one of two from the UK) and don't recall this being made explicit in the session on CubeVersioning (as it was then named).
I've checked over using global variables and from the documentation, they pass from a process to subprocesses and from process to process within a chore - so I can't see that they would work here.
So I think it's back to:
a Filesystem (probably asciioutput/asciidelete/wildcardfilesearch)
b SQL? Overkill I know but SQL Express is more or less free
c Perhaps Powershell? or even an environment variable?
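A rough sketch of option (a), written in Python rather than TI so the logic is easy to test outside TM1. In TI the same steps would presumably map onto asciioutput, asciidelete and wildcardfilesearch. The `.running` suffix and the function names are assumptions made for the example.

```python
import glob
import os
import tempfile

FLAG_SUFFIX = ".running"   # hypothetical naming convention for flag files


def set_flag(flag_dir, job_name):
    """Create the flag file and close it straight away so no handle is held."""
    path = os.path.join(flag_dir, job_name + FLAG_SUFFIX)
    with open(path, "w") as handle:
        handle.write("1")
    return path


def clear_flag(flag_dir, job_name):
    """Remove the flag file when the job ends; ignore it if already gone."""
    path = os.path.join(flag_dir, job_name + FLAG_SUFFIX)
    if os.path.exists(path):
        os.remove(path)


def running_jobs(flag_dir):
    """List the jobs that still have a flag file present."""
    pattern = os.path.join(flag_dir, "*" + FLAG_SUFFIX)
    return sorted(os.path.basename(p)[:-len(FLAG_SUFFIX)] for p in glob.glob(pattern))


def all_jobs_finished(flag_dir):
    return not running_jobs(flag_dir)


# Small demonstration in a throwaway directory.
flag_dir = tempfile.mkdtemp()
set_flag(flag_dir, "job1")
set_flag(flag_dir, "job2")
during = running_jobs(flag_dir)
clear_flag(flag_dir, "job1")
clear_flag(flag_dir, "job2")
after = all_jobs_finished(flag_dir)
```

One caveat with this design: a monitor that polls before any job has written its flag will also see "all finished", so the monitor needs some other signal that the run has actually started.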
PlanningDev
Community Contributor
Posts: 349
Joined: Tue Aug 17, 2010 6:31 am
OLAP Product: Planning Analytics
Version: 2.0.5
Excel Version: 2016

Re: 952 PI - management of multithreaded processes

Post by PlanningDev »

Does anyone have any thoughts on this approach?

What if you called a TI process that wrote the status from the command line or the API? I'm assuming that using this method would decouple the commit from the original process.

You would need to pass parameters to either your own API or to the TM1RunTI.exe but it might work.
qml
MVP
Posts: 1094
Joined: Mon Feb 01, 2010 1:01 pm
OLAP Product: TM1 / Planning Analytics
Version: 2.0.9 and all previous
Excel Version: 2007 - 2016
Location: London, UK, Europe

Re: 952 PI - management of multithreaded processes

Post by qml »

I think this could work, definitely worth trying.
Kamil Arendt
David Usherwood
Site Admin
Posts: 1453
Joined: Wed May 28, 2008 9:09 am

Re: 952 PI - management of multithreaded processes

Post by David Usherwood »

@qml, I went for the file system approach and it's working well. However I'm still having major issues with a flood of cube dependency messages which I'm trying to manage down without proper documentation from IBM, plus a hard to locate secondary pair of dependencies which appear to be both ways between an attribute cube and a driver cube (Why :x ). So haven't achieved the nice smooth PI calculation engine I want. The logging is working nicely though :)
PlanningDev
Community Contributor
Posts: 349
Joined: Tue Aug 17, 2010 6:31 am
OLAP Product: Planning Analytics
Version: 2.0.5
Excel Version: 2016

Re: 952 PI - management of multithreaded processes

Post by PlanningDev »

Just finished testing the TM1RunTI.exe approach and it works. Calling it essentially starts another session, which allows the data commit to happen outside the existing session that PI is holding until the TI completes.
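For anyone wanting to try this, a sketch of how the command line might be assembled, shown in Python for clarity. The `-process`, `-adminhost`, `-server`, `-user` and `-pwd` options are the ones I recall from IBM's TM1RunTI documentation, so check them against your version. The process name `Set Job Status`, its parameters `pJob` and `pStatus`, and the host, server and credentials are all made up.

```python
import subprocess


def build_tm1runti_command(exe_path, admin_host, server, user, password,
                           process, **params):
    """Build the argument list for a TM1RunTI call (nothing is executed here)."""
    command = [
        exe_path,
        "-process", process,
        "-adminhost", admin_host,
        "-server", server,
        "-user", user,
        "-pwd", password,
    ]
    # Process parameters are passed as name=value pairs after the options.
    for name, value in params.items():
        command.append(f"{name}={value}")
    return command


def run_status_update(command):
    """Run the command in its own session; only call this where TM1RunTI exists."""
    return subprocess.run(command, check=True)


# Hypothetical values throughout.
command = build_tm1runti_command(
    "tm1runti.exe", "localhost", "planning", "admin", "secret",
    "Set Job Status", pJob="job1", pStatus="1",
)
```

From inside a TI the equivalent string would be handed to ExecuteCommand, so the status write commits in its own session instead of waiting for the calling process to finish.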
ellissj3
Posts: 54
Joined: Tue Jun 15, 2010 1:43 pm
OLAP Product: Cognos TM1
Version: 9.0 - 10.2
Excel Version: 2010

Re: 952 PI - management of multithreaded processes

Post by ellissj3 »

David Usherwood wrote:Forumers may recall the humungous model I have which takes days to process.
I've been liaising with IBM on using PI to parallel process the calculations, which led to 952 FP1 HF6, which significantly improves process time (reduced by a factor of 200). I'm now exploring how to manage such a set of parallel processes (PPs).
We know that turning logging on or off, or running SaveDataAll, can't be used in PPs as you get locks. So I have added monitoring code to the main TIs which update a status cell on at the start and off at the end, then written a TI which runs every 5 minutes monitoring the total of those cells, aiming to reactivate logging and do an SDA when the jobs are all complete. BUT... as far as I can see, the monitoring TI does not see the 1s in the cube. I've looked myself at the relevant cube, and added asciioutput code (with unique timestamped filenames) to dump out the contents, the flags don't show.
Ideas welcome.
If I can't get past this I suppose I'll write flags to the filesystem, but surely TM1 is a better place to do this - especially with PI? Is this a multiuser database or not? Or what have I missed?
Hi,

I am currently using the PP in our model. The way that I got it to work nicely was to create a Process Control cube that holds the information as to what part of the chore is running, and/or has completed. I have a chore that starts off the process (daily), and multiple chores that recur every 30 minutes looking for the completed statuses of the chores and processes so it knows when (via a conditional in the TI) to execute the next batch of processes. I mocked up a .jpg of how I came to this setup.

Hope this helps,
[Attachment: PP_Example.jpg, "Process Control Cube Example"]
David Usherwood
Site Admin
Posts: 1453
Joined: Wed May 28, 2008 9:09 am

Re: 952 PI - management of multithreaded processes

Post by David Usherwood »

Maybe I'm missing something, but I think that your approach ensures that your processes run in sequence, not in parallel. What I'm trying to do is to force a massive amount of calculation through a large complex model in less time than happens now, utilising multiple cores to calculate distinct parts of the same set of cubes, writing the results to a static cube.
Is my understanding (of what you have done) correct?
ellissj3
Posts: 54
Joined: Tue Jun 15, 2010 1:43 pm
OLAP Product: Cognos TM1
Version: 9.0 - 10.2
Excel Version: 2010

Re: 952 PI - management of multithreaded processes

Post by ellissj3 »

David Usherwood wrote:Maybe I'm missing something, but I think that your approach ensures that your processes run in sequence, not in parallel. What I'm trying to do is to force a massive amount of calculation through a large complex model in less time than happens now, utilising multiple cores to calculate distinct parts of the same set of cubes, writing the results to a static cube.
Is my understanding (of what you have done) correct?

I overlooked a vital piece of the picture here. The number on the bottom right logs the processes as they finish, but they run concurrently via the TM1RunTI.exe in the Command Line.

My apologies,
PlanningDev
Community Contributor
Posts: 349
Joined: Tue Aug 17, 2010 6:31 am
OLAP Product: Planning Analytics
Version: 2.0.5
Excel Version: 2016

Re: 952 PI - management of multithreaded processes

Post by PlanningDev »

How do you control the serial updates for dimensions? Do you put all of the dimension updates in a chore to serialize them?

Also, does anyone know if you can run dimension updates in parallel as long as they don't have security and aren't related to any of the same cube objects?
PlanningDev
Community Contributor
Posts: 349
Joined: Tue Aug 17, 2010 6:31 am
OLAP Product: Planning Analytics
Version: 2.0.5
Excel Version: 2016

Re: 952 PI - management of multithreaded processes

Post by PlanningDev »

Do you use anything to perform the wait while you're polling to see if processes have completed?
ellissj3
Posts: 54
Joined: Tue Jun 15, 2010 1:43 pm
OLAP Product: Cognos TM1
Version: 9.0 - 10.2
Excel Version: 2010

Re: 952 PI - management of multithreaded processes

Post by ellissj3 »

PlanningDev,
PlanningDev wrote:How do you control the serial updates for dimensions? Do you put all of the dimension updates in a chore to serialize them?

Also, does anyone know if you can run dimension updates in parallel as long as they don't have security and aren't related to any of the same cube objects?
I am unsure about parallel dimension updating. I haven't tested that.

After a dimension update or a rule save, you'll have to re-establish the cube dependencies.
PlanningDev wrote:Do you use anything to perform the wait while you're polling to see if processes have completed?
I don't ask the TI to wait at the completion of the TM1RunTI execution. I have multiple chores that are scheduled to recur throughout the day, but they only kick off when certain criteria are satisfied.

The policing occurs in the first process of each chore. The example above is the first part of many. For the second chore to execute, the first chore has to be completed, and all the individual threads must be finished processing (which is the number on the bottom right). Otherwise the process executes a ChoreQuit and exits the chore. This chore (along with all the others) will recur throughout the day.
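A minimal sketch of that policing check, in Python for illustration. In the real model the three inputs would be read from the Process Control cube; the function and argument names here are invented.

```python
def next_batch_action(previous_chore_complete, threads_finished, threads_expected):
    """Decide what the first process of a chore should do.

    Returns "run" when the previous chore is complete and every parallel
    thread has logged its finish; otherwise "ChoreQuit", meaning the chore
    exits and tries again at its next scheduled run.
    """
    if previous_chore_complete and threads_finished >= threads_expected:
        return "run"
    return "ChoreQuit"


still_waiting = next_batch_action(True, threads_finished=3, threads_expected=4)
ready = next_batch_action(True, threads_finished=4, threads_expected=4)
blocked = next_batch_action(False, threads_finished=4, threads_expected=4)
```

Because the chore simply quits and recurs, no process ever sits waiting, at the cost of up to one polling interval of delay between batches.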
PlanningDev
Community Contributor
Posts: 349
Joined: Tue Aug 17, 2010 6:31 am
OLAP Product: Planning Analytics
Version: 2.0.5
Excel Version: 2016

Re: 952 PI - management of multithreaded processes

Post by PlanningDev »

I guess what I was asking is: do you use anything to perform a wait and let one executing TI continuously poll to see whether the criteria are met? It sounds like instead of doing this you have scheduled your data updates to run on some interval and at that point check whether or not all criteria have been met, and if not then ChoreQuit.

Do you disable the schedule in any case if the process has already run? In my case I'm looking to run only once a day but want to achieve parallel loading. I guess my thought was that if you continuously run the update and all criteria have been met, then you don't need to keep running the chore or process until the next day.
ellissj3
Posts: 54
Joined: Tue Jun 15, 2010 1:43 pm
OLAP Product: Cognos TM1
Version: 9.0 - 10.2
Excel Version: 2010

Re: 952 PI - management of multithreaded processes

Post by ellissj3 »

I don't disable the schedule; I just make sure that the criteria can only be satisfied once per day. I keep two log entries for every chore: one indicates that the chore is in progress, and the other that the chore is complete. The chore will only execute when it is neither in progress nor completed.
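Sketched in Python, the two log entries could work like this. Keying the entries by chore and date is an assumption made here so that the guard resets each day; the post above doesn't say how the reset is done. The chore name and dates are invented.

```python
def chore_should_run(log, chore, day):
    """Run only if the chore is neither in progress nor complete for that day."""
    entry = log.get((chore, day), {})
    return not entry.get("in_progress", False) and not entry.get("complete", False)


def mark_started(log, chore, day):
    log[(chore, day)] = {"in_progress": True, "complete": False}


def mark_complete(log, chore, day):
    log[(chore, day)] = {"in_progress": False, "complete": True}


# The schedule fires three times on the same (hypothetical) day.
log = {}
decisions = []
for attempt in range(3):
    run = chore_should_run(log, "load", "2012-03-14")
    decisions.append(run)
    if run:
        mark_started(log, "load", "2012-03-14")
        mark_complete(log, "load", "2012-03-14")

next_day = chore_should_run(log, "load", "2012-03-15")
```

Only the first firing does any work; the later ones fall through to ChoreQuit, and the guard opens again the following day.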