Data Saves not completing and retriggering 10.2.2 FP4

shorsted
Posts: 32
Joined: Mon Dec 15, 2008 4:37 pm

Data Saves not completing and retriggering 10.2.2 FP4

Post by shorsted » Thu Sep 06, 2018 2:32 pm

Hi

We are running into an occasional issue where our data save (running on a chore) does not complete. It starts and we can see it running in TM1Top, but many other chores, processes and data loads then run alongside it. The save does not complete, so it reruns, and again doesn't complete. This might happen 4 or 5 times. We have lost some data this way because the .cub files didn't update, and today it finally resulted in the instance crashing.

We are considering adding the data save onto the end of the big data load processes (it's legacy that it runs on a separate chore), but as we have also experienced issues when 2 data saves overlap we are cautious about that too!

Has anyone experienced something similar? We have PI (parallel interaction) turned on in the cfg. We are running 10.2.2 FP4 on Windows Server 2008.

I am in the process of getting the latest FP available to us in the hope that it resolves the issue, but as we have to go through IT this won't be quick. (The version is probably only FP6, as IT in their wisdom decided to cancel our support contract with IBM; we are in the process of getting this readdressed!)

Any help would be most appreciated!
Sarah

gtonkin
MVP
Posts: 638
Joined: Thu May 06, 2010 3:03 pm
OLAP Product: TM1
Version: PAL 2.0.3
Excel Version: 2016 64-bit
Location: JHB, South Africa

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by gtonkin » Thu Sep 06, 2018 5:19 pm

Is the save in the Prolog? If it is, move it to the Epilog. There have been issues with a SaveDataAll in the Prolog.

tomok
MVP
Posts: 2498
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by tomok » Thu Sep 06, 2018 7:25 pm

You really should NEVER have any other processes running when a Save Data is taking place, it's just going to cause problems. Turning on PI is not going to help because Save Data saves the entire cube at once, not just the area you changed. You need to go back and create a schedule that ensures that no other processes will run while the Save Data is taking place.
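
A minimal sketch of the kind of schedule check described above. It is written in Python rather than TI, purely to illustrate the idea, and the chore names, start times and durations are made up; in practice they would come from your own chore schedule and observed run times.

```python
from datetime import datetime, timedelta


def overlaps(start_a, end_a, start_b, end_b):
    """True if the half-open intervals [start_a, end_a) and [start_b, end_b) intersect."""
    return start_a < end_b and start_b < end_a


def chores_in_save_window(save_start, save_minutes, chores):
    """Return the names of chores whose expected run time intersects the save window.

    chores is a list of (name, start, expected_minutes) tuples (hypothetical data).
    """
    save_end = save_start + timedelta(minutes=save_minutes)
    clashes = []
    for name, start, minutes in chores:
        end = start + timedelta(minutes=minutes)
        if overlaps(save_start, save_end, start, end):
            clashes.append(name)
    return clashes


# Hypothetical schedule for one day
day = datetime(2018, 9, 6)
chores = [
    ("Load GL", day.replace(hour=2, minute=0), 120),
    ("Load Sales", day.replace(hour=7, minute=20), 30),
    ("Refresh Dims", day.replace(hour=8, minute=0), 10),
]

# Save Data planned for 07:30 with an assumed 25 minute run time
clashes = chores_in_save_window(day.replace(hour=7, minute=30), 25, chores)
print(clashes)
```

Any chore reported here would need to be moved, or the save moved, before the window can be called clean. The check only covers scheduled chores; processes started by users are not visible to it.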
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/

jim wood
Site Admin
Posts: 3610
Joined: Wed May 14, 2008 1:51 pm
OLAP Product: TM1
Version: TM1 10.2.2
Excel Version: 2007
Location: 1639 Route 10, Suite 107, Parsippany, NJ, USA

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by jim wood » Thu Sep 06, 2018 7:30 pm

tomok wrote:
Thu Sep 06, 2018 7:25 pm
You really should NEVER have any other processes running when a Save Data is taking place, it's just going to cause problems. Turning on PI is not going to help because Save Data saves the entire cube at once, not just the area you changed. You need to go back and create a schedule that ensures that no other processes will run while the Save Data is taking place.
Agreed, but if you don't want to create a schedule, wrap your processes in a driver process, executing the save as its own process.
Struggling through the quagmire of life to reach the other side of who knows where.
Application Consulting Group (ACG) TM1 Consulting
OS: Windows 10 64-bit. TM1 Version: 10.2.2

Wim Gielis
MVP
Posts: 1830
Joined: Mon Dec 29, 2008 6:26 pm
OLAP Product: TM1
Version: PAL 2.0
Excel Version: 2016
Location: Brussels, Belgium

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by Wim Gielis » Thu Sep 06, 2018 8:39 pm

tomok wrote:
Thu Sep 06, 2018 7:25 pm
You really should NEVER have any other processes running when a Save Data is taking place, it's just going to cause problems.
I fully agree. But then you have the users claiming that deadlines are tight and still somehow want to execute their 'small process' that only takes a couple of minutes...

At a major TM1 site, the save data takes about 20-25 minutes. It is done at 7.30AM, 1.30PM and 7.30PM.
At night, chores run from about 2AM to 7AM. The application is used in Europe only, so there is not much time difference between users.
I've seen cases where the midday savedata starts 35 times before it completes, around 5.30 PM.
There are always users who force the savedata to start over again.

In general, what do people do to enforce a window where the save data can run freely ?
As I understand it, it's very difficult to technically force such a window.
  • Should we use a small cube to put a flag when the save data starts, take it away when the save data is done ? Drawback, several hundreds of processes need to be adapted to look in the cube to know whether they can run or not.
  • Should we postpone the midday savedata and have 2 savedata's instead of 3 ? It's dangerous to lose half a day of work of many users
  • Putting security access to none when the savedata starts seems risky business too.
Other ideas ?
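
The flag idea from the first bullet can be sketched as follows. This is a Python model of the logic only, not TI: `ControlCube`, `run_save` and `run_process` are hypothetical names, and in a real model the flag would presumably be a cell in a small control cube, written with `CellPutN` and read with `CellGetN` in each process's Prolog.

```python
class ControlCube:
    """Stand-in for a small control cube holding a single flag cell (hypothetical)."""

    def __init__(self):
        self.save_running = 0


def run_save(cube, do_save):
    """Raise the flag, run the save, and always clear the flag afterwards."""
    cube.save_running = 1
    try:
        do_save()
    finally:
        # Clear the flag even if the save fails, so processes are not blocked forever
        cube.save_running = 0


def run_process(cube, work):
    """Run work only when no save is flagged; return True if it ran."""
    if cube.save_running:
        return False  # in TI this would be an early exit from the Prolog
    work()
    return True


cube = ControlCube()
log = []


def save():
    # A user process attempted while the save is flagged is refused
    log.append(("during save", run_process(cube, lambda: log.append("work"))))


run_save(cube, save)
# Once the flag is cleared the same process is allowed to run
log.append(("after save", run_process(cube, lambda: log.append("work"))))
print(log)
```

The sketch does not remove the drawback already mentioned: every process still has to be adapted to perform the check. It also does nothing about a process that started before the flag was raised and is still running when the save begins.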
Last edited by Wim Gielis on Fri Sep 07, 2018 6:41 am, edited 1 time in total.
Best regards,

Wim Gielis

Excel Most Valuable Professional, 2011-2014
http://www.wimgielis.com ==> 105 TM1 articles and a lot of custom code
Newest blog article: Looping over input files

gtonkin
MVP
Posts: 638
Joined: Thu May 06, 2010 3:03 pm
OLAP Product: TM1
Version: PAL 2.0.3
Excel Version: 2016 64-bit
Location: JHB, South Africa

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by gtonkin » Fri Sep 07, 2018 6:22 am

Wim Gielis wrote:
Thu Sep 06, 2018 8:39 pm
...
Other ideas ?
Have not tried this myself as yet, but what about a caller process that loops through the cubes and, for each cube, runs a child process to do a CubeSaveData?

Doing this may create breathers at the very least. Not sure how changes to the cube will be treated should the cube being saved change during the save.
I may try this in the near future as we have a similar issue to the one you mention at one large budgeting client.
The trick with these things is to be able to test adequately, with a load and rate of change equivalent to production.

Edit: From the monitoring I have done, most of the time is spent saving the .feeder files.

lotsaram
MVP
Posts: 3141
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by lotsaram » Fri Sep 07, 2018 8:53 am

shorsted wrote:
Thu Sep 06, 2018 2:32 pm
We are running into an occasional issue where our data save (running on a chore) is not completing. It starts, we can see it running in Tm1Top, many other chores, processes and data loads are then happening alongside this.
As others have said and as the IBM documentation states, the save data should run in a "quarantined" window without interruption from other processes.
Wim Gielis wrote:
Thu Sep 06, 2018 8:39 pm
At a major TM1 site, the save data takes about 20-25 minutes. It is done at 7.30AM, 1.30PM and 7.30PM.
At night, chores run from about 2AM to 7AM. The application is used in Europe only, so not much time differences.
I've seen cases where the midday savedata starts 35 times before it completes, around 5.30 PM.
There are always users who force the savedata to start over again.
I hear you and also know the pain of managing a busy production model with users over multiple timezones. Finding windows for batch loads and data save is a challenge! But why are there 3 data saves within the day? And why (oh why oh why) is there one in the middle of the day? Maybe (probably?) there were past issues with stability and frequent data saves were implemented to prevent data loss. But this is simply the wrong approach. If transaction logging is handled correctly then the risk of data loss is already handled. Also today with persistent feeders, multi-threaded feeders on startup, parallel loading, etc. a TM1 server can recover much faster. I don't see a reason to run a SaveDataAll more than 1x per day, typically at the conclusion of the data loading batch window.
Please place all requests for help in a public thread. I will not answer PMs requesting assistance.

Wim Gielis
MVP
Posts: 1830
Joined: Mon Dec 29, 2008 6:26 pm
OLAP Product: TM1
Version: PAL 2.0
Excel Version: 2016
Location: Brussels, Belgium

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by Wim Gielis » Fri Sep 07, 2018 9:23 am

lotsaram wrote:
Fri Sep 07, 2018 8:53 am
shorsted wrote:
Thu Sep 06, 2018 2:32 pm
We are running into an occasional issue where our data save (running on a chore) is not completing. It starts, we can see it running in Tm1Top, many other chores, processes and data loads are then happening alongside this.
As others have said and as the IBM documentation states, the save data should run in a "quarantined" window without interruption from other processes.
Wim Gielis wrote:
Thu Sep 06, 2018 8:39 pm
At a major TM1 site, the save data takes about 20-25 minutes. It is done at 7.30AM, 1.30PM and 7.30PM.
At night, chores run from about 2AM to 7AM. The application is used in Europe only, so not much time differences.
I've seen cases where the midday savedata starts 35 times before it completes, around 5.30 PM.
There are always users who force the savedata to start over again.
I hear you and also know the pain of managing a busy production model with users over multiple timezones. Finding windows for batch loads and data save is a challenge! But why are there 3 data saves within the day? And why (oh why oh why) is there one in the middle of the day? Maybe (probably?) there were past issues with stability and frequent data saves were implemented to prevent data loss. But this is simply the wrong approach. If transaction logging is handled correctly then the risk of data loss is already handled. Also today with persistent feeders, multi-threaded feeders on startup, parallel loading, etc. a TM1 server can recover much faster. I don't see a reason to run a SaveDataAll more than 1x per day, typically at the conclusion of the data loading batch window.
Data losses in the past have certainly contributed to the decision to run 3 savedata operations per day.

Some background... Please comment where you see improvements.

The nightly loads coming from SAP (2AM-7AM) and other sources are committed with a savedata at 7.30 AM. Transaction logging is turned off during the loads.
Then users enter data in a heavy way and also execute a lot of processes. This is a planning application at a very granular level and mostly TI is used to push results to (different) P&L's - 1 P&L for each of the 'tracks' in the model. A savedata at noon is used to confirm the changes in the morning.
The same applies to the afternoon, heavy usage of TM1 and savedata at 7.30 PM.

All cubes have transaction logging turned on outside of TI processes, turned off during process runs.
During the day TI processes also load data from SAP (with the TM1 Package connector), although users in general do not execute the same processes as the ones that run during the night. It can occur though, and then it blocks other users. The main reasons are that metadata and data operations are not split, and that 1 Product dimension (100,000 elements in a retail environment) is shared over the tracks. All tracks use the same product dimension. Having multiple Product dimensions was not an option, I have been told.

Restarting TM1 takes around 30 minutes with all the features you highlighted, memory is at 90 GB currently.
But when the feeders files need to be recreated we have 4.5 hours of startup time.
TM1 Web is used and Perspectives, no PAW or PAX.
lotsaram wrote:
Fri Sep 07, 2018 8:53 am
If transaction logging is handled correctly then risk of data loss is already handled
How does this fit with the explanation above? Do you have any suggestions?
Best regards,

Wim Gielis

lotsaram
MVP
Posts: 3141
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by lotsaram » Fri Sep 07, 2018 9:54 am

Hi Wim,

Based on your description I would do 2x SaveData
- immediately before the loads from SAP (01:30). Make sure all user entered data from daily activities is committed to disk
- immediately after load from SAP finishes (07:00). Make sure all data changes from source system done without transaction logging are committed to disk

Wim Gielis
MVP
Posts: 1830
Joined: Mon Dec 29, 2008 6:26 pm
OLAP Product: TM1
Version: PAL 2.0
Excel Version: 2016
Location: Brussels, Belgium

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by Wim Gielis » Fri Sep 07, 2018 10:01 am

Hi,

Thanks Lotsaram ! That's clear.

In fact, to be complete, I should have added that after daily activity by the user, a few chores kick in (10PM).
Then a Savedata is done at 11.30 PM, after which 7zip does a backup of the TM1 data dir.
Lastly at 2AM the chores load data until 7AM, with a save data to conclude.

So basically, we should get rid of the midday savedata, weighing off:
- the risk of losing up to a day of work, although manual data entry will be collected in the transaction logs so could be recovered. We would lose the data that is loaded / copied / calculated with Turbo Integrator
- ... against the benefit of not having a save data at noon that often takes a long time, starts over again, difficult to find a time window, impacts the users (and vice versa), etc.
Best regards,

Wim Gielis

lotsaram
MVP
Posts: 3141
Joined: Fri Mar 13, 2009 11:14 am
OLAP Product: TableManager1
Version: PA 2.0.x
Excel Version: Office 365
Location: Switzerland

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by lotsaram » Fri Sep 07, 2018 10:50 am

Hi Wim,

Yes basically. I think the middle of the day data save is causing or has potential to cause more problems than it solves. If the server can't acquire the locks it needs for the save data and this is causing "thrashing" with multiple rollbacks and restarts then this is going to have performance consequences and lead to bad user experience. If transaction logging is on for all user changes happening during the day then in the event of any crash the changes would be recovered anyway. You could go a step further and also log TI changes that came from user actions during the day to guarantee 100% recovery.
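
The recovery reasoning above can be made concrete with a small, deliberately simplified model in Python. The change list, the times and the `lost_after_crash` helper are all hypothetical; the only rule modelled is the one stated in this thread, namely that after a crash a change made since the last save is recovered only if it was transaction logged.

```python
def lost_after_crash(changes, last_save):
    """Return the changes that cannot be recovered after a crash.

    changes: list of (time, source, logged) tuples, with time as a zero-padded
    "HH:MM" string so that plain string comparison orders the times correctly.
    """
    return [c for c in changes if c[0] > last_save and not c[2]]


# Hypothetical day: save at 07:30, crash some time in the afternoon
changes = [
    ("06:30", "SAP load (TI)", False),  # unlogged, but before the save, so on disk
    ("09:15", "user entry", True),      # logged, replayed on restart
    ("10:40", "user-run TI", False),    # unlogged and after the save
    ("14:05", "user entry", True),      # logged, replayed on restart
]

lost = lost_after_crash(changes, "07:30")
print(lost)

# Same day with logging also switched on for TI changes triggered by users
all_logged = [(t, source, True) for t, source, _ in changes]
print(lost_after_crash(all_logged, "07:30"))
```

In this model the only exposure without a midday save is the unlogged TI work run by users, which is the gap the last suggestion above is aimed at.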

Wim Gielis
MVP
Posts: 1830
Joined: Mon Dec 29, 2008 6:26 pm
OLAP Product: TM1
Version: PAL 2.0
Excel Version: 2016
Location: Brussels, Belgium

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by Wim Gielis » Fri Sep 07, 2018 10:54 am

lotsaram wrote:
Fri Sep 07, 2018 10:50 am
Hi Wim,

Yes basically. I think the middle of the day data save is causing or has potential to cause more problems than it solves. If the server can't acquire the locks it needs for the save data and this is causing "thrashing" with multiple rollbacks and restarts then this is going to have performance consequences and lead to bad user experience. If transaction logging is on for all user changes happening during the day then in the event of any crash the changes would be recovered anyway. You could go a step further and also log TI changes that came from user actions during the day to guarantee 100% recovery.
Makes sense ! Thanks.

Sarah: apologies for hijacking your thread.
If you feel not all questions are answered, please continue.
Best regards,

Wim Gielis

shorsted
Posts: 32
Joined: Mon Dec 15, 2008 4:37 pm

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by shorsted » Mon Sep 10, 2018 1:58 pm

Hi

No worries about hijacking the thread as your experience seems very similar to our own!

We are also going to turn off our midday save and instead run it later in the evening when there is less user traffic. Unfortunately our users use the system from 6am until midnight and we have files being submitted constantly through that time, so we may need to look at some radical changes. Most of the suggestions are things that we are already considering.

shorsted
Posts: 32
Joined: Mon Dec 15, 2008 4:37 pm

Re: Data Saves not completing and retriggering 10.2.2 FP4

Post by shorsted » Mon Sep 10, 2018 2:01 pm

Sorry posted that last reply too early!

Thanks for the suggestions, I'm not sure we can stop users submitting data in certain windows but are looking at how we might better manage it. :( I will post back if we find any of the suggestions work better than others!

Regards
Sarah
