Chore crashing tm1

mbeckw
Posts: 29
Joined: Tue Jul 08, 2008 5:00 pm
OLAP Product: TM1
Version: 10.1.1
Excel Version: 2010

Post by mbeckw »

I have a chore that I have been using for about 6 months with no problem. All of a sudden it causes tm1 to crash when it is on a schedule. I run the chore manually and have no problem but the scheduled chore always crashes the system. Items that might have an impact are:

we are using v9.1.2 as a service.

running 2 databases (migrating to new cubes in a new database and then deleting the old database)

running about 1.6 Gig of memory on a 32bit server. will be about 1.2 Gig when the old database is deleted.

Thanks
TM1 version 9.1 using excel 2003
Alan Kirk
Site Admin
Posts: 6606
Joined: Sun May 11, 2008 2:30 am
OLAP Product: TM1
Version: PA2.0.9.18 Classic NO PAW!
Excel Version: 2013 and Office 365
Location: Sydney, Australia

Re: Chore crashing tm1

Post by Alan Kirk »

mbeckw wrote:I have a chore that I have been using for about 6 months with no problem. All of a sudden it causes tm1 to crash when it is on a schedule. I run the chore manually and have no problem but the scheduled chore always crashes the system. Items that might have an impact are:

we are using v9.1.2 as a service.

running 2 databases (migrating to new cubes in a new database and then deleting the old database)

running about 1.6 Gig of memory on a 32bit server. will be about 1.2 Gig when the old database is deleted.

Thanks
I suggest checking the log files to see whether another chore is scheduled to / attempting to run at the same time. There was a bug in earlier releases of 9.1 (fixed in SP4) which caused the server to crash if two chores tried to run simultaneously. (Can't recall whether it was in 9.1 SP2.)

Actually it's worth checking the server log anyway to see what (if any) messages are written at the time of the crash. Also, check the Event Viewer on the Windows server (right click on My Computer and select "Manage") to see if any error events were reported.
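Once the start and end times of each chore run have been pulled out of the server log, checking for simultaneous runs is a simple interval comparison. The sketch below is only an illustration in Python, not TM1 code: the chore names and timestamps are hypothetical, and extracting them from the log is left out because the log layout is not shown in this thread.

```python
from datetime import datetime
from itertools import combinations


def overlapping_runs(runs):
    """Return pairs of chore names whose run windows overlap.

    `runs` is a list of (name, start, end) tuples with datetime values.
    """
    pairs = []
    for (a, a_start, a_end), (b, b_start, b_end) in combinations(runs, 2):
        # Two windows overlap when each one starts before the other ends.
        if a_start < b_end and b_start < a_end:
            pairs.append((a, b))
    return pairs


# Hypothetical run windows, typed in by hand from a log.
runs = [
    ("NightlyLoad", datetime(2008, 7, 8, 2, 0), datetime(2008, 7, 8, 2, 45)),
    ("PerfMonCopy", datetime(2008, 7, 8, 2, 30), datetime(2008, 7, 8, 2, 35)),
    ("DataSave", datetime(2008, 7, 8, 3, 0), datetime(2008, 7, 8, 3, 10)),
]
print(overlapping_runs(runs))  # [('NightlyLoad', 'PerfMonCopy')]
```

Any pair it reports is a candidate for rescheduling so that the two chores no longer run at the same time.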

Finally, is one of the commands that you're using SaveDataAll? That thing's been flaky in scheduled chores for ages. I believe it's supposed to have been fixed in the latest releases, but I'll believe it when I see it. If you're doing a data save, you may want to split that off into a separate chore and see whether that's what's triggering the crash.
"To them, equipment failure is terrifying. To me, it’s 'Tuesday.' "
-----------
Before posting, please check the documentation, the FAQ, the Search function and FOR THE LOVE OF GLUB the Request Guidelines.
George Regateiro
MVP
Posts: 326
Joined: Fri May 16, 2008 3:35 pm
OLAP Product: TM1
Version: 10.1.1
Excel Version: 2007 SP3
Location: Tampa FL USA

Re: Chore crashing tm1

Post by George Regateiro »

Checking the log between the two different runs is a good idea. I am on 9.1 sp2 and had an incident where a chore that was running fine when manually kicked off would lock up the server.

Looking at the logs I was seeing different behavior:

1) When run manually, all processes ran in serial
2) When scheduled, all processes ran in parallel. The problem was that the same process ran 4 times during the chore with different parameters, and it locked up the system

Support was unable to help and the eventual solution was to rebuild the chore and now everything is happy.

Not much in the old post, but it shows an example of my log

http://applixforum.olapforums.com/viewP ... eir#161921
mattgoff
MVP
Posts: 516
Joined: Fri May 16, 2008 1:37 pm
OLAP Product: TM1
Version: 10.2.2.6
Excel Version: O365
Location: Florida, USA

Re: Chore crashing tm1

Post by mattgoff »

Locking during competing chores does seem to be fixed for us in 9.1 SP4, BUT it appears (I'm still debugging) that this does not hold true for scheduled replication. I have two different planets with overlapping RESERVE rights-- it looks like if their replications occur at the same time it's random which planet's data wins, even if only one planet has changes.

Meaning/Scenario: Planet A has changes + Planet B has none + coincident replication with a shared Star = original values could be written back to Planet A erasing updates. Since it's a race condition, this can be inconsistent too.

Not sure how many people are using replication, but it's something to be cautious about.

Matt
Please read and follow the Request for Assistance Guidelines. It helps us answer your question and saves everyone a lot of time.
Gregor Koch
MVP
Posts: 263
Joined: Fri Jun 27, 2008 12:15 am
OLAP Product: Cognos TM1, CX
Version: 9.0 and up
Excel Version: 2007 and up

Re: Chore crashing tm1

Post by Gregor Koch »

The "locking during competing chores" is definitely not fixed in 9.1 SP4.

We are testing 9.1 SP4 (to upgrade from 9.0 SP3 U8) and have so many problems with it, one of them being competing chores which hang the server, that the upgrade has been postponed.

Cheers
Alan Kirk
Site Admin
Posts: 6606
Joined: Sun May 11, 2008 2:30 am
OLAP Product: TM1
Version: PA2.0.9.18 Classic NO PAW!
Excel Version: 2013 and Office 365
Location: Sydney, Australia

Re: Chore crashing tm1

Post by Alan Kirk »

Gregor Koch wrote:The "locking during competing chores" is definitely not fixed in 9.1 SP4.

We are testing 9.1 SP4 (to upgrade from 9.0 SP3 U8) and have so many problems with it, one of them being competing chores which hang the server, that the upgrade has been postponed.
I'm not sure whether we've started to talk about two different things here. In 9.1 SP3, if you ran two chores at the same time it wouldn't lock the server or hang it; it would tank the thing, crash it, terminate it with extreme prejudice, the server wouldn't be pining, it would be passed on. The server session would be no more. It ceased to be. It expired and went to meet its maker. This is a late server session. It's a stiff. Bereft of life, it rests in peace. The server session wouldn't voom if you put four thousand volts through it, and believe me after that happened a few times that was awfully tempting.

This crash would be accompanied by a Windows memory read / write error dialog on the server box which would prevent the service from restarting until you cleared it.

That error does indeed appear to have been fixed in SP4, or at the very least we haven't been able to reproduce it using the same chores that would crash SP3.

This of course doesn't necessarily mean that there aren't OTHER problems with it, one of which may be what's being referred to here, though we haven't encountered them as yet.
Gregor Koch
MVP
Posts: 263
Joined: Fri Jun 27, 2008 12:15 am
OLAP Product: Cognos TM1, CX
Version: 9.0 and up
Excel Version: 2007 and up

Re: Chore crashing tm1

Post by Gregor Koch »

Alan,

I think I did start talking about something else. Sorry about that.

Wasn't talking about a (Monty Python-) server crash.
In the end the result for us is pretty similar though, because after the two competing chores hang each other all that is left to do is to restart the server.

The solution, and I think this was mentioned elsewhere before, is to have the involved processes write a flag to indicate they are running and have all chores check that flag.
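As a rough illustration of that flag idea, here is a sketch in Python rather than TM1 code. It assumes the flag is a file on disk; inside TM1 it could just as well be a cell in a control cube, which is also an assumption. The names `running_flag`, `run_chore` and `chore.running` are made up for the example.

```python
import os
import tempfile
from contextlib import contextmanager


class ChoreAlreadyRunning(Exception):
    """Raised when another chore already holds the running flag."""


@contextmanager
def running_flag(path):
    # O_CREAT | O_EXCL makes "check the flag" and "set the flag" one atomic
    # step, so two chores starting together cannot both succeed.
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise ChoreAlreadyRunning(path) from None
    with os.fdopen(fd, "w") as flag:
        flag.write(str(os.getpid()))
    try:
        yield
    finally:
        os.remove(path)  # clear the flag even if the chore body fails


def run_chore(name, flag_path, body):
    """Run body() under the flag, or report that the chore was skipped."""
    try:
        with running_flag(flag_path):
            body()
            return f"{name}: completed"
    except ChoreAlreadyRunning:
        return f"{name}: skipped, another chore is running"


flag_path = os.path.join(tempfile.mkdtemp(), "chore.running")
# Chore B starts while chore A still holds the flag.
inner = []
outer = run_chore("A", flag_path,
                  lambda: inner.append(run_chore("B", flag_path, lambda: None)))
print(outer)     # A: completed
print(inner[0])  # B: skipped, another chore is running
```

One caveat with any flag of this kind: if the server dies while the flag is set, the flag stays behind and has to be cleared by hand or on startup before the chores will run again.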

Cheers

PS:
"Tis but a scratch"
"A scratch?! Your arm's off!"
"No, it isn't."
Steve Vincent
Site Admin
Posts: 1054
Joined: Mon May 12, 2008 8:33 am
OLAP Product: TM1
Version: 10.2.2 FP1
Excel Version: 2010
Location: UK

Re: Chore crashing tm1

Post by Steve Vincent »

<refrains from Monty Python post in aid of remaining on topic>
Had that happen in 9.0.3 on a few occasions too, so it's been around a while. It involved one chore that copies some performance monitor stuff, and any number of others that kicked in while it was running. Sometimes it would complete (but much, much slower than usual); other times I decided to give up and killed the service manually. Always seemed to happen at the worst possible time too...

I have played with 9.4 a bit (but had to remove it today due to other work earmarked for the server) but didn't get to testing the chore issue. I'd had enough problems with the new audit logs before I started to look into it in any depth.

now where is that manual on how not to be seen...
If this were a dictatorship, it would be a heck of a lot easier, just so long as I'm the dictator.
Production: Planning Analytics 64 bit 2.0.5, Windows 2016 Server. Excel 2016, IE11 for t'internet