Server Crash Due To Page File

kenship
Posts: 93
Joined: Thu May 30, 2013 1:41 pm
OLAP Product: Cognos
Version: TM1 10.2
Excel Version: 2010

Server Crash Due To Page File

Post by kenship » Thu Mar 01, 2018 3:28 pm

One of our instances has crashed 3 times in the last 2 days. The immediate cause is that there was no free disk space left for reading and writing temporary files. At worst, only 80k was available before a crash. We have a small model, and here's what I know about our setup:

Total memory = 32GB
Total disk space = 100GB
Total file size = 40 GB
Total memory used according to StatsForServer normally = 6GB
Total memory for cubes according to StatsByCube = 5GB

We noticed that all available disk space (60GB) was taken up by the page file before the crash. StatsForServer reported the following before the crash:

Number of Connection: 27
Number of Active Threads: 27

Memory Used: 98,345,277,816
Memory in Garbage: 352,064,512

Memory used and active threads were far higher than our normal levels (6GB and 3-5 active threads).
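As a rough sanity check on these figures, the reported Memory Used can be compared with installed RAM plus the disk space the page file could grow into. This is only a sketch: it assumes StatsForServer reports bytes and that the 32GB and 60GB figures are binary gigabytes (GiB), neither of which is confirmed in the post.

```python
# Figures taken from the post; the unit interpretation is an assumption.
GIB = 1024 ** 3

memory_used_bytes = 98_345_277_816   # "Memory Used" from StatsForServer
memory_garbage_bytes = 352_064_512   # "Memory in Garbage"
ram_gib = 32                         # installed memory
free_disk_gib = 60                   # disk space the page file could grow into


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


used_gib = to_gib(memory_used_bytes)
ceiling_gib = ram_gib + free_disk_gib  # RAM plus maximum page file growth

print(f"Memory used:    {used_gib:.1f} GiB")
print(f"In garbage:     {to_gib(memory_garbage_bytes):.2f} GiB")
print(f"RAM + pagefile: {ceiling_gib} GiB")
print(f"Used / ceiling: {used_gib / ceiling_gib:.1%}")
```

Under those assumptions the server was using roughly 91.6 GiB against a 92 GiB ceiling, which is consistent with the page file having consumed all of the free disk space.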

We checked the log and couldn't find anything meaningful to diagnose the situation. IT told us that the page file grew quickly, but we don't know what caused it.

Any comments or suggestions on what we should look at would be greatly appreciated.

Kenneth

tomok
MVP
Posts: 2412
Joined: Tue Feb 16, 2010 2:39 pm
OLAP Product: TM1, Palo
Version: Beginning of time thru 10.2
Excel Version: 2003-2007-2010-2013
Location: Atlanta, GA

Re: Server Crash Due To Page File

Post by tomok » Thu Mar 01, 2018 4:21 pm

I don't know all the ins and outs of paging files, but I will tell you that, based on my experience, you are crazy to run TM1 on a server where the available disk space is only about 3 times the memory you have installed. It should be more like a factor of 10 or more.
Tom O'Kelley - Manager Finance Systems
American Tower
http://www.onlinecourtreservations.com/

David Usherwood
Site Admin
Posts: 1314
Joined: Wed May 28, 2008 9:09 am

Re: Server Crash Due To Page File

Post by David Usherwood » Thu Mar 01, 2018 5:50 pm

Turn on Performance Monitor and study }StatsByCube and }StatsForServer - that should tell you what's eating memory (and thus your pagefile).

gtonkin
MVP
Posts: 578
Joined: Thu May 06, 2010 3:03 pm
OLAP Product: TM1
Version: PAL 2.0.1
Excel Version: 2016 64-bit
Location: JHB, South Africa

Re: Server Crash Due To Page File

Post by gtonkin » Thu Mar 01, 2018 5:59 pm

It may also be worthwhile deleting your feeder files if you are using persistent feeders, especially if you have been making rule changes and other structural changes.
In some dev environments I have noticed that you can accumulate a lot of rubbish, which then seems to be loaded even if not needed, purely because it is in the feeders file.
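Before deleting anything, it can help to see how large the feeder files actually are. The sketch below only lists files and never deletes them. It assumes persistent feeders are stored as `*.feeders` files in the instance's data directory; the function name and the path are hypothetical, so adjust them to your own setup.

```python
from pathlib import Path


def list_feeder_files(data_dir: str) -> list[tuple[str, int]]:
    """Return (file name, size in bytes) for each *.feeders file, largest first."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    files = [(p.name, p.stat().st_size) for p in root.glob("*.feeders")]
    return sorted(files, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical path; replace with the instance's actual data directory.
    for name, size in list_feeder_files(r"D:\TM1\Data"):
        print(f"{size / 1024 ** 2:10.1f} MB  {name}")
```

Unusually large files for small cubes would be the ones to look at first. Any actual deletion is typically done while the instance is stopped, so the feeders are rebuilt on the next start.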

Steve Rowe
Site Admin
Posts: 1742
Joined: Wed May 14, 2008 4:25 pm
OLAP Product: TM1
Version: 10.2.2., PAW
Excel Version: Nearly all of them

Re: Server Crash Due To Page File

Post by Steve Rowe » Thu Mar 01, 2018 6:00 pm

Have you by any chance got any CA reports looking at (the new) hierarchies?

The issue should be relatively easy to diagnose just by monitoring the DB and the stats cube, since you have a repeatable issue. You'll need to monitor what's going on inside the DB; the external stats you list don't really help, other than showing that it is using a bunch of RAM for some reason.

kenship
Posts: 93
Joined: Thu May 30, 2013 1:41 pm
OLAP Product: Cognos
Version: TM1 10.2
Excel Version: 2010

Re: Server Crash Due To Page File

Post by kenship » Thu Mar 01, 2018 6:09 pm

Here's the interesting thing, we haven't loaded or modified any new dimension, cube, hierarchy whatsoever.

When the memory reported by the server stats reached 98G, my cube stats totalled only 6G, which gives me no clue as to which cube is causing the issue.

Kenneth



gtonkin
MVP
Posts: 578
Joined: Thu May 06, 2010 3:03 pm
OLAP Product: TM1
Version: PAL 2.0.1
Excel Version: 2016 64-bit
Location: JHB, South Africa

Re: Server Crash Due To Page File

Post by gtonkin » Thu Mar 01, 2018 6:22 pm

kenship wrote:
Thu Mar 01, 2018 6:09 pm
Here's the interesting thing, we haven't loaded or modified any new dimension, cube, hierarchy whatsoever.

Have you been loading data via SQL by any chance? I noticed a similar increase in memory at a client. Luckily it did not crash, but I seem to remember garbage collection also sitting with quite a lot of memory used.
P.S. I'm not exactly sure whether you meant loading cubes or loading data into cubes.

kenship
Posts: 93
Joined: Thu May 30, 2013 1:41 pm
OLAP Product: Cognos
Version: TM1 10.2
Excel Version: 2010

Re: Server Crash Due To Page File

Post by kenship » Thu Mar 01, 2018 6:43 pm

We have one daily process. It ran about 9 hours before the crash, and we didn't notice any error messages.

By loading I meant running TI to load a csv file for building a dimension. What I meant to say was that we have not been doing anything out of the ordinary for the last few years.

Right now I'm monitoring the free space available and it hasn't changed much since server restart about 22 hours ago.
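The manual check described above could be scripted. This is a minimal sketch using Python's standard `shutil.disk_usage`; the `watch` helper, the threshold, and the sampling interval are illustrative assumptions and not part of any TM1 tooling.

```python
import shutil
import time
from datetime import datetime


def free_gib(path: str) -> float:
    """Free space on the volume containing `path`, in GiB."""
    return shutil.disk_usage(path).free / 1024 ** 3


def watch(path: str, threshold_gib: float, interval_s: float, samples: int) -> list[float]:
    """Sample free space `samples` times, flagging readings below the threshold."""
    readings = []
    for i in range(samples):
        free = free_gib(path)
        readings.append(free)
        flag = "  <-- LOW" if free < threshold_gib else ""
        print(f"{datetime.now():%Y-%m-%d %H:%M:%S}  free={free:8.2f} GiB{flag}")
        if i < samples - 1:
            time.sleep(interval_s)  # wait between samples, not after the last one
    return readings


if __name__ == "__main__":
    # Hypothetical values: current volume, 10 GiB alert threshold, 1 s between samples.
    watch(".", threshold_gib=10, interval_s=1, samples=3)
```

In practice a longer interval (a minute or more) written to a log file would give a timeline to line up against the TM1 server log and the stats cubes when the page file starts to grow.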

Kenneth

Mark RMBC
Posts: 132
Joined: Tue Sep 06, 2016 7:55 am
OLAP Product: TM1
Version: 10.1.1
Excel Version: Excel 2010

Re: Server Crash Due To Page File

Post by Mark RMBC » Fri Mar 02, 2018 10:04 am

Hi,

I wasn't directly involved, but I seem to remember a server crash where the error said the paging file was too small. I think the problem was related to the fact that, in one of the CFG files, a server's AdminHost was pointing to the live URL and not the test environment.

Hope that at least rules out one possibility!

cheers, Mark
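One way to rule out the AdminHost possibility mentioned above is to read the value straight from each instance's config file. The sketch below assumes the file follows the usual INI-style layout with a `[TM1S]` section; the sample content, the host name, and the "live" naming check are hypothetical.

```python
import configparser


def read_admin_host(cfg_text: str) -> str:
    """Return the AdminHost value from the text of a tm1s.cfg-style file."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(cfg_text)
    # Option names are matched case-insensitively by configparser.
    return parser.get("TM1S", "AdminHost", fallback="")


# Hypothetical file content for illustration only.
sample = """
[TM1S]
ServerName=TestInstance
AdminHost=live-server.example.com
PortNumber=12345
"""

host = read_admin_host(sample)
print(f"AdminHost = {host!r}")
# Naive check that assumes live hosts have "live" in their name.
if "live" in host.lower():
    print("Warning: this instance appears to point at the live admin host")
```

For a real file, the text would come from something like `Path(cfg_path).read_text()`, and the check would compare against whatever host names your environments actually use.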
