Memory Used to Text File Size Ratio

jim wood
Site Admin
Posts: 3951
Joined: Wed May 14, 2008 1:51 pm
OLAP Product: TM1
Version: PA 2.0.7
Excel Version: Office 365
Location: 37 East 18th Street New York

Memory Used to Text File Size Ratio

Post by jim wood »

Guys,

I am trying to export a big cube (the whole cube) to a flat file, and it seems to be taking forever. The cube has no rules attached but contains lots of data, and it currently takes around 9GB of memory on our 64-bit UNIX version of 9.1 SP3 U2.

I know that 64-bit cubes can take up to twice as much memory, which is why I am unsure what the final export size will be,
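
For reference, the export itself is just a TurboIntegrator process reading a cube view. A minimal sketch of the Prolog, assuming the process was created with a cube-view data source (the cube, view and file names here are made up):

# Prolog -- build a view of stored leaf cells only and point the
# process at it. Names are illustrative.
sCube = 'BigCube';
sView = 'zExport';
sFile = 'bigcube_export.cma';

IF(ViewExists(sCube, sView) = 1);
  ViewDestroy(sCube, sView);
ENDIF;
ViewCreate(sCube, sView);
ViewExtractSkipCalcsSet(sCube, sView, 1);
ViewExtractSkipRuleValuesSet(sCube, sView, 1);
ViewExtractSkipZeroesSet(sCube, sView, 1);

DatasourceNameForServer = sCube;
DatasourceCubeview = sView;

# Data tab -- one line per populated cell, e.g. for a 7-dimension cube:
# ASCIIOutput(sFile, vDim1, vDim2, vDim3, vDim4, vDim5, vDim6, vDim7,
#             NumberToString(vValue));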

Jim.
Struggling through the quagmire of life to reach the other side of who knows where.
Martin Ryan
Site Admin
Posts: 1988
Joined: Sat May 10, 2008 9:08 am
OLAP Product: TM1
Version: 10.1
Excel Version: 2010
Location: Wellington, New Zealand

Re: Memory Used to Text File Size Ratio

Post by Martin Ryan »

I think the flat-file export of a rule-free cube ends up quite a bit larger than the cube's size on disk, because every row has to spell out all the element references as well as the value. As for how RAM relates to disk space, that depends on a lot of things, including your dimension order.

I would guesstimate (and it's only a guess) that your flat file would end up around three times the size of your cube on disk. If you have a lot of dimensions in your cube, the multiple might be higher.

Martin
Please do not send technical questions via private message or email. Post them in the forum where you'll probably get a faster reply, and everyone can benefit from the answers.
jim wood
Site Admin
Posts: 3951
Joined: Wed May 14, 2008 1:51 pm
OLAP Product: TM1
Version: PA 2.0.7
Excel Version: Office 365
Location: 37 East 18th Street New York

Re: Memory Used to Text File Size Ratio

Post by jim wood »

Thanks Martin. The size of the cube on the hard drive is just under 4GB. Add to this the fact that it has 7 dimensions (one with over 480,000 members) and I think I can expect a flat file of over 12GB. Now the fun bit: I am exporting this to test whether Essbase can do what TM1 is doing at the moment. Should be interesting,

Jim.
Struggling through the quagmire of life to reach the other side of who knows where.
ykud
MVP
Posts: 148
Joined: Sat Jan 10, 2009 10:52 am

Re: Memory Used to Text File Size Ratio

Post by ykud »

jim wood wrote: I am exporting this to test whether Essbase can do what TM1 is doing at the moment. Should be interesting,
Sorry for raising an old thread: Jim, any luck with this test?
And a few more questions concerning your current TM1 cube:
- what's the amount of data loaded (in # of rows and GBs)?
- what's the load time? Are you loading incrementally, or with parallel load threads?

Just as a point of comparison, I've recently built an Essbase cube loading 70GB (~600 million rows, so roughly 250,000 rows/s during the load phase) in under 1.5 hours (40 minutes load, 50 minutes aggregation calc) on a 4-CPU, 20GB RAM server. Final cube size: 10GB.
jim wood
Site Admin
Posts: 3951
Joined: Wed May 14, 2008 1:51 pm
OLAP Product: TM1
Version: PA 2.0.7
Excel Version: Office 365
Location: 37 East 18th Street New York

Re: Memory Used to Text File Size Ratio

Post by jim wood »

Hello there,

We did complete the test. The final extract from the TM1 cube was 12GB in size, and it was loaded into Essbase without any worries. We also managed to replicate the process we run each week, where in TM1 we upload another week's worth of information, rebuild a large product hierarchy (over 50,000 elements) and export two years' worth of data for loading into the cubes downstream. Where Essbase struggled was beyond this cube: in some of the other cubes (based on this information) we have dimensions with multiple hierarchies, where elements have multiple shared parents. We managed to work around this (kind of), but it is still a problem going forward.

As things stand, the business has still to choose between IBM/Cognos and Oracle/Hyperion for its MI tool set. We have proven that both OLAP tools are capable of doing what the other is being used for,

Jim.

Apologies for the lack of size information. We didn't really look at this, as Essbase seemed to take the data without issue. (We used ASO rather than BSO, by the way; the load of the 12GB file took around 35 minutes.)
Struggling through the quagmire of life to reach the other side of who knows where.
ykud
MVP
Posts: 148
Joined: Sat Jan 10, 2009 10:52 am

Re: Memory Used to Text File Size Ratio

Post by ykud »

Jim, another couple of questions, if I may:

1) Are there any estimates for data loading speed? We currently get something like 30k rows/s (3GB in 15 minutes) on a single thread using TurboIntegrator -- is that normal?
2) Any way to run a multithreaded data load? As far as I understand it, we've got to split the data into several files and start several TI processes using BatchUpdate -- is that right?

Thanks, Yuri.
jim wood
Site Admin
Posts: 3951
Joined: Wed May 14, 2008 1:51 pm
OLAP Product: TM1
Version: PA 2.0.7
Excel Version: Office 365
Location: 37 East 18th Street New York

Re: Memory Used to Text File Size Ratio

Post by jim wood »

Hi Yuri,

Data load speed via TI can depend heavily on network traffic. It also depends on how many calculations you have within the process. Also, if you have added any variables, make sure they are set to Data only (unless you are amending dimensions): if code appears on both the Data and Metadata tabs, the process effectively runs twice.
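
To illustrate that point with a minimal sketch (the cube and variable names are made up): the line below belongs on the Data tab only. If the same code is copied onto the Metadata tab as well, every source record gets processed on both passes.

# Data tab only -- one CellPutN per source record.
# 'SalesCube' and the variable names are illustrative.
CellPutN(vValue, 'SalesCube', vProduct, vWeek, vMeasure);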

As for multithreaded loads, this is a restriction of TM1 in general rather than just TI. Splitting and loading is one option, but it depends on your version: version 8 still locks the system while data is being loaded, and some versions of 9 crash if you try to run processes in parallel via the scheduler.
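
For what it's worth, the split-and-load pattern usually looks something like this sketch (the process, parameter and file names are made up). Note that ExecuteProcess is synchronous, so from a single master process the chunks still run one after another; genuine parallelism would need the scheduler, with the version caveats above.

# Master process Prolog -- run one child load process per file chunk.
# Process, parameter and file names are illustrative.
ExecuteProcess('Load.SalesChunk', 'pFile', 'sales_part1.csv');
ExecuteProcess('Load.SalesChunk', 'pFile', 'sales_part2.csv');
ExecuteProcess('Load.SalesChunk', 'pFile', 'sales_part3.csv');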

I tend to find that most of my data loads run out of hours; the scheduler within TM1 gives you this option. Loads during the day tend to take longer, as network traffic and TM1 usage increase.

I hope that helps,

Jim.
Struggling through the quagmire of life to reach the other side of who knows where.
Martin Ryan
Site Admin
Posts: 1988
Joined: Sat May 10, 2008 9:08 am
OLAP Product: TM1
Version: 10.1
Excel Version: 2010
Location: Wellington, New Zealand

Re: Memory Used to Text File Size Ratio

Post by Martin Ryan »

ykud wrote: 2) Any way to run a multithreaded data load? As far as I understand it, we've got to split the data into several files and start several TI processes using BatchUpdate -- is that right?
No (at least not yet). If you set off multiple TI processes simultaneously, all that will happen is that process B will wait for process A to finish before it starts. There's a bit of overhead at the start and finish of a process, so you'll actually find this method slower.

I haven't used Batch Update, so can't make a useful comment on it.

As Jim notes, running loads overnight is a good way of speeding things up without annoying users.

Cheers,
Martin
Please do not send technical questions via private message or email. Post them in the forum where you'll probably get a faster reply, and everyone can benefit from the answers.