Check if the data of two cubes are the same

Posted: Thu Jul 12, 2018 5:11 am
by jimhung777
Hello everyone,

We are considering different methods to load/write a cube. In general, if we use concurrent TI processes, the total job finishes faster. But since each TI is called by another TI, which is in turn called by yet another TI, the job runs through several layers. This also means we cannot simply check one or a few views to determine whether the two cubes that hold the results of the corresponding AS-IS and TO-BE processes are identical. We need the results to be the same, and then we will pick the method or algorithm with the shorter execution time. The question is: during this trial-and-error experiment, how can we verify that the control cube and the test cube are actually the same?

Are there any suggestions, such as looping through the whole of both cubes to check that they are the same, in both structure and data?

Many thanks in advance. :) :)

Re: Check if the data of two cubes are the same

Posted: Thu Jul 12, 2018 9:31 am
by David Usherwood
Where to start...
Firstly, it wouldn't be sensible to review the destination cube till all the concurrent processes are complete.
Secondly, TM1 exists to roll up data to totals. If the dimensions of your cubes have been designed sensibly there will be totals on almost all dimensions. Create a slice showing source and destination totals and differences, with some pretty conditional formatting. Job done.

Re: Check if the data of two cubes are the same

Posted: Thu Jul 12, 2018 2:26 pm
by jimhung777
Thank you for the reply.

1. Yes. We want to compare the two cubes after all processing has finished. It is probably not just concurrent vs. not; it is more about comparing every control/test pair of processes to find out which is the best practice for manipulating the data.

2. I'd like to hear more about this, since the processes we are running are a kind of allocation. That is, whether we allocate the weights [10, 30, 20] to [A, B, C] or the weights [20, 10, 30] to [A, B, C], the consolidation of A, B and C is 60 in both cases. But the results are actually different.

Or look at it another way, where we want the total of A:

A1, B1, C1=[10, 30, 20],
A2, B2, C2=[20, 10, 30]
or
A1, B1, C1=[20, 30, 10],
A2, B2, C2=[10, 20, 30]

Then the total of A is still 30, while the total of B is clearly different. We could design a consolidation like Total = A+B+C+...+Z, but it is hard to build ones like A = A1+A2+..., B = B1+B2+..., Z = Z1+Z2+..., especially if we have something like 36000 elements in each dimension and 25 dimensions in one cube. It is hard to find which of the 36000 element totals differs by dragging and dropping in a cube view.
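To make the numbers above concrete, here is a small Python sketch (not TM1 code). The two dictionaries reproduce the second pair of allocations listed above, and the helper name `totals_by` is made up for illustration. It only shows how a grand total, and even some per-element totals, can match while leaf cells differ.

```python
# Leaf cells keyed by (letter, suffix); values taken from the example above.
control = {("A", 1): 10, ("B", 1): 30, ("C", 1): 20,
           ("A", 2): 20, ("B", 2): 10, ("C", 2): 30}
test = {("A", 1): 20, ("B", 1): 30, ("C", 1): 10,
        ("A", 2): 10, ("B", 2): 20, ("C", 2): 30}


def totals_by(cells, position):
    """Sum the leaf cells, grouped by the coordinate at `position`."""
    out = {}
    for key, value in cells.items():
        out[key[position]] = out.get(key[position], 0) + value
    return out


# The grand totals agree, so a top-level check alone sees no difference.
grand_control = sum(control.values())
grand_test = sum(test.values())

# Totals per letter: A agrees, B and C do not.
letter_control = totals_by(control, 0)
letter_test = totals_by(test, 0)
letters_differing = sorted(k for k in letter_control
                           if letter_control[k] != letter_test[k])

# Leaf-level comparison finds every cell that changed.
leaves_differing = sorted(k for k in control if control[k] != test[k])

print(grand_control, grand_test)
print(letters_differing)
print(leaves_differing)
```

The more consolidations agree, the more confidence you have, but only the leaf-level comparison is conclusive.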

Maybe I am wrong; please point me in the right direction.

Re: Check if the data of two cubes are the same

Posted: Thu Jul 12, 2018 6:52 pm
by Wim Gielis
Then you can:

1) Set up a 2nd cube next to the first cube, with the correct dimensions if they differ from the dimensions in the first cube
2) You have 2 sets of processes, one set to each cube, and with the different logic
3) You run the processes
4) You and some colleagues cancel your plans for the next weekend and check cube views at all kinds of levels in the cubes during the next weekend. Maybe let's add the weekend after as well.

Seriously though: if you cannot rely on checking only top-level aggregations and a limited number of other cells, then you will have to check the results manually. Or create text file exports from the cubes and compare those with, for example, Notepad++ (Compare option) or similar tools. Or write a rule in a 3rd (similar) cube that subtracts cube 2 from cube 1; if the result is not 0, then you have a difference in either or both cubes.
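If the exports are too large for a visual compare tool, the text file comparison could be scripted. Below is a minimal Python sketch under some loud assumptions: each export line is comma-separated with the element names first and a numeric value in the last column, string cells are not exported, and a cell missing from one file counts as zero. The function names `load_export` and `diff_exports` are hypothetical, not part of any TM1 tooling.

```python
import csv


def load_export(lines):
    """Map each coordinate tuple to its value (assumed to be the last column)."""
    cells = {}
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        *coords, value = row
        cells[tuple(coords)] = float(value)
    return cells


def diff_exports(lines_a, lines_b, tolerance=1e-9):
    """Return (coordinates, value_a, value_b) for every cell that differs."""
    a = load_export(lines_a)
    b = load_export(lines_b)
    diffs = []
    for key in sorted(a.keys() | b.keys()):
        # Assumption: a cell absent from an export is an (unexported) zero.
        value_a = a.get(key, 0.0)
        value_b = b.get(key, 0.0)
        if abs(value_a - value_b) > tolerance:
            diffs.append((key, value_a, value_b))
    return diffs


# Usage with two hypothetical export files:
# with open("control.csv", newline="") as fa, open("test.csv", newline="") as fb:
#     for coords, va, vb in diff_exports(fa, fb):
#         print(coords, va, vb)
```

Because the cells are keyed by their coordinates, the two exports do not need to be in the same row order, which a line-by-line compare tool would require.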

Re: Check if the data of two cubes are the same

Posted: Fri Jul 13, 2018 1:40 am
by jimhung777
I think this is a good idea! If possible, a single rule expression could uncover the differences. :)
Could I just do this?

[Cube3] = [Cube1] - [Cube2];
My other initial idea is to write a process like Bedrock.Cube.Clone. First, loop through the dimensions to check the structure (name, type, order, level, attributes...), then loop through the data by index; if there is any difference, the TI stops. But done my way, that would take a very long time to finish if the cube and dimensions are huge and complicated.
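The two-phase idea (structure first, then data, stopping at the first mismatch) can be sketched in Python as follows. This is not TI code. The cube representation (a dict with "dims" and "cells") and the name `first_difference` are assumptions made up for the illustration. Looping over populated cells only, rather than every index combination, is what keeps the data phase from exploding on large, sparse cubes.

```python
def first_difference(control, test):
    """Return a short description of the first mismatch found, or None."""
    # Phase 1, structure: same dimension names in the same order,
    # and the same elements in the same order within each dimension.
    if list(control["dims"]) != list(test["dims"]):
        return "dimension names or order differ"
    for name in control["dims"]:
        if control["dims"][name] != test["dims"][name]:
            return f"elements differ in dimension {name}"

    # Phase 2, data: visit only populated cells from either cube;
    # a cell missing from one side is treated as zero.
    for key in control["cells"].keys() | test["cells"].keys():
        if control["cells"].get(key, 0) != test["cells"].get(key, 0):
            return f"value differs at {key}"
    return None


# Hypothetical in-memory stand-ins for two small cubes.
cube_a = {"dims": {"Letter": ["A", "B"], "Period": ["1", "2"]},
          "cells": {("A", "1"): 10, ("B", "2"): 5}}
cube_b = {"dims": {"Letter": ["A", "B"], "Period": ["1", "2"]},
          "cells": {("A", "1"): 10, ("B", "2"): 7}}

print(first_difference(cube_a, cube_b))
```

Stopping early only tells you that the cubes differ, not where else they differ; collecting all mismatches instead would need a full pass.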

Re: Check if the data of two cubes are the same

Posted: Fri Jul 13, 2018 8:52 am
by Steve Rowe
A couple of thoughts

25 dimensions in a cube is pretty unusual. Are you sure you have the design right? Or consider hierarchies in the latest release.

As to your actual problem

How about using DOS to compare the .cub files on the disc? I've never tried this, so I'm assuming that two cubes that are identical at the N level would always result in two identical .cub files. It's possible that they won't, of course...
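For what it's worth, a byte-level file comparison is easy to script as well. The Python sketch below hashes two files and compares the digests; the function names are made up for illustration. The same caveat applies as above: it is untested whether identical cube data always produces identical .cub files, so a mismatch here would not prove that the data differs.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """True when both files have identical content, byte for byte."""
    return file_digest(path_a) == file_digest(path_b)


# Usage with two hypothetical file names:
# print(same_bytes("Control.cub", "Test.cub"))
```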

Also, similar to the rule-based approach, what about TI'ing the relevant slice of each cube into a third cube and doing the comparison there? This would avoid two potentially massive feeders going from your source cubes to the comparison cube.