I am trying parallel execution for the first time and I have run into a problem that I don't understand:
I have a master process that calls a batch of child processes using Tm1RunTi. The problem is that sometimes, at random, the master process just stops executing without completing any of its work.
The master process calls 94 TI processes in batch. Sometimes the master finishes successfully, but other times it begins execution (I can see many child processes running in the TM1 Operations Console) and then just stops with no results and no error messages in the log file.
When I look in the log file for more details, I find only this:
- Only 25 child processes completed successfully (I was expecting 94)
- 50 TM1.Login records (25 attempts and 25 successes)
- and occasionally these messages in tm1lockexeptiondebug.log:
17580 [44] DEBUG 2017-09-15 17:41:26.897 TM1.Lock.Exception Contention encountered attempting to acquire lock (0x00000011811C5410) on object [Client "CAMID("ldap_sprc:u:d3fd2c09f5e2e74da348fa0160ae90ed")", addr=0x00000011808DF010, index=R-2255] in READONLY mode at ..\tm1_r7s\TM1ClientImpl.cpp:1296 during function 'SystemServerConnectWithCAMNamespace'.
Entering wait state 'IXC'.
Blocked by the following 1 thread:
Thread 7908 holds the lock in IX mode
Sometimes it fails with no lock exception messages at all, so I am not sure this is the problem.
I have reviewed the code in the master process and in the child processes and it looks fine. I am fairly sure of this because sometimes the run completes successfully, as I said before.
I have other master processes that work fine, but they call only 12 or 24 processes at a time.
Since each TI process that I call in batch opens a new server connection, I was wondering whether the number of server connections is limited, maybe by the number of cores in the server or maybe by the license...
I have heard that some people launch more than 2000 processes at the same time with Tm1RunTi... and I wasn't expecting any problems with just 94.
I have TM1 10.2.2 with FP7 installed. The server has 2 processors and MTQ is set as "MTQ = 2" in the configuration file.
In any case, I can't explain why it sometimes works fine and sometimes just crashes. I always test with the same data, and if I execute the child processes one by one there is no problem... that's why I think it is not a coding or data problem. The only abnormal thing is the lock message in the log file shown above.
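One test that might narrow this down is to cap how many child processes start at the same time, somewhere between "one by one" (which works) and "all 94 at once" (which sometimes fails). Below is a minimal sketch in Python of such a throttled launcher. It is not what my master process does today. The function names `run_batch` and `failed` are made up for illustration, the commands in the demo are stand-ins rather than real Tm1RunTi command lines, and it assumes that a failed child exits with a non-zero return code.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_batch(commands, max_parallel=8):
    """Run each command as a subprocess, at most max_parallel at a time.

    Returns a list of (command, return_code) in the same order as commands.
    """
    def run_one(cmd):
        # Capture output so a failing child does not flood the console.
        result = subprocess.run(cmd, capture_output=True, text=True)
        return cmd, result.returncode

    # The pool size is the throttle: no more than max_parallel children
    # (and therefore logins) are in flight at any moment.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(run_one, commands))


def failed(results):
    """Return the commands whose return code was non-zero (assumed to mean failure)."""
    return [cmd for cmd, code in results if code != 0]


if __name__ == "__main__":
    # Stand-in commands so the sketch runs anywhere. In a real test each
    # entry would be the Tm1RunTi command line for one child process.
    commands = [[sys.executable, "-c", "import sys; sys.exit(0)"] for _ in range(94)]
    results = run_batch(commands, max_parallel=8)
    print(len(results), "launched,", len(failed(results)), "failed")
```

If the run is stable with a small `max_parallel` and starts failing as the value grows, that would point at the number of simultaneous connections rather than at the code or the data.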
I hope you can help me understand parallel execution better and clarify whether the problem is in the login step.
Thank you very much in advance.