Is MOCO CPU agnostic?

OpenSim Moco is a software toolkit to solve optimal control problems with musculoskeletal models defined in OpenSim using the direct collocation method.
Pasha van Bijlert
Posts: 227
Joined: Sun May 10, 2020 3:15 am

Post by Pasha van Bijlert » Mon May 10, 2021 2:48 am

Hi all,

Perhaps the answer is obvious, but I thought I'd double check. I'll soon be assembling a 3D/CAD workstation with an AMD Ryzen 5900X (CAD software in general benefits from a powerful CPU). While it's not the main purpose of this computer, I'll be able to run Moco optimizations on it as well, and I expect that this would be substantially faster than on my 10-year-old i5 desktop. So this had me wondering: can Moco (or, more specifically, CasADi) effectively use the 24 threads of that processor? Are there any compatibility issues that may occur, or am I overthinking things? Moco can make use of parallelization, correct? Is there a ceiling to this?

Best wishes,
Pasha

Ross Miller
Posts: 375
Joined: Tue Sep 22, 2009 2:02 pm

Re: Is MOCO CPU agnostic?

Post by Ross Miller » Mon May 10, 2021 10:05 am

Hi Pasha,

At least when using the CasADi solver, Moco can definitely run in parallel. On both machines I've run Moco on (both Macs), it automatically detects how many cores I have and runs two threads on each core in parallel.

I think Brian Umberger has told me previously that the speed increase from giving Moco/CasADi/IPOPT more cores levels off, and that it can even get slower beyond a certain point. Generally speaking, though, for the range of cores on most consumer-grade CPUs (6-20ish), I would expect more cores to make Moco run its IPOPT iterations faster.

Ross

Aaron Fox
Posts: 289
Joined: Sun Aug 06, 2017 10:54 pm

Re: Is MOCO CPU agnostic?

Post by Aaron Fox » Mon May 10, 2021 6:39 pm

Hi Pasha,

I can back up Ross: Moco seems to detect cores automatically and use them appropriately. We had some older PCs in our lab with 24 cores, and I have a newer laptop with 8 cores, and I found that my laptop performed just as quickly, if not quicker. I suspect this had to do with more than just the number of cores, but it was an interesting finding for me when deciding which set-up to run simulations on.

Aaron

Pasha van Bijlert
Posts: 227
Joined: Sun May 10, 2020 3:15 am

Re: Is MOCO CPU agnostic?

Post by Pasha van Bijlert » Tue May 11, 2021 5:57 am

Hi Ross & Aaron,

Interesting, so there seems to be a sweet spot with the number of cores? Would you care to speculate as to why this is? I'd think the computational effort to parallelize something is always going to be lower than the performance benefit, but apparently not...

Aaron, was that a 24-core (thus 48-thread) CPU, or 24 threads in total? Interesting that it was outperformed by fewer cores. I suppose that the processing power per core is also a factor in how fast the calculations are performed.

Best,

Pasha

Aaron Fox
Posts: 289
Joined: Sun Aug 06, 2017 10:54 pm

Re: Is MOCO CPU agnostic?

Post by Aaron Fox » Tue May 11, 2021 4:01 pm

Hi Pasha,

It was 24 threads, so yes I do recall that CPU having 12 cores. They were much older computers than the laptop I was using so I'd say there was a different balance of processing power per core there. I've found that my laptop (Lenovo X1 Yoga) using 8 threads performs pretty well with respect to the number of iterations it gets through.

Aaron

Nicholas Bianco
Posts: 1044
Joined: Thu Oct 04, 2012 8:09 pm

Re: Is MOCO CPU agnostic?

Post by Nicholas Bianco » Wed May 12, 2021 10:55 am

Thanks Ross and Aaron for the great info.

Pasha, regarding the "sweet spot" with the number of cores: there is some computational overhead in managing all the independent threads when parallelizing in Moco (FYI, we use CasADi's built-in parallelization tools). This overhead increases with the number of threads, which could be part of the explanation.
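
As a rough illustration of the overhead argument above, the toy model below assumes that solve time is a fixed serial part, plus a parallel part divided across threads, plus a thread-management cost that grows linearly with the thread count. The function names and every number are made up for illustration; none of them are measurements from Moco or CasADi.

```python
def solve_time(n_threads, serial=10.0, parallel=100.0, overhead=0.5):
    """Toy estimate of wall-clock time (arbitrary units).

    serial   : work that cannot be split across threads (assumed value)
    parallel : work that divides evenly across threads (assumed value)
    overhead : management cost added per thread (assumed value)
    """
    return serial + parallel / n_threads + overhead * n_threads


def best_thread_count(max_threads, **model):
    """Return the thread count in 1..max_threads with the lowest toy time."""
    return min(range(1, max_threads + 1),
               key=lambda n: solve_time(n, **model))
```

With these assumed parameters, the modelled time falls quickly for the first few threads, flattens out, and then rises again once the per-thread cost outweighs the shrinking parallel share. Real problems will have a different shape, so this only shows why a sweet spot can exist at all.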

Pasha van Bijlert
Posts: 227
Joined: Sun May 10, 2020 3:15 am

Re: Is MOCO CPU agnostic?

Post by Pasha van Bijlert » Thu May 13, 2021 2:05 am

Hi all,

Thanks for the interesting discussion. You can limit the number of parallel cores/workers used in Matlab. Does this have a downstream effect on how many cores CasADi gets access to? If so, I could play around with it and report back.

Thanks!
Pasha

Nicholas Bianco
Posts: 1044
Joined: Thu Oct 04, 2012 8:09 pm

Re: Is MOCO CPU agnostic?

Post by Nicholas Bianco » Fri May 14, 2021 1:09 pm

Hi Pasha,

The number of cores you set for Matlab's parallelization tools (e.g., parpool) should not (I believe) affect the number of cores used by CasADi in Moco.

You can set the number of threads used by Moco using the "set_parallel()" property. When you run a problem, Moco will print out the number of threads being used for that problem to the console, so you can verify that way.

Best,
Nick
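
For readers scripting in Python, a minimal sketch of setting the thread count might look like the following. In an actual script the solver would come from something like `study.initCasADiSolver()` in the `opensim` package; here `FakeSolver` is a hypothetical stand-in so the sketch runs without OpenSim installed, and `set_thread_count` is an illustrative helper, not part of Moco. The helper only accepts explicit counts of 2 or more, because (as far as I recall from the solver documentation) the values 0 and 1 have special meanings; please check this against your Moco version.

```python
class FakeSolver:
    """Hypothetical stand-in for MocoCasADiSolver (only set_parallel is mimicked)."""

    def __init__(self):
        self.parallel = None

    def set_parallel(self, value):
        self.parallel = value


def set_thread_count(solver, requested):
    """Ask the solver to use an explicit number of threads.

    Rejects values below 2, since 0 and 1 are assumed to carry
    special meanings in Moco rather than literal thread counts.
    """
    if not isinstance(requested, int) or requested < 2:
        raise ValueError("request an explicit thread count of 2 or more")
    solver.set_parallel(requested)
    return requested
```

With a real study, the call would be `set_thread_count(study.initCasADiSolver(), 8)`, after which the console output mentioned above can be used to confirm the number of threads actually in use.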

Brian Umberger
Posts: 48
Joined: Tue Aug 28, 2007 2:03 pm

Re: Is MOCO CPU agnostic?

Post by Brian Umberger » Wed Jun 02, 2021 4:36 pm

Hi All,

I was busy with the end of the semester and some admin duties, so I missed this interesting thread started by Pasha a few weeks ago. I can make two small additions.

One of my PhD students, Alex Denton, has an abstract at the ASB meeting this summer on multicore performance in Moco using the CasADi parallelization. Our preliminary results, subject to ongoing work, are that there are diminishing returns beyond approximately 10 physical cores (i.e., actual cores, not hyperthreading). Some problems show continued speedup all the way to 36 cores, but by very small amounts, and some problems do get slower with more cores due to parallel overhead, as noted by Ross and Nick. In all cases so far you would be better off with a fast 8-core processor than a slower 24-core processor, consistent with Aaron's experience. That statement is based on the fact that, for a fixed monetary cost, the number of processor cores trades off against the clock speed per core. Again, the exact numbers are subject to work still in progress.

However... having many cores can still be beneficial if you have lots of similar but independent problems to run, such as different initial guesses, different weights in the cost function, etc. The Matlab and CasADi parallelization tools function independently. So, if you have say 24 cores available you could set the Moco parallel property to use 8 cores (set_parallel(8)) per optimization, and then run the multiple optimizations in parallel within a Matlab parfor loop with 3 workers. In our experience (thus far) that will yield the results in a fraction of the time of running the multiple optimizations one after the other with set_parallel(24). The speedup in this case, at a project level rather than the level of a single optimization, can be substantial.

Best,
Brian

Carlos Gonçalves
Posts: 135
Joined: Wed Jun 08, 2016 4:56 am

Re: Is MOCO CPU agnostic?

Post by Carlos Gonçalves » Fri Jun 04, 2021 7:43 am

Excellent discussion, especially since my rowing simulations with metabolic goals take about 8 hours on my old laptop: https://www.linkedin.com/posts/carlos-g ... 60961-c88F

Dr. Umberger, any suggestions on how to run these parallelizations in Python?

Best regards.
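
One possible way to reproduce in Python the split Brian describes (24 cores shared as 3 simultaneous optimizations with 8 threads each) is the standard-library `concurrent.futures` module. This is a sketch under assumptions, not a tested Moco workflow: `plan_workers`, `run_batch`, and `solve_one` are hypothetical names, and `solve_one` is a placeholder where a real script would build a MocoStudy, call `set_parallel(threads)` on its CasADi solver, solve, and save the solution.

```python
from concurrent.futures import ProcessPoolExecutor


def plan_workers(total_cores, threads_per_solve):
    """Return how many optimizations can run side by side."""
    if total_cores < 1 or threads_per_solve < 1:
        raise ValueError("core and thread counts must be positive")
    return max(1, total_cores // threads_per_solve)


def solve_one(job):
    """Placeholder for a single optimization; job is (label, threads).

    A real version would set up and solve one Moco problem here,
    limiting it to `threads` via the solver's set_parallel().
    """
    label, threads = job
    return f"{label}: solved with {threads} threads"


def run_batch(labels, total_cores=24, threads_per_solve=8,
              executor_cls=ProcessPoolExecutor):
    """Run one placeholder solve per label, a few at a time."""
    workers = plan_workers(total_cores, threads_per_solve)
    jobs = [(label, threads_per_solve) for label in labels]
    with executor_cls(max_workers=workers) as pool:
        # map() preserves the order of the input labels
        return list(pool.map(solve_one, jobs))
```

Separate processes are used by default so that each optimization gets its own interpreter. When running this as a script, call `run_batch([...])` inside an `if __name__ == "__main__":` guard, which process pools need on Windows and macOS.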
