Multicore parallel computing with Moco

OpenSim Moco is a software toolkit to solve optimal control problems with musculoskeletal models defined in OpenSim using the direct collocation method.
Brian Umberger
Posts: 48
Joined: Tue Aug 28, 2007 2:03 pm

Post by Brian Umberger » Wed Sep 27, 2023 11:15 am

Hi All,

Over the past couple of years, there have been some discussions in this forum about parallel computing in Moco. Alex Denton, a former PhD student in our group (now a postdoc at Oregon), just published a paper that investigates multicore parallel speed-up in Moco. In the paper, she addresses how parallel speed-up interacts with model complexity, movement task, temporal mesh density, and the type of initial guess. For anyone who is interested, here are the links to the open-access paper and the SimTK page, which includes some example code.

https://onlinelibrary.wiley.com/doi/ful ... 2/cnm.3777

https://simtk.org/projects/mocoparallel

The tl;dr summary is that most problems had diminishing returns for parallel speed-up above about 6 cores. So while there certainly may be exceptions, the primary advantage of having a machine with lots of cores seems to be the ability to solve multiple independent problems simultaneously, rather than solving a single problem really fast. While our focus was on parallel speed-up, total runtimes were also highly problem-specific, so (unfortunately) there is no substitute for spending some time to see what works for your problem to get the greatest computational performance.

Best regards,
Brian

Pagnon David
Posts: 86
Joined: Mon Jan 06, 2014 3:13 am

Re: Multicore parallel computing with Moco

Post by Pagnon David » Thu Sep 28, 2023 1:47 am

Thank you for sharing. This gives me the opportunity to ask: if the maximum speed is almost reached with 6 cores, does it mean that GPU computing cannot be leveraged? Do you know [about this](https://simtk.org/projects/gpuexp)?

If I understand correctly:
- the NLP function evaluations by CasADi can easily be parallelized
- the NLP solving by IPOPT (used within CasADi) is not easy to parallelize, and you did not investigate it
The speed-up mostly depends on how much time is spent evaluating the NLP functions vs. optimizing the solution, which is problem dependent. Does that sound accurate?
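
One rough way to picture this split is Amdahl's law. It is not mentioned in the paper summary above and is offered here only as a simplified model: if a fraction p of the runtime goes to NLP function evaluations that parallelize well, and the rest (the IPOPT solve) stays serial, the ideal speed-up on n cores is 1 / ((1 - p) + p / n). The function name and the fractions below are illustrative assumptions, not measurements.

```python
def amdahl_speedup(parallel_fraction: float, n_cores: int) -> float:
    """Ideal speed-up when only `parallel_fraction` of the runtime scales with cores."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


if __name__ == "__main__":
    # Hypothetical fractions of runtime spent in NLP function evaluations.
    for p in (0.5, 0.8, 0.95):
        row = ", ".join(
            f"{n} cores: {amdahl_speedup(p, n):.2f}x" for n in (1, 2, 6, 12, 24)
        )
        print(f"p = {p:.2f} -> {row}")
```

Under this model the curve flattens quickly unless p is very close to 1, which would be consistent with diminishing returns beyond a handful of cores. Real runs also carry threading overhead that the model ignores.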

Brian Umberger
Posts: 48
Joined: Tue Aug 28, 2007 2:03 pm

Re: Multicore parallel computing with Moco

Post by Brian Umberger » Thu Sep 28, 2023 10:20 pm

Hi David,

I think your summary of our results is accurate.

I had not seen the GPU project you linked to. We can't say for certain based on our results, but with the current framework I'm doubtful GPU computing would be fruitful. Even in the best case the speed-up, while certainly beneficial, was far from ideal. The main caveat to that statement is there are many possible types of analyses that could be done with Moco and we only considered a subset of them. Possibly GPUs could be exploited in some other situations.

Best,
Brian

Ross Miller
Posts: 371
Joined: Tue Sep 22, 2009 2:02 pm

Re: Multicore parallel computing with Moco

Post by Ross Miller » Sun Oct 01, 2023 4:21 am

Sounds like I should see what the return policy is on the 12-core CPU / 30-core GPU I just bought.

Nicholas Bianco
Posts: 980
Joined: Thu Oct 04, 2012 8:09 pm

Re: Multicore parallel computing with Moco

Post by Nicholas Bianco » Mon Oct 02, 2023 9:28 am

Thanks Brian! This is a great resource for the Moco community :)

Brian Umberger
Posts: 48
Joined: Tue Aug 28, 2007 2:03 pm

Re: Multicore parallel computing with Moco

Post by Brian Umberger » Mon Oct 02, 2023 9:38 am

Thanks, Nick!

And Ross, I don't know... that sounds like a pretty good machine to me for running several simultaneous Moco simulations, each using 4-6 cores!

Brian

Lars D'Hondt
Posts: 1
Joined: Wed Nov 03, 2021 3:44 am

Re: Multicore parallel computing with Moco

Post by Lars D'Hondt » Mon Feb 05, 2024 8:53 am

Hi all,

Some performance loss can be caused by how CasADi handles parallelization under the hood (https://github.com/casadi/casadi/blob/m ... #L658-L672).
The number of mesh intervals assigned to each parallel thread is n_mesh/n_thread, and if this is not a whole number, it is rounded up. The extra function evaluations that this creates are still carried out, but their outputs are discarded.

Looking at figure 3 of the paper Brian linked, some of the results that fall below the curve could be due to an unfortunate combination of number of mesh intervals and number of cores/threads:
- 10 intervals and 9 cores -> dynamics are evaluated at 18 intervals
- 25 intervals and 12 cores -> dynamics are evaluated at 36 intervals
- 25 intervals and 18 cores -> dynamics are evaluated at 36 intervals

Limiting the number of additional function evaluations won't be the most impactful way to speed up simulations, but it's very low effort.
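
The arithmetic behind the three cases above can be sketched in a few lines. This is only a restatement of the rounding rule as described in this post, not CasADi's actual code, and the function name `padded_evaluations` is made up for illustration.

```python
def padded_evaluations(n_mesh: int, n_thread: int) -> tuple:
    """Return (intervals per thread, total evaluated, discarded).

    Assumes each thread gets ceil(n_mesh / n_thread) intervals and that
    the surplus evaluations are computed and then thrown away.
    """
    if n_mesh < 1 or n_thread < 1:
        raise ValueError("n_mesh and n_thread must be positive")
    per_thread = (n_mesh + n_thread - 1) // n_thread  # integer ceiling
    total = per_thread * n_thread
    return per_thread, total, total - n_mesh


if __name__ == "__main__":
    for n_mesh, n_thread in [(10, 9), (25, 12), (25, 18)]:
        per_thread, total, wasted = padded_evaluations(n_mesh, n_thread)
        print(
            f"{n_mesh} intervals, {n_thread} threads: "
            f"{per_thread} per thread, {total} evaluated, {wasted} discarded"
        )
```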

Kind regards,
Lars

Brian Umberger
Posts: 48
Joined: Tue Aug 28, 2007 2:03 pm

Re: Multicore parallel computing with Moco

Post by Brian Umberger » Mon Feb 05, 2024 8:12 pm

Hi Lars,

Thanks for these comments. We had searched without success for documentation on exactly how CasADi handles the work allocation across threads, and I guess we should have gone straight to the source code. You are right that our typical mesh interval progression of ..., 10, 25, 50, ... did not align with the number of cores being used.

Your post made me curious, so I did a couple of quick checks with the 2-D predictive walking simulation. For the "10 interval, 9 core" case, increasing to 10 cores had almost no effect on the computational speed (only 0.75% faster). However, in the "25 interval, 12 core" case, dropping down to 24 intervals improved computational speed by 14%. So, the actual impact in practice seems variable, but the n_mesh/n_thread ratio is definitely worth checking and easy to control, as you noted.
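
For anyone who wants to check their own setup, here is a minimal sketch of the two adjustments tried above: changing the thread count for a fixed mesh, or nudging the mesh to fit a fixed thread count. It assumes the rounding-up behavior described earlier in this thread, so that a combination wastes nothing when n_mesh is an exact multiple of n_thread. Both function names are hypothetical helpers, not part of Moco or CasADi.

```python
def aligned_thread_counts(n_mesh: int, max_threads: int) -> list:
    """Thread counts up to max_threads that divide n_mesh evenly."""
    if n_mesh < 1 or max_threads < 1:
        raise ValueError("n_mesh and max_threads must be positive")
    return [t for t in range(1, max_threads + 1) if n_mesh % t == 0]


def aligned_mesh_counts(target_mesh: int, n_thread: int) -> list:
    """Mesh interval counts within n_thread of the target that are multiples of n_thread."""
    if target_mesh < 1 or n_thread < 1:
        raise ValueError("target_mesh and n_thread must be positive")
    low = max(1, target_mesh - n_thread)
    high = target_mesh + n_thread
    return [n for n in range(low, high + 1) if n % n_thread == 0]


if __name__ == "__main__":
    # Fixed mesh of 10 intervals on a machine with up to 12 threads.
    print("10 intervals:", aligned_thread_counts(10, 12))
    # Fixed 12 threads, mesh close to 25 intervals.
    print("12 threads:", aligned_mesh_counts(25, 12))
```

As the two timing checks above suggest, removing the padding does not guarantee a faster solve, so it is still worth timing the candidates on the actual problem.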

Thanks again.

Brian

Nicholas Bianco
Posts: 980
Joined: Thu Oct 04, 2012 8:09 pm

Re: Multicore parallel computing with Moco

Post by Nicholas Bianco » Wed Feb 07, 2024 11:47 am

Great tip Lars! Thanks for posting.
