Same code, different solutions on different computers?

Pasha van Bijlert · Post by **Pasha van Bijlert** » Wed Jan 17, 2024 2:18 pm

Hi all,

I was wondering if someone could clear up a bit of confusion on my end. For the supplementary materials of an upcoming publication (about predictive gait simulations in the emu), I've set up a code example that runs a gait prediction in Moco using a quasi-random initial guess. The guess is quasi-random in the sense that it is infeasible (really just floating forward, with the legs swinging from front to back or vice versa), but it isn't randomly generated each time the code is run, so the optimization should be deterministic. Once the first optimization converges, the code automatically uses that solution as the initial guess for an optimization at a slightly higher target speed.

If I run it twice on my desktop, I get the exact same gait solutions (down to the number of iterations, etc.). However, if I run it on different computers, I get different optima. I've tried it on my own PC (OpenSim 4.4 from April 14 2022), a Mac desktop (OpenSim 4.4 from Jul 24 2022), and a Windows laptop (OpenSim 4.5, May 19 2023). Some of the solutions are similar, but some are clearly different. Is this expected behaviour? I thought these optimizations were deterministic, and I don't really understand what's causing these differences.

The only thing I could think of was a different number of cores, which might affect how the optimizations are parallelized. But this is a shot in the dark; I don't know if this should/could affect each iteration of the optimization. Anybody have any ideas?

Cheers,
Pasha

Carlos Gonçalves · Post by **Carlos Gonçalves** » Wed Jan 17, 2024 5:56 pm

Hello Pasha!

Your post reminded me of this old one: viewtopicPhpbb.php?f=1815&t=16529&p=45808&start=0&view=

From the little that I know, IPOPT has some randomness in its operation with guesses, as Ross mentioned here viewtopicPhpbb.php?f=1815&t=17220&p=0&s ... ec07acfe8b. And it has some hardware-related changes, as Pavlos mentioned in the old post above.

How do the optima differ in your simulations? I stopped running my code on different machines, but maybe the answer is in the CasADi or IPOPT references.

Best regards.

Pasha van Bijlert · Post by **Pasha van Bijlert** » Thu Jan 18, 2024 2:32 pm

Hi Carlos,

Looks like this is the exact same phenomenon. I hesitate to attribute this all to "randomness", since the results from each of those computers are repeatable/deterministic on the same machine. The results are just not identical between machines. So there doesn't appear to be a random process here, unless the random seed stays the same across repeated runs but is somehow machine-dependent. But IPOPT's perturbation of the initial guess being machine-dependent sounds plausible enough.

I guess this makes my predictive simulation code example slightly less predictable than I'd hoped, because I can't control for the specific machine it's being run on. I'll add an extra comment saying that my code should only serve as an example, and that the user should spend some time trying out different initial guesses (which I did do for the actual analysis in my paper).

cheers,
Pasha

Aaron Fox · Post by **Aaron Fox** » Thu Jan 18, 2024 3:21 pm

I'll be testing your code when it comes out, Pasha, just for fun.

This is an interesting phenomenon that could be worth figuring out. Open science and open data are becoming more prevalent, and with them the expectation that other researchers can replicate your results from the code you share. Replication being affected by which machine the code runs on doesn't really mesh well with that.

Aaron

Ross Miller · Post by **Ross Miller** » Fri Jan 19, 2024 5:44 am

I don't think there is an element of randomness to the perturbation IPOPT applies to initial guesses.

There is some discussion on this issue elsewhere if you search for things like "IPOPT repeatability", for example:

https://groups.google.com/g/opti-toolbo ... JPxa_c_CLg

I believe this largely has to do with whether IPOPT is run in parallel, and if so on how many CPUs or cores or threads or whatever, and how different machines and operating systems handle that. So from that perspective the lack of repeatability is the expected behavior.
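For anyone wondering how a parallel run can be deterministic on one machine yet differ on another, here is a minimal sketch of one likely mechanism: floating-point addition is not associative, so the order in which partial results are combined changes the rounding. This is a toy illustration only, not IPOPT's or Moco's actual code; `chunked_sum` and its chunk count are hypothetical stand-ins for a threaded reduction.

```python
def sequential_sum(values):
    """Add values strictly left to right, as a single thread would."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Mimic a parallel reduction: sum each chunk, then combine the partial sums.

    n_chunks is a hypothetical stand-in for the number of threads.
    """
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [sequential_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return sequential_sum(partials)


# The exact sum of these four numbers is 2.0.
values = [1e16, 1.0, -1e16, 1.0]

print(chunked_sum(values, 1))  # 1.0
print(chunked_sum(values, 2))  # 0.0
```

The same inputs always give the same answer for a fixed chunk count, which matches "repeatable on one machine, different across machines". In an iterative solver, a last-digit difference like this can be enough to send later iterations down a different path.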

Ross

Pasha van Bijlert · Post by **Pasha van Bijlert** » Fri Jan 19, 2024 12:34 pm

Hi Ross,

Thank you, that clears it up!

@Aaron, I'll keep you posted!
Cheers,
Pasha

Brian Umberger · Post by **Brian Umberger** » Mon Jan 29, 2024 11:08 am

Hi All,

Thanks for this interesting discussion. We have also recently been comparing the results obtained on different computers as we work on moving some of our projects onto our university's computer cluster. We likewise find different results on different computers.

I wanted to add that in our experience, the results obtained in solving the same problem on the same computer do not depend on the number of cores used within Moco/CasADi (e.g., solver.set_parallel()). The issue causing inconsistent results, as I understand it, seems to be due to multi-threading within the low-level linear algebra libraries that IPOPT is compiled against.
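One thing that might be worth trying, as a rough sketch: pin the common linear algebra threading variables to a single thread before launching the simulation. `OMP_NUM_THREADS`, `OPENBLAS_NUM_THREADS` and `MKL_NUM_THREADS` are real environment variables for OpenMP, OpenBLAS and MKL, but whether the linear solver bundled with a given OpenSim build honors any of them is an assumption, and single-threading does not remove differences between CPUs or library versions. The helper names below are hypothetical.

```python
import os
import subprocess

# Threading variables read by OpenMP, OpenBLAS and MKL respectively.
# Assumption: the libraries your IPOPT build links against honor at least one.
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def single_thread_env(base=None):
    """Return a copy of the environment with every threading variable set to 1."""
    env = dict(os.environ if base is None else base)
    for name in THREAD_VARS:
        env[name] = "1"
    return env


def run_single_threaded(command):
    """Launch a command (hypothetically, your Moco script) with threading pinned to 1."""
    return subprocess.run(command, env=single_thread_env(), check=True)


if __name__ == "__main__":
    env = single_thread_env()
    for name in THREAD_VARS:
        print(name, "=", env[name])
```

The variables have to be set before the process that loads the libraries starts, which is why the sketch builds an environment for a child process rather than changing settings mid-run.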

Something I have been thinking about, but have not yet had time to try, is the following:

We know that solving the same problem on different computers can lead to different results. However, if you solve the same problem on different computers several times each, starting from a range of sufficiently different initial guesses, then is the best solution across machines the same (or nearly so)? If so, that would seem to be more tolerable than that situation where simulation results are wholly machine-dependent.

I would be interested to know if anyone has tested this, and if not, I will try to remember to report back when we get to it.
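To make the proposed test concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration: the 1-D objective, the fixed-step gradient descent standing in for IPOPT, and the two sets of initial guesses standing in for two machines. It only shows the bookkeeping: solve from several guesses, keep the best, then compare the best objective values.

```python
import math


def objective(x):
    """Toy non-convex cost with one global and one local minimum."""
    return x**4 - 3.0 * x**2 + x


def gradient(x):
    return 4.0 * x**3 - 6.0 * x + 1.0


def local_solve(x0, step=0.01, iterations=5000):
    """Fixed-step gradient descent: a stand-in for one local optimization."""
    x = x0
    for _ in range(iterations):
        x -= step * gradient(x)
    return x, objective(x)


def best_of_multistart(starts):
    """Run the local solver from every initial guess and keep the lowest cost."""
    return min((local_solve(x0) for x0 in starts), key=lambda result: result[1])


def same_best(result_a, result_b, rel_tol=1e-6):
    """Check whether two best solutions agree in objective value."""
    return math.isclose(result_a[1], result_b[1], rel_tol=rel_tol)


# Hypothetical: each "machine" explores a different set of initial guesses.
machine_a = best_of_multistart([-2.0, 0.5])
machine_b = best_of_multistart([-1.0, 2.0])

print(local_solve(0.5))   # a single run can end in the worse local minimum
print(machine_a, machine_b, same_best(machine_a, machine_b))
```

In this toy case single runs disagree (the guess at 0.5 ends in the local minimum) while the best-of-several results agree. Whether that carries over to real Moco problems is exactly the open question.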

Best,
Brian

Nicholas Bianco · Post by **Nicholas Bianco** » Mon Jan 29, 2024 12:52 pm

To add to Brian's point, I'd also be curious to know how solver tolerances play a role in differing results between machines. I'd assume that looser tolerances would lead to greater differences across machines, but characterizing that effect to know how tight your tolerances need to be to combat this issue would be useful to know.

Pasha van Bijlert · Post by **Pasha van Bijlert** » Fri Feb 09, 2024 9:37 am

Hello all,

For what it's worth, my convergence tolerance is usually set to e-3 and constraint tolerance to e-4. I've found that inter-computer differences are usually because the optimizer takes a completely different route (i.e. the objective and the infeasibilities at specific iteration numbers will already be different).
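A minimal sketch of how one might locate where two runs part ways, assuming you have already pulled the per-iteration objective and infeasibility values out of each machine's IPOPT output. The function name and the numbers below are made up for illustration; they are not results from my simulations.

```python
import math


def first_divergence(log_a, log_b, rel_tol=1e-9):
    """Return the first iteration index where two logs disagree, or None.

    Each log is a list of (objective, infeasibility) tuples, one per iteration.
    """
    for i, (row_a, row_b) in enumerate(zip(log_a, log_b)):
        if not all(math.isclose(a, b, rel_tol=rel_tol, abs_tol=0.0)
                   for a, b in zip(row_a, row_b)):
            return i
    if len(log_a) != len(log_b):
        # Identical up to the end of the shorter run, then one run kept going.
        return min(len(log_a), len(log_b))
    return None


# Made-up example values, not real solver output.
desktop = [(12.5, 3.0e-1), (9.75, 1.0e-1), (8.5, 2.0e-2), (8.25, 1.0e-3)]
laptop = [(12.5, 3.0e-1), (9.75, 1.0e-1), (8.625, 5.0e-2), (7.5, 4.0e-3), (7.25, 1.0e-4)]

print(first_divergence(desktop, laptop))  # 2
```

A small index suggests the runs take different routes from early on, rather than stopping at slightly different points near the same optimum.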

> I wanted to add that in our experience, the results obtained in solving the same problem on the same computer do not depend on the number of cores used within Moco/CasADi (e.g., solver.set_parallel()).

This is a good point, thank you! You just reminded me that I'd actually done a very small test myself two years ago (viewtopicPhpbb.php?f=1815&t=13391&p=39688&start=0&view=), so I could have remembered this.

Best wishes,
Pasha
