Grid convergence study
- Ross Miller
- Posts: 375
- Joined: Tue Sep 22, 2009 2:02 pm
Grid convergence study
Update in Post 7
I wanted to share some results from a grid sensitivity test I've been working on lately.
Using a 3D model (31 DoF, 84 muscles), I did a tracking simulation of a stride of walking, tracking mean human experimental kinematics and GRF for walking at average speed 1.45 m/s. The cost function was the weighted sum of the state tracking error, contact (GRF) tracking error, control effort (u^2), deviation of the torso from the global axes, and the auxiliary derivatives. The last term is the tendon force "yanks"; it is in there only because the implicit muscle formulation requires it (I guess it could also be considered a "smoothness" criterion of sorts), and it has a very light weight. The multibody dynamics are explicit. I haven't found an advantage to using the implicit multibody dynamics in Moco yet.
I ran the simulation several times with different grid densities. The "grid" here ("mesh" in Moco solver settings) is the discretization of the movement time, defining the timestep separating the nodal values at which the problem is evaluated. In Moco you set the grid density in the solver: solver.set_num_mesh_intervals(N), where N = number of intervals. The problem (to my understanding) is actually evaluated at nodal values on the ends of these intervals, and the number of nodes depends on the transcription scheme. With the default Hermite-Simpson scheme, the problem is evaluated at 2*N+1 nodes because it creates nodes also in the middle of the intervals. So for example if your simulation is for a time of 1.0 seconds and you use N=50 and Hermite-Simpson transcription, there are 101 nodes and the timestep between nodes is 10 milliseconds.
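The node count and node spacing described above follow directly from N. A minimal sketch, assuming Hermite-Simpson transcription with one midpoint node per interval and uniformly sized intervals; the function names are made up for illustration and are not part of Moco:

```python
def hermite_simpson_node_count(num_mesh_intervals: int) -> int:
    """Nodes for N intervals: N + 1 interval endpoints plus N midpoints."""
    if num_mesh_intervals < 1:
        raise ValueError("need at least one mesh interval")
    return 2 * num_mesh_intervals + 1


def node_spacing_seconds(duration_s: float, num_mesh_intervals: int) -> float:
    """Time between adjacent nodes, assuming uniformly sized intervals."""
    # 2N + 1 nodes leave 2N equal gaps over the movement duration.
    return duration_s / (hermite_simpson_node_count(num_mesh_intervals) - 1)


if __name__ == "__main__":
    n = 50
    print(hermite_simpson_node_count(n))        # 101 nodes
    print(node_spacing_seconds(1.0, n) * 1e3)   # spacing in milliseconds
```

For N=50 over 1.0 s this reproduces the 101 nodes and 10 ms spacing in the example above.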
I started with N=5, which was the coarsest grid I could get convergence on with a reasonable constraint tolerance (1e-4). The initial guess for this simulation was generated using solver.createGuess(). I then took this solution and used it as the initial guess for N=10, then repeated that process for N=25, N=50, N=100, and N=200, each time using the previous solution as the initial guess. Results are in the attached figures.
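The warm-starting procedure above can be sketched as a short loop. This is only an outline under assumptions: `solve_on_grid` is a hypothetical user-written wrapper that would call solver.set_num_mesh_intervals(n), use the previous solution as the guess (falling back to solver.createGuess() when the guess is None), solve, and return the solution. Neither `refine_grid` nor `solve_on_grid` is a Moco function.

```python
from typing import Any, Callable, Dict, Optional, Sequence


def refine_grid(
    solve_on_grid: Callable[[int, Optional[Any]], Any],
    mesh_sizes: Sequence[int] = (5, 10, 25, 50, 100, 200),
) -> Dict[int, Any]:
    """Solve on successively finer grids, warm-starting each from the last.

    solve_on_grid(n, guess) is a hypothetical callable supplied by the user;
    guess is None on the first (coarsest) grid.
    """
    solutions: Dict[int, Any] = {}
    guess: Optional[Any] = None
    for n in mesh_sizes:
        solution = solve_on_grid(n, guess)
        solutions[n] = solution
        guess = solution  # previous solution seeds the next, finer grid
    return solutions
```

Keeping every intermediate solution, rather than only the finest one, is what makes the later grid-to-grid comparisons possible.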
You can see that there is a massive difference in the solution quality (minimization of cost function) between N=5 and N=10, but beyond N=10 there are only small changes in the cost function score. For this problem, changes in the cost function score on the order of 0.1 are pretty trivial: you would be hard-pressed to look at the videos or time series data of two solutions that differ by 0.1 and tell which is which. The bar graph figure breaks down the scores by the terms in the cost function.
My summary/interpretation: at least for tracking problems of walking data, N=10 or so is sufficient for getting grid-independent results. For publication-quality final results you will probably want to use a finer grid (at least for cosmetic purposes in figures), but the coarser grid can save a lot of time when evaluating the model, cost function, etc. For comparison, N=10 took a little under two hours to converge on my laptop, while N=50 took over six hours.
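One way to make "grid-independent" concrete is to look for the coarsest grid whose cost stays within a chosen tolerance of every finer grid. A sketch only: the function name is hypothetical, and the cost values in the usage lines are made-up placeholders, not the results from the attached figures. The 0.1 tolerance reflects the judgment above that cost changes of that size are trivial for this problem.

```python
from typing import Dict, Optional


def first_converged_grid(costs: Dict[int, float], tol: float) -> Optional[int]:
    """Return the smallest N whose cost is within tol of all finer grids.

    The finest grid is never returned because there is nothing finer to
    compare it against. Returns None if no grid qualifies.
    """
    sizes = sorted(costs)
    for i, n in enumerate(sizes[:-1]):
        finer = sizes[i + 1:]
        if all(abs(costs[m] - costs[n]) <= tol for m in finer):
            return n
    return None


if __name__ == "__main__":
    # Placeholder values for illustration only (NOT the study's results).
    placeholder_costs = {5: 3.0, 10: 1.05, 25: 1.0, 50: 0.98, 100: 0.97, 200: 0.97}
    print(first_converged_grid(placeholder_costs, tol=0.1))  # 10 for these placeholders
```

The tolerance has to come from knowledge of the problem, since the same numerical change can be trivial for one cost function and important for another.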
I would still like to try some predictive simulations, both of walking and of a much faster movement (running/sprinting). I would expect a faster movement to need a finer grid for grid-independence.
Ross
Last edited by Ross Miller on Fri May 22, 2020 10:30 am, edited 2 times in total.
- Ross Miller
- Posts: 375
- Joined: Tue Sep 22, 2009 2:02 pm
Re: Grid convergence study
Here are the data if anyone is interested. "tpi" is the wall-time per iteration. This was on a 2019 MacBook Pro with a 2.6-GHz Intel Core i7, running Moco in parallel on all six cores / 12 threads.
This was not a rigorous study of computational performance, e.g. sometimes I was running Zoom or streaming Netflix while Moco was running, but I think it's generally indicative of what to expect (roughly linear increase in time per iteration with more grid nodes).
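A quick way to check the "roughly linear" impression is a least-squares line through time per iteration versus node count. This is a sketch; `fit_line` is an illustrative helper, and the measured node counts and tpi values from the attached data would need to be filled in by hand.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns (slope, intercept, r_squared). Needs at least two distinct x values.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    # A flat data set is fit exactly by a flat line, so report 1.0 there.
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared
```

An r_squared close to 1 would support the linear reading, although with background programs running during the timings some scatter is to be expected.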
Last edited by Ross Miller on Fri May 15, 2020 7:19 am, edited 3 times in total.
- Christopher Dembia
- Posts: 506
- Joined: Fri Oct 12, 2012 4:09 pm
Re: Grid convergence study
Ross, this is fantastic! This is very thorough, and you've provided background that will be helpful for many people. I will pin this post for a while.
I'm curious about how conservation of energy holds across these grids (model.calcKineticEnergy() and model.calcPotentialEnergy()).
I'm very surprised that the cost plateaus after just N=10. I wonder how ODE error and the smoothness of the muscle activity vary across these grid settings.
- Ross Miller
- Posts: 375
- Joined: Tue Sep 22, 2009 2:02 pm
Re: Grid convergence study
Chris, that's a great point re: energy. It would be a good error-check of sorts. I will try to add that tomorrow. Will probably need help =)
The "mid-point Euler" discretization scheme is supposedly grid-independent in terms of energy conservation, but I could never get it to work well on my own codes. I always ended up using backward Euler.
Ross
Re: Grid convergence study
Nice work Ross!
From memory, the Lee & Umberger study in PeerJ that did some grid refinement calculated the root mean square error of things like muscle activations between solutions, as a measure of how much the more refined grid changed the outputs. I'm fairly sure Moco has a simple command that lets you calculate the RMS between two solutions, specifying the model outputs you want included in the calculation. This might be another good way to compare those continuous-type outputs between solutions.
Aaron
- Christopher Dembia
- Posts: 506
- Joined: Fri Oct 12, 2012 4:09 pm
Re: Grid convergence study
Good idea, Aaron.
Here is the function for computing RMS differences between trajectories.
https://opensim-org.github.io/opensim-m ... 0ef724f178
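For readers who want to see what such a comparison involves, here is a standalone sketch. It is not Moco's implementation. It assumes each solution is available as a list of times plus a dict of named columns (a hypothetical data layout), linearly interpolates solution B onto the time nodes of solution A, and returns the root mean square difference over the chosen columns. Moco's own function may weight or integrate the differences differently.

```python
import math
from bisect import bisect_right
from typing import Dict, List, Sequence


def interp_linear(t: float, times: Sequence[float], values: Sequence[float]) -> float:
    """Linear interpolation, clamped to the end values outside the time range."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)  # times[i-1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    w = (t - t0) / (t1 - t0)
    return values[i - 1] + w * (values[i] - values[i - 1])


def rms_difference(
    times_a: Sequence[float],
    cols_a: Dict[str, List[float]],
    times_b: Sequence[float],
    cols_b: Dict[str, List[float]],
    names: Sequence[str],
) -> float:
    """RMS of (A - B) over the named columns, sampled at A's time nodes."""
    total = 0.0
    count = 0
    for name in names:
        for t, value_a in zip(times_a, cols_a[name]):
            value_b = interp_linear(t, times_b, cols_b[name])
            total += (value_a - value_b) ** 2
            count += 1
    if count == 0:
        raise ValueError("no samples to compare")
    return math.sqrt(total / count)
```

Passing the finer-grid solution as A means the coarse solution is interpolated onto the dense nodes, so differences between coarse nodes are not missed.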
- Ross Miller
- Posts: 375
- Joined: Tue Sep 22, 2009 2:02 pm
Re: Grid convergence study
I repeated this process recently for some predictive simulations of walking and some tracking simulations of running. The predictive walking simulations showed the same result as the tracking simulations in the first post: the cost function changed dramatically from 5 to 10 intervals, with no meaningful changes on finer grids.
The result of the running simulations (3.17 m/s) is attached here. Its cost function did not "stabilize" until 25 intervals. The change in cost from 10 to 25 intervals is meaningful: with 25 intervals the tracking error was 26% smaller, with 22% less control effort.
Conclusion: for tracking simulations of walking or slow/moderate running, you are probably fine using grids of 25-50 Hermite-Simpson intervals (51-101 nodes, or one node every 1-2% of the gait cycle). Those grids are pretty standard in the DC literature for locomotion simulations. For predictive simulations this is probably fine too, unless you expect or are interested in possible predictive solutions that have much faster movements than the tracking solution (typically we are interested in predictive results that at least roughly resemble the tracking result but this may not always be the case).
I think this difference between walking and moderate running (3.17 m/s is ~8.5 minutes/mile) may be bad news for people wanting to simulate "fast" sports-type motions like jumping and sprinting (might need lots of intervals). The 200-interval predictive walking simulation took 34 hours on my laptop.
Is it possible to give Moco in its present version a "maximum speed" goal? This doesn't appear to be a specific goal currently, but I thought I could possibly make up some impossible synthetic data (e.g. an impossibly far pelvis translation) and ask it to try to track that with periodicity. Has anyone tried this or something similar?
I haven't examined the variables suggested by Chris and Aaron yet but will get to that eventually and update with that info. I don't plan to try to publish this (doesn't really seem like "science") but I think it's helpful technical information for deciding on an important parameter. I may write something up and just leave it on bioRxiv.
Ross
- Simon Jeng
- Posts: 87
- Joined: Fri Sep 07, 2018 8:26 pm
Re: Grid convergence study
Hi Ross,
In my research (predicting squatting), I would also like to find a suitable grid density as you did. The numbers of mesh intervals I chose were 12, 25, 50, 75, and 100. The convergence tolerance and the constraint tolerance are both 1e-4. The maximum number of iterations is 10000. I did not get a stable objective function value across the different grid densities, as you can see in the attached figure. The x-axis is the number of mesh intervals and the y-axis is the objective function value.
What may cause this? I find that my objective function value is small (on the order of 1e-2). Is it perhaps because the convergence tolerance is too large, relative to the objective, to converge to a good solution?
I tried to reduce the convergence tolerance to 1e-5, but the solver did not succeed: restoration failed. What does it mean?
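For reference, the settings described above map onto Moco's direct collocation solver roughly as follows. A sketch, assuming `solver` is the object returned by study.initCasADiSolver(); the helper name `apply_grid_settings` is made up for illustration, and the defaults simply repeat the values quoted in this post.

```python
def apply_grid_settings(
    solver,
    num_mesh_intervals: int,
    convergence_tol: float = 1e-4,
    constraint_tol: float = 1e-4,
    max_iterations: int = 10000,
):
    """Apply one grid density and the optimizer tolerances to a Moco solver.

    `solver` is assumed to expose Moco's direct collocation setters,
    e.g. the object returned by study.initCasADiSolver().
    """
    solver.set_num_mesh_intervals(num_mesh_intervals)
    solver.set_optim_convergence_tolerance(convergence_tol)
    solver.set_optim_constraint_tolerance(constraint_tol)
    solver.set_optim_max_iterations(max_iterations)
    return solver


# Example sweep over the grids listed above (requires OpenSim Moco):
# for n in (12, 25, 50, 75, 100):
#     apply_grid_settings(solver, n)
#     solution = study.solve()
```

Keeping the tolerances identical across the sweep means that differences in the objective can be attributed to the grid rather than to the stopping criteria.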
Thanks for your help in advance.
Best,
Simon
- Attachments
- objective function value under different mesh intervals.png (7.88 KiB)
Last edited by Simon Jeng on Sun Apr 18, 2021 8:40 am, edited 1 time in total.
- Ross Miller
- Posts: 375
- Joined: Tue Sep 22, 2009 2:02 pm
Re: Grid convergence study
Hi Simon,
There are lots of factors that can affect this. I don't know enough about your problem to say if ~10% changes in the cost function score are meaningful. For example, a 10% increase in sprinting speed or 10% decrease in metabolic cost is pretty meaningful. 10% decrease in mean squared tracking error is generally not meaningful (5 degrees vs. 4.7 degrees).
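The 5 vs. 4.7 degrees figure follows from the square root that relates mean squared error to RMS error. A small check, assuming the quoted degrees are RMS tracking errors; the function name is illustrative:

```python
import math


def rms_after_mse_change(rms_before: float, fractional_mse_change: float) -> float:
    """RMS error after the mean squared error changes by the given fraction.

    Example: fractional_mse_change = -0.10 means a 10% decrease in MSE.
    """
    scale = 1.0 + fractional_mse_change
    if scale < 0:
        raise ValueError("mean squared error cannot become negative")
    # RMS is the square root of MSE, so it scales with sqrt of the MSE ratio.
    return rms_before * math.sqrt(scale)


if __name__ == "__main__":
    print(round(rms_after_mse_change(5.0, -0.10), 2))  # 4.74 degrees
```

So a 10% drop in the squared error is only about a 5% drop in the error itself.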
On your restoration question and the meaning of exit messages, I would check the IPOPT documentation, e.g.:
https://coin-or.github.io/Ipopt/OUTPUT.html
Ross
- Pagnon David
- Posts: 86
- Joined: Mon Jan 06, 2014 3:13 am
Re: Grid convergence study
Dear Ross,
Did you happen to write this up and publish it on bioRxiv (or somewhere similar)?
I may want to cite it at some point.