Page 1 of 1

GCVSplineSet errorVariance

Posted: Fri Jan 05, 2024 1:32 pm
by kernalnet
Hi,
I'm trying to fit a spline to the generalized coordinates and apply some smoothness with GCVSplineSet. Any suggestion what would be the best value for errorVariance? And how to find it?
The computation speed varies depending on the data characteristics. Even with a small input value, some signals may require a long time to process.
errorVariance
Estimate of the variance of the error in the data to be fit. If negative, the variance will be estimated. If 0.0, the fit will try to fit the data points exactly- no smoothing. If positive, the fits will be smoothed according to the specified variance. The larger the error variance, the more the smoothing. Note that this is the error variance assumed for each column in the TimeSeriesTable. If different variances should be set for the various columns, you will need to construct each GCVSpline individually.
Thank you,
-Mohammadreza

Re: GCVSplineSet errorVariance

Posted: Sat Jan 06, 2024 3:48 pm
by johnjdavisiv
I'm pretty sure the "errorVariance" is a just a regularization parameter: a smoothing penalty such that, the greater it is, the smoother (less "wiggly") the spline fit is. The "GCV" stands for generalized cross-validation, which based on my read of the docs is how the method chooses the errorVariance parameter in situations where you specify a negative number. It's (almost) identical to the kind of leave-one-out cross-validation methods you see in predictive modeling/machine learning papers.

The "correct" value for errorVariance will depend quite a lot on the data, how noisy it is, what units it is measured in, and so on. You can think of it as being like the cutoff frequency of a butterworth filter. The GCV method for choosing the amount of smoothing has some nice theoretical properties though so it would be a good default.

Maybe a bigger question, though, is whether GCVSplines are the right choice at all? The only advantage of the GCVSplines over a standard butterworth filter is that you can use them when you have irregularly-spaced datapoints, but that's pretty unusual for most biomechanical data (and is often fixable by just using interpolation).

When OpenSim calls this spline method for its own internal purposes I'm pretty sure it doesn't apply any smoothing at all (errorVariance = 0), and just uses it as essentially a datapoint interpolation method, e.g. for ground reaction force and motion capture data when you need their value at some intermediate timepoint (like when taking a forward step with the CMC algorithm).

To me at least, a butterworth filter with a known cutoff frequency is a lot more interpretable. Seems like the modern approach to smoothing is to filter conservatively, if at all, and lean on more "biomechancially-based" methods like inverse kinematics to implicitly smooth, in a way that's more in keeping with what we know (or assume) about the body. There is some really good discussion re: these points here on Biomch-L about filtering.

It might help to know your workflow and what you need the filtering/smoothing for.

Re: GCVSplineSet errorVariance

Posted: Sat Jan 06, 2024 5:30 pm
by kernalnet
Hi John,
(Happy new year!!!)

Fair enough; thanks for your response. I'm studying static/dynamic optimization. I already know the methods of finding the optimal cut-off frequency. But here, the errorVariance argument sometimes cannot finish the processing. For example: the output with errorVariance of 0, 0.04, and 0.05 are identical, but it freezes on 0.06 or more. I will test more and take some characteristics of the signal into account.

Actually, I found GCVSplineSet class pretty useful; not only for smoothing, but also for differentiating (what I need) and time interpolating. Regarding the last one, it is helpful when you have marker data sampled at, e.g., 60 or 120 Hz which is frequently used. In this case, as you said, irregularity in time intervals would occur when they are rounded (1/60=0.01666...). An example is the data associated with Gait2392 model. If you run IK and ID tools and compare their time columns, you will find that the time frames are slightly different. When you set the irregular times to the State, it may take the GRF (x10 or x20 sampling frequency) at incorrect times. So, it seems good practice to fix this irregularity. But I admit that in inverse methods this is not significant at all, as you mentioned.

Applying a low-pass Butterworth filter on the TimeSeriesTable before using GCVSplineSet sounds a good alternative for now.

Thanks again,
-Mohammadreza

Re: GCVSplineSet errorVariance

Posted: Mon Jan 08, 2024 12:30 pm
by bogert
I have a lot of experience with the GCV spline code (received it in 1985 on a 9-track tape!).

The Fortran source code is still available on Netlib at https://www.netlib.org/gcv/gcvspl. After the first page or two, look at the "SUBROUTINE GCVSPL" with good technical information. There is an accompanying paper with the full mathematical details [1]. Tests and comparisons with other methods are in [2].

GCV stands for "Generalized Cross Validation" and it is a method for determining the optimal smoothing parameter (p). It is somewhat similar to other model order criteria such as BIC and AIC. Like those, GCV is asymptotically equivalent to "leave one out" cross validation, but requires less computation.

The original GCVSPL code can be run in various modes. The simplest is to directly specify p. Other modes are GCV and "known variance". Oddly, OpenSim only seems to provide the "known variance" mode. So although the tool is named "GCVSpline", the GCV criterion is not supported. Someone please correct me if I am wrong. It would be good if GCVSpline would also support the other modes.

However, after experimenting with the GCV criterion on real data, a long time ago, I concluded that it tends not to smooth enough for using the accelerations. So you have to play around with the smoothing anyway. Also, as John mentioned, and Mohammedreza concluded, it is better to use traditional frequency domain methods where the cutoff frequency is the smoothing parameter. It works just as well and is better understood.

The "known variance" option is attractive for motion capture, because often we know the noise level of the data, e.g. 1.0 mm, where the variance would be 1e-6 m^2. Then the GCVSPL code will find the smoothing parameter p such that the RMS difference between raw and filtered data is exactly the square root of the specified variance.

The smoothing parameter p has a specific meaning. A smoothing spline is the solution to a trajectory optimization problem where the cost function has two terms: data tracking and smoothness. The parameter p is the weighting of the second term. For a cubic smoothing spline, smoothness is the integral of the square of the acceleration. Wikipedia has a good explanation: https://en.wikipedia.org/wiki/Smoothing_spline.

What is not well known is that a cubic smoothing spline has the same frequency response as a double 2nd order Butterworth filter! I have actually verified this by running artificial sine waves through GCVSPL and Matlab's Butterworth filter. The only difference is in the boundary conditions. A cubic spline assumes zero acceleration at the endpoints. Matlab's filtfilt uses the boundary conditions in [3] (which I have not read). And, as John mentioned, splines can be used with non-equidistant input data.

I have attached the memo that Herman Woltring distributed to GCVSPL users in 1986.
gcvspl_memo.txt
(3.6 KiB) Downloaded 62 times
. It provides an equation for the relationship between smoothing parameter p and the cutoff frequency w0 of the equivalent Butterworth filter. My sine wave filtering experiments verified that the equation is correct.

When the GCVSPL code is run in GCV or "known variance" modes, it tells you which optimal value of p it found after the optimization. You can then convert this into an equivalent cutoff frequency using that formula. This would be useful to have available in the OpenSim API.

Ton van den Bogert

[1] Woltring HJ (1986), A FORTRAN package for generalized, cross-validatory spline smoothing and differentiation. Advances in Engineering Software 8(2):104-113 (U.K.).

[2] Woltring HJ (1985) On optimal smoothing and derivative estimation from noisy displacement data in biomechanics. Human Movement Science 4, 229-245.

[3] Gustafsson (1996) Determining the initial states in forward-backward filtering. IEEE Transactions on Signal Processing 44: 988-992. https://doi.org/10.1109/78.492552.

Re: GCVSplineSet errorVariance

Posted: Tue Jan 09, 2024 7:08 am
by kernalnet
Hi Dr. van den Bogert,

Thank you so much for your comprehensive response. Also, thanks for sharing your valuable experiences here and I do appreciate it.

I did a few more tests:
  • I think the (wrong: "known variance") GCV mode can be implemented in OpenSim GCVSplineSet if I set the errorVariance to a negative value. As you mentioned, the smoothness won't be sufficient.
  • If I increase the errorVariance to a positive value, it really does a great job in smoothing, but the few initial and last frames would deviate from the original signal, as you mentioned
Overall, in terms of smoothing, I would opt the Butterworth filter with paddings; thanks john and Ton.

Only one more question; sorry if it sounds silly. If I want to calculate the accelerations from Inverse Kinematics output using GCVSplineSet and use it for acceleration-matching static optimization for example, how many smoothing steps do you recommend? Only the coordinates value, only the calculated accelerations, or both of them (i.e., before and after differentiation).

Please let me know your thoughts.

Thanks again,
-Mohammadreza

Re: GCVSplineSet errorVariance

Posted: Tue Jan 09, 2024 11:07 am
by bogert
Mohammadreza,

You probably meant to say "the GCV mode can be implemented ... if I set the errorVariance to a negative value".

If you know the variance, you can specify it as a positive value. And the larger that value, the more difference the spline allows between smoothed and raw data. So more smoothing.

I now see inthe API documentation: "If negative, the variance will be estimated". It does not say which method is used to estimate the variance, but it is probably the GCV criterion.

Ton

Re: GCVSplineSet errorVariance

Posted: Tue Jan 09, 2024 1:12 pm
by kernalnet
Oh, thank you so much for correcting me, I just edited that mistake.

According to the source codes and comments, OpenSim GCVSpline only supports fitFromGCV and fitFromErrorVariance modes:
https://github.com/opensim-org/opensim- ... #L592-L595

But fitFromDOF and fitForSmoothingParameter modes are also available in Simbody:
https://github.com/simbody/simbody/blob ... h#L85-L158

Thanks again,
-Mohammadreza

Re: GCVSplineSet errorVariance

Posted: Tue Jan 09, 2024 1:48 pm
by kernalnet
Only one more question; sorry if it sounds silly. If I want to calculate the accelerations from Inverse Kinematics output using GCVSplineSet and use it for acceleration-matching static optimization for example, how many smoothing steps do you recommend? Only the coordinates values, only the calculated accelerations, or both of them (i.e., before and after differentiation).
Any recommendation for this? I really appreciate your help.
-Mohammadreza

Re: GCVSplineSet errorVariance

Posted: Thu Oct 03, 2024 10:39 am
by bogert
I missed this, the notifications from SimTk forums don't always arrive.

This reply is very late, but may still be useful for someone.

When you use GCVSpline, you generate a spline from the (X,Y) data once, with a specified smoothing setting (the variance parameter). The spline is stored as a series of coefficients for the piecewise polynomials.

When the spline is evaluated, you can ask to evaluate the spline function itself, or a derivative of the spline, at any value X. So no additional smoothing steps are needed. You keep using the same spline.

I am curious where in the GCVSpline source code the evaluation is done, I could not find a method in the class either. In the original Fortran code, which is translated to C in OpenSim, the evaluation is done by the function SPLDER. See https://github.com/opensim-org/opensim- ... pl.c#L1386

If you need velocities or accelerations, the spline will need to be more smooth than when you only need positions. The easiest way to judge this is to watch for noise in the final result of your processing. If you only look at the smoothed position, it may look fine, but the accelerations may still be too noisy. If the final result seems too noisy, more smoothing is needed. This is typically trial and error. If you are processing data from a known movement, the literature may have advice on what filter to use. For instance, 6 Hz low pass for gait marker data. You can convert the 6 Hz cutoff frequency to a smoothing parameter for the spline. That is the parameter \lambda in this definition: https://en.wikipedia.org/wiki/Smoothing_spline

Unfortunately, OpenSim appears to only have the option to specify variance, or unknown variance (GCV), not the lambda directly. If the lambda is specified directly, the spline code does not have to run an optimization and it will be a lot faster.

Because splines are equivalent to butterworth filters, it may be easier to just use a butterworth filter. This will be easier in the Matlab signal processing toolbox than with OpenSim. OpenSim seems to have a double (forward and reverse) 3rd order low pass filter (presumably Butterworth) in Signal::LowpassIIR. In biomechanics, the standard is to use a double 2nd order filter.

While looking a the Signal class, I noticed that it also has a smoothing spline that has the capability to specify cutoff frequency in Hz!

Ton van den Bogert