Review of summary document providing Committee overview

The Committee on Credible Practice of
Modeling & Simulation in Healthcare aims to establish a task-oriented collaborative platform to outline good practice of simulation-based medicine.

Re: Review of summary document providing Committee overview

Post by Ahmet Erdemir » Sun Apr 28, 2013 5:22 am

jbarhak wrote: 1. Under the Clinical urgency slide please consider adding the following bullet
Current computing technology can now replace many human tasks and decisions. It is important that the ability of computers is neither exaggerated nor diminished. It is important to gauge this transition of tasks from human to machine in a manner that will be most efficient while diminishing negative phenomena. Establishing the credibility level of models will help smooth this transition.
This is a valid point; as computing technology replaces the human decision-making process, it may increase objectivity. Let me clarify: when automated model development is possible, it will likely remove a point of error, i.e., an analyst gluing different parts of the model together.
jbarhak wrote: 2. Please consider the following topic under the slide: Charge
Identify and promote innovative game changing modeling technologies
I will rephrase this as "identify and promote innovative, game-changing technologies for establishing model credibility".
jbarhak wrote:
3. Under the slide titled Propose guidelines and procedures for credible practice
Endorse methods that directly tie claims to results
This seems rather hard in biological disciplines, particularly if we want the "results" to be classified as clinical outcomes. When going from claims to results, it is possible that multiple methodologies can be used, individually or in combination, and both approaches may work. Proposing guidelines and procedures for credible practice is a step towards what you mention.
jbarhak wrote:
4. Under the slide titled Promote good practice:

Reward Self Criticism: Suggest methods and promote environments that allow admitting failure to speed up the development cycle.
Yes, indeed. Hopefully we will implement such a culture.


Re: Credibility and reproducibility

Post by Ahmet Erdemir » Sun Apr 28, 2013 5:32 am

jbarhak wrote:
Your reply contains sufficient clarifications to better explain your thoughts.
To keep things short, here are some characteristics that give credibility points:

- Reproducibility
- Publicly available Test Suite
- Documentation with examples
- Good service indicated by Responsiveness of developers
- Improvement with versions
- Open error reporting
- The system is blind tested
- The system is competitive compared to other systems
- Traceability of data to its source
- Open source
I agree with most of the characteristics listed above, yet I have some specific comments on the following, particularly within the context of computational models (not necessarily software):

- Publicly available Test Suite
Having a publicly available test suite may not be useful at all if the software required to run a model is not itself publicly available. Also, simulation with a model may require high-performance computing, which may diminish the utility of a publicly available test suite.

- Good service indicated by Responsiveness of developers
This is ideal yet can be problematic for biological models. It seems to me that the products of the modeling and simulation community are niche (there is not a big group of developers behind them, yet when translated they may have widespread use). This is rather different from the products of the open-source software community, which are, in a sense, commodities.

- The system is blind tested
This is ideal yet may not be possible. I am trying to think of healthcare economics as well. A premise of modeling and simulation (other than discovery) is potentially to increase efficiency, and sometimes a good balance between the level of efficiency and the level of credibility may be established.

- The system is competitive compared to other systems
This again relates to healthcare economics: how can we minimize cost while increasing effectiveness and reliability?


Re: Regarding the issue of "intended uses"

Post by Ahmet Erdemir » Sun Apr 28, 2013 5:46 am

huntatucsf wrote: 1. Does the Committee need some way to shrink the modeling & model use case space on which it will focus? Are we focused on uses where simulation results are essential?
I would like to be more inclusive and learn from other disciplines, different modeling techniques, and a variety of use cases. Of course, this may get out of hand, but maybe we can categorize the use cases and work on one at a time.
huntatucsf wrote: 2. One-on-one discussions within the past year with individuals doing M&S work within big Pharma brought to light many model "uses" that are often ignored. …the _real_, primary reason the work was done. Examples follow. The M&S work was done (I'm paraphrasing)…
• to bolster my qualifications for a new position (or promotion)
• because [someone higher up or the "outside consultant"] insisted on it
• to enhance publishability
• because we knew that it would distract the FDA from these other issues
• to make a messy situation look better
• because it had proven effective as a means to "absorb" "what about X" questions during quarterly reviews
Etc.
huntatucsf wrote: Such hidden uses impact credibility. Do we ignore these issues or confront them?
You have a point. I think such uses of models, i.e., modeling and simulation for marketing and promotion, exist regardless of the discipline. It would not surprise me if we are all to blame for doing something similar at some point in our careers. These hidden uses are apparent to the immediate modeling and simulation community of the discipline of interest, yet not as much to those removed from that level, e.g., administrators, clinicians, regulators, etc. We shouldn't ignore these issues; we should increase awareness of such uses as we realize one charge of the Committee - promoting good practice.


Re: Review of summary document providing Committee overview

Post by Martin Steele » Mon Apr 29, 2013 9:26 am

jbarhak wrote: 1. Under the Clinical urgency slide please consider adding the following bullet
Current computing technology can now replace many human tasks and decisions. It is important that the ability of computers is neither exaggerated nor diminished. It is important to gauge this transition of tasks from human to machine in a manner that will be most efficient while diminishing negative phenomena. Establishing the credibility level of models will help smooth this transition.
aerdemir wrote: This is a valid point; as computing technology replaces the human decision-making process, it may increase objectivity. Let me clarify: when automated model development is possible, it will likely remove a point of error, i.e., an analyst gluing different parts of the model together.

I must say I'm amazed at where this part of the conversation has gone: Computers replacing human decisions? This is essentially required in some instances (where humans either cannot adequately keep up or keep going): launch vehicle final countdown and ascent, airplane auto-pilot. These are only accomplished in well-defined/constrained domains, and with people watching or nearby. Shouldn't we start with computer (or M&S) informed decision making before turning our world over to the machines? {... at least until we're Three Laws Safe}


Re: Credibility and reproducibility

Post by Martin Steele » Mon Apr 29, 2013 10:24 am

jbarhak wrote: Credibility points:

- Reproducibility
- Publicly available Test Suite
- Documentation with examples
- Good service indicated by Responsiveness of developers
- Improvement with versions
- Open error reporting
- The system is blind tested
- The system is competitive compared to other systems
- Traceability of data to its source
- Open source
Qualification of my responses: my work with NASA’s Standard for Models & Simulations, which includes a credibility assessment, focuses my thoughts on the “credibility of M&S results.” This effort is on the “credible practice of M&S.” There is a difference, but I may confuse them on occasion.

Jacob – first, to your credibility points:
Reproducibility – To what level is reproducibility required? I can re-run my M&S-based analysis, or I can have a separate, isolated, and independent team replicate the model and analysis on a different hardware and software platform. The latter is safest, but more costly & time-consuming.

Good service indicated by Responsiveness – this is not required for M&S results (or the practice of M&S) to be credible, even though it is a preferred trait. You can receive a "good answer" from a cantankerous personality (or a hard-to-use system), and it can be late (or slow), while still having high credibility.

Competitive compared to other systems AND Open Source – again, this is not required for the practice of M&S to be credible, even though it is a preferred trait. A single proprietary practice of modeling & simulation can be credible.

As you may know, NASA developed a Standard for Models & Simulation that includes a defined assessment of credibility, as well as requirements for reporting M&S-based results. We define credibility as "the quality to elicit belief or trust in M&S results." As such, we also acknowledge that credibility is not something that can be directly determined. However, it is possible to assess key factors that contribute to a person's own assessment of credibility.

The core (minimal) set of credibility factors for M&S were determined to be:
• Verification
• Validation
• Input Pedigree
• Results Uncertainty
• Results Robustness
• Use History
• M&S Management
• People Qualifications
• Technical Review

The reporting requirements for M&S-based results must include:
a. Any un-achieved acceptance criteria.
b. Violation of any assumptions of any model.
c. Violation of the limits of operation.
d. Execution warning and error messages.
e. Unfavorable outcomes from the intended use and setup/execution assessments.
f. Waivers to any of the requirements in this standard.
g. An estimate of uncertainty in the M&S results.
h. An assessment of M&S results credibility.


Re: Comment on Committee overview

Post by Martin Steele » Mon Apr 29, 2013 11:44 am

For those of us new to the healthcare arena (other than being a client), please define "clinical outcome" and/or elaborate on "if we want the 'results' to be classified as clinical outcomes."
aerdemir wrote:Hi Tony, thank you for your comments. I tried to respond to them below:
  • I agree that for the sake of condensing the information into a few words, we may have ended up being vague. As a group we should definitely clarify what "the pressing need" is and how modeling and simulation can offer a solution. From my perspective (orthopaedic biomechanics, musculoskeletal and tissue; implants, etc.) I feel that the pressing need emerges from two conditions: 1) individualized medicine - can modeling and simulation provide the means to increase the accuracy of how we deliver healthcare? Which intervention works for what type of patient? 2) expedited delivery of healthcare products - can modeling and simulation increase the efficiency with which we design implants and simplify the efforts for their regulation? Until we all iterate back and forth on the need and the premise, I believe a statement like the one you propose will be more appropriate:
    "Modeling and simulation offers the capabilities to potentially expedite and increase the efficiency of healthcare delivery by supporting clinical research and decision making"
  • About the gap, I think we are generalizing the issue without realizing that there are indeed models translated to clinical practice. On the other hand, acceptance of published research models (to be used in clinical care) and the rate at which research models move into the clinical realm seem to be lagging behind (at least in my discipline). I am just wondering if new mechanisms and processes are needed to address what is essentially a "valley of death" problem in translational research. For now, we may want to at least be specific and refer to the models as "research models".
Ahmet


Re: Credibility and reproducibility

Post by Martin Steele » Mon Apr 29, 2013 12:46 pm

jbarhak wrote: ######## The Long story ########

First, Martin, your definition of "computational analysis" is more inclusive than our limited context of simulation and modeling in healthcare. And I do think it is reasonable to demand reproducibility to establish credibility.
I do find reproducibility important to the credible practice of M&S. As a part of my orientation to this team, I’ve now recognized the difference between “the credible practice of modeling & simulation” and “the credibility of M&S-based analysis results.” I would propose that the credible practice of modeling & simulation produces credible M&S results.
jbarhak wrote: In cases where it is more difficult to reproduce, at least the source code and intermediate data should be frozen to allow tracing results backwards.
And so, the credible practice of M&S should include Data, Product, & Process Management.
jbarhak wrote: Here is an example I had experience with: Monte Carlo simulations on a cluster of computers. Each time I launched the simulation the results would be different, since each machine was running a different random seed. To help reproducibility, there is a directory that holds all random states and the source code that ran. This allows the code to be re-run later with the saved seeds to reproduce the same results. I used this source-code traceability for debugging. It was useful in finding some issues that could not be explained by the normal statistical techniques typically used with Monte Carlo simulations. Therefore the ability to trace results back to the original model is superior to statistical analysis. It is reasonable to demand this these days.
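For illustration, a minimal sketch of that directory-of-frozen-states scheme might look like the following; the file names, run layout, and toy Gaussian model are hypothetical placeholders, not Jacob's actual setup:

```python
import pickle
import random
import shutil
from pathlib import Path


def run_frozen(run_dir: Path, script: Path, n_samples: int) -> float:
    """Freeze the random state and the driving script, then run a toy simulation."""
    run_dir.mkdir(parents=True, exist_ok=True)

    # Freeze the exact random state used for this run so it can be restored later.
    (run_dir / "random_state.pkl").write_bytes(pickle.dumps(random.getstate()))

    # Freeze a copy of the source code that produced the results,
    # so results can be traced backwards to the code that made them.
    shutil.copy(script, run_dir / "frozen_source.py")

    # Stand-in Monte Carlo step; the real model would go here.
    result = sum(random.gauss(0.0, 1.0) for _ in range(n_samples)) / n_samples
    (run_dir / "result.txt").write_text(repr(result))
    return result


def rerun(run_dir: Path, n_samples: int) -> float:
    """Restore the frozen state and regenerate exactly the same result."""
    random.setstate(pickle.loads((run_dir / "random_state.pkl").read_bytes()))
    return sum(random.gauss(0.0, 1.0) for _ in range(n_samples)) / n_samples


if __name__ == "__main__":
    out = Path("runs/run_001")
    first = run_frozen(out, Path(__file__), n_samples=10_000)
    assert rerun(out, n_samples=10_000) == first  # bit-for-bit reproducible
```

On a cluster, each machine would simply get its own run directory with its own frozen state, which is essentially the scheme described above.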

This brings me to the next point Martin mentioned - how do you know whether a piece of software is any good. If the software is supplied with tools to test its integrity and with sufficient reproducible examples, then the customer/user at least has the capability to assess the credibility of the software tools.
And versioning is part of the game - a very important part. Each version of a model/software should be better than the previous one. I believe you will find that Tony Hunt supports this point - he refers to this as "model falsification" - Tony, please correct me if I am wrong. If a test suite is attached to each version, you can make sure the tests pass, and a newer version should pass more tests than the previous one.

Note that versioning tools are available today and even this committee uses such a tool to hold its documents. Also, there are systems today that help cope with multiple versions of multiple dependent software tools.

So versioning should not be seen as a hazard to credibility. On the contrary, a system that has many versions and rapidly responds to demands and evolves/develops quickly should be given credibility points.
As for critical applications, the best stable version should be frozen until a better version can pass all the tests. Think of test-driven development, where the tests are written first and the model/software must accommodate them.
We must be careful here to distinguish between a true version and maintenance releases. Too many maintenance updates can also imply inattention to detail, lack of testing, and/or lack of regression testing on the application-software side. I would also worry about rapid versioning. For example, once you have a model running and a new version of the software application gets released, regression testing is forced on the model developers. For critical M&S, every change in the application software (either a new version or a maintenance update) requires regression testing of the M&S.
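To make that regression idea concrete, here is a minimal sketch of a test that pins a model's output to a frozen baseline; the model function, baseline file, and tolerance are hypothetical placeholders, not part of any particular standard:

```python
import json
import math
from pathlib import Path

# Frozen outputs from the last accepted version of the model (hypothetical file).
BASELINE = Path("baselines/contact_model_v1.json")


def contact_pressure(load_newtons: float) -> float:
    """Stand-in for the real model; the point is the regression harness around it."""
    return 0.012 * load_newtons ** 1.1


def test_against_frozen_baseline(rel_tol: float = 1e-9) -> None:
    """Every new version or maintenance update must reproduce the frozen baseline."""
    baseline = json.loads(BASELINE.read_text())
    for case in baseline["cases"]:
        predicted = contact_pressure(case["load_newtons"])
        expected = case["expected_pressure"]
        assert math.isclose(predicted, expected, rel_tol=rel_tol), (
            f"Regression at {case['load_newtons']} N: got {predicted}, baseline {expected}"
        )


if __name__ == "__main__":
    test_against_frozen_baseline()
    print("All baseline cases reproduced within tolerance.")
```

With a suite like this attached to each release, "a newer version passes more tests" becomes a checkable claim, and a maintenance update that silently changes results is caught immediately.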
jbarhak wrote: Many systems today, such as operating systems, are constantly updated with new versions/patches. These constant updates are a sign of their credibility. If such a versioning and correction mechanism is not active, then there is a problem and the system should be doubted. It is very much like buying a car that no one can service.

Almost no software system today is perfect. Yet having a way to correct it is essential. The software should be as good as the demands placed on it, and in many cases in modeling those demands can be coded. The rest - such as graphical user interfaces - can still be tested by humans.

Martin is correct to cast some doubt on software systems. Even the same source code may not run the same in different environments. I can give all sorts of examples. Nevertheless, humans do trust some computer systems. The question is what demands we should make of a system to tag it as credible.

Remember that the model/software just has to be better than what we already have today - it is always possible to set up a competition to test this.

The list above is what I would answer. You are welcome to edit this list based on your own knowledge/experience.


Re: Review of summary document providing Committee overview

Post by Jacob Barhak » Mon Apr 29, 2013 5:56 pm

Hi Martin, Hi Ahmet,

Both of you have valid points. Repetitive human jobs are a good target for automation, yet this can be extended to some decisions as well.

Here is one example that may clarify things: medical devices that recommend dosage. This is traditionally something a human doctor did, and an algorithm is now replacing this decision-making process.

The doctor in this case is not replaced or deprived of a decision. His decision is different now - elevated to a new level. The healthcare provider has to decide on the best/most fitting medical device for the patient. The human doctor is not being replaced - on the contrary, the human gets better tools for control.

So how do you evaluate the credibility of new technology/model/algorithm? For sure some of the traditional methods of testing apply, yet guidelines should be aware of the newer technology.

To keep things simple and to avoid repetition, I am merging the replies to several posts by Ahmet and Martin.

It seems that there is a gap between what is ideally desirable and what is possible to achieve within given resources and time. Ahmet and Martin are right: there are compromises to be made, especially since biological systems are much more uncertain. Tony has defined this well in figure 1 in http://www.imagwiki.nibib.nih.gov/media ... Poster.pdf. You can also find it in the following paper: C. Anthony Hunt, Glen E. P. Ropella, Tai Ning Lam, Andrew D. Gewitz. Relational grounding facilitates development of scientifically useful multiscale models. Theor Biol Med Model. 2011; 8:35. Published online 27 September 2011. DOI: 10.1186/1742-4682-8-35. PMCID: PMC3200146.


When reading my comments, please place me on the extreme right side of the diagram - chances are that I will be pulling the committee there. Like Martin, I would be interested in having people who work with biological models from the left side of the diagram join the discussion, so that the contributions encompass more ideas.

I generally agree with both Ahmet and Martin; there are not too many differences in our points of view - the terminology is a bit different, yet it seems we have similar ideas. It is important that we have this discussion so we can map things out for the future.

And Martin, since you are asking about clinical outcomes: at least in the context of health states, I can refer you to the following definitions of diseases and health problems. I am not sure this is what you asked for, yet this is at least one definition of clinical outcomes:
http://en.wikipedia.org/wiki/ICD-9
http://en.wikipedia.org/wiki/ICD-10

I hope that bundling all the replies in one post is helpful.

Jacob


Types of Modeling & Simulation in Healthcare

Post by Martin Steele » Tue Apr 30, 2013 6:28 am

What types of modeling & simulation does healthcare do?
In other words:
What M&S methodologies/disciplines are used?
How is the represented part of the healthcare system modeled?
Providing this, perhaps with a one-liner example of each, will help us understand the context and breadth of the domains discussed.


Re: Review of summary document providing Committee overview

Post by Martin Steele » Tue Apr 30, 2013 6:31 am

All:
One of the things I’m finding interesting is the terminology this community is using. In several of my latest efforts, I’ve found the compilation/creation of a domain glossary/lexicon very important; if it’s not done early, we’ll discover later that we should have done it sooner. Perhaps we can designate an IMAG/MSM glossary/lexicon location within this site.

Jacob, from your last post (not quoted here):
The web links to which you referred for understanding the term ‘clinical outcome’ were not entirely helpful to me. May I infer that a clinical outcome is “a result that may be used in routine practice (i.e., in the field or in the clinics where it will be put into everyday practice)”?

POST REPLY