Current State

In vivo data is available at https://multisbeta.stanford.edu/.

In vitro data management site is currently undergoing testing (https://mobilizealpha1.stanford.edu/).

Please see /Discussion for details on the current state of the data management specifications/infrastructure.

Data Management Overview

Target Outcome

1. Automatically push data from its origin to a central data analysis computer. After analysis, data will be manually pushed to the Stanford and MIDAS sites.

Local Data Management

Stages

A specified folder on each computer will be automatically pushed to the server.
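
The automatic folder push above could be handled by an off-the-shelf sync tool (e.g., rsync); as a minimal sketch, assuming the server share is mounted as a local path, the equivalent behavior in Python might look like this (the function and path names are hypothetical):

```python
import shutil
from pathlib import Path

def push_folder(src: str, dst: str) -> list:
    """Mirror new or updated files from the watched folder to the server path.

    Returns the relative paths that were copied on this pass.
    """
    copied = []
    src_path, dst_path = Path(src), Path(dst)
    for f in src_path.rglob("*"):
        if not f.is_file():
            continue
        rel = f.relative_to(src_path)
        target = dst_path / rel
        # Copy only files that are missing or newer than the server copy.
        if not target.exists() or f.stat().st_mtime > target.stat().st_mtime:
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, target)  # copy2 preserves timestamps
            copied.append(str(rel))
    return copied
```

In practice this would run on a schedule (cron or Task Scheduler) or from a file-system watcher; a tool like rsync over SSH would add resumable transfers and checksum verification on top of this basic behavior.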

Folder Configuration

In Vivo

Raw Data Collection:

Data Analysis folders appended:

In Vitro - Ultrasound

Raw Data Collection:

Data Analysis folders appended:

In Vitro - Surgical Tools

TBD: Instrumented Surgical Tools (work in progress)

PART I: Raw Data

Target Outcome

1. To build a web-based data management system for the organization and dissemination of raw data.

Proposed web-based data management and databases

Use Cases

1. Upload and reorganization of raw MRI files in DICOM format [project administrator actions]

  1. Assume files are organized in a directory structure off-line, which is to be uploaded and replicated in the web-based system. Implement the upload as a drag-and-drop operation.
  2. Meta-data will be automatically derived from a wiki page or README by scanning for certain keywords (like gender or age). Users can update the meta-data after import (or enter it manually if none is automatically imported). Meta-data is associated with all child files. Meta-data may also come from file or folder names; the README could include a key explaining the filename convention.
  3. Once files are in the web-based system, they can be moved around to different folders or deleted. New files and folders of files can be added to the existing directory structure.
  4. Folders can be added, deleted, or moved.
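
The keyword scan described in step 2 could be sketched as follows; the field list and the "key: value" line format are assumptions, since the actual README conventions are still to be decided:

```python
import re

# Assumed queryable fields; the real list would come from the study design.
FIELDS = ("gender", "age", "health condition")

def extract_metadata(readme_text: str) -> dict:
    """Scan README text for 'key: value' (or 'key = value') lines
    whose key matches a known meta-data field."""
    meta = {}
    for line in readme_text.splitlines():
        m = re.match(r"\s*([\w ]+?)\s*[:=]\s*(.+)", line)
        if m and m.group(1).strip().lower() in FIELDS:
            meta[m.group(1).strip().lower()] = m.group(2).strip()
    return meta
```

Lines whose key is not in the known field list are ignored, which leaves room for the user to review and fill in any meta-data the scan missed.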

2. Dissemination of raw data [data user actions]

  1. Example queries would require searches based on gender, age, data type (in this case MRI data, though others may only be interested in mechanical data), and so on. Is the following list sufficient to start?
    1. Gender
    2. Age
    3. Health condition (arthritic or not)
    4. Data type, e.g., if you want to analyze tissue thickness across a population, then you want to find all datasets with MRI data.
    5. Cadaver vs. living human subject
    6. Licensing restrictions
  2. Should queries work across studies or within a study? In other words, will you be uploading all your data as one study so searches are just done within your study? Start with queries within a study. Later, when we have more studies, we can consider supporting cross-study queries.
  3. Have the ability to "select all" or to check off boxes indicating which data from specific subjects to download. Download the data as a zip file containing the files in a directory structure.
  4. Provenance information should be included somehow. At a minimum, this should include the date (and revision number, if available) and the location from which the data was obtained, included in the download as a README file.
  5. Licensing info also needs to be provided with the downloaded zip file.
  6. When moving files from the storage stage to the dissemination stage, we may want to add licensing info to just a portion of the data.
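
Steps 3 through 5 above (a zip download with a provenance README and licensing terms) could be sketched as follows; the function signature and the README/LICENSE file names are illustrative, not a settled design:

```python
import io
import zipfile
from datetime import date

def build_download(files: dict, source_url: str, license_text: str) -> bytes:
    """Bundle the selected files into a zip, preserving their directory
    structure, plus a provenance README and the licensing terms."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for rel_path, data in files.items():
            zf.writestr(rel_path, data)
        # Minimum provenance: download date and source location
        # (a revision number could be added here when available).
        readme = (
            "Downloaded: " + date.today().isoformat() + "\n"
            "Source: " + source_url + "\n"
        )
        zf.writestr("README.txt", readme)
        zf.writestr("LICENSE.txt", license_text)
    return buf.getvalue()
```

Building the archive in memory keeps the sketch simple; for subjects approaching 1 GB of data, a real implementation would stream the zip to disk or to the HTTP response instead.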

Minimum Required Functionality

Desired End-User Functionality

Desired System Specifications

1. Ability to handle large datasets. In this project, the expectation is that there will be about 1GB of data per subject.

2. Existing functionality for file management would be ideal, so this wouldn't need to be recreated.

Preliminary Work

Workflow

Check FEA-workflow.pdf, which provides a detailed workflow for finite element analysis in biomechanics. Various steps within the workflow indicate the need for a relevant platform to organize and disseminate raw data. Also refer to FEA-anatomy.pdf.

File Browser Software Analysis

Requirements

- Licensing: BSD or MIT or equivalent Open Source license that allows for commercialization

Candidates

Feature Requests

May 6, 2016

Outstanding Questions

1. Is the ability to set permissions at the study level sufficient? No; permissions are needed at a more granular level.

2. Should we create the ability to import meta-data from a wiki page? Yes, eventually.

3. What fields do you think users would want to query on? See above

4. Will you be uploading all your data as one study so searches are just done within your study? Yes.

5. What provenance information is required? What is the best way to include it in the downloaded data: as an additional README, in the header of each file, or some other way? This decision will affect the handling of derivative data and associating it back to the parent data when it is uploaded. Incorporating the info in the header of every file is really tricky; it is better to advise and strongly encourage people to keep the README around.

PART II: Derivative Data

Target Outcome

To build web-based databases for the organization and dissemination of derivative data and their association with raw data.

Use Cases

Minimum Required Functionality

Desired End-User Functionality

Desired System Specifications

Preliminary Work

Check FEA-workflow.pdf, which provides a detailed workflow for finite element analysis in biomechanics. Various steps within the workflow indicate the need for a relevant platform i) to process raw data to bring it into a form readily usable for modeling & simulation, and ii) to build relational databases to organize and disseminate derivative data. Also refer to FEA-anatomy.pdf.

Sample Data

This zipped folder contains a subset of the in vivo data collected, to be used for final modifications to the data management interface.

Specifications/DataManagement (last edited 2018-11-09 16:36:32 by ricimorrill1)