Recurring Meeting of Cleveland Clinic - Stanford University

Date: April 4, 2017

Time: 1:00 PM EST

Means: Conference Call

Attendees:

  1. Ahmet Erdemir (Cleveland Clinic)
  2. Tyler Schimmoeller (Cleveland Clinic)
  3. Rici Morrill (Cleveland Clinic)
  4. Joy Ku (Stanford University)
  5. Mike Wong (Consultant)

Agenda:

This is a freeform meeting to discuss public launch and promotion of the data management and query interface.

Immediate Action Items:

Notes:

  1. The team discussed issues with the launched system and elaborated on resolution strategies.
    • Rici was not able to browse and upload more data to the data management system. Mike identified the problem as related to disk space and removed old download packages to restore access and use. Mike recommended disk space of at least 5 times the anticipated data size. For in vivo testing, the Cleveland Clinic team expects a total data size of 200 GB. Therefore, 1 TB of disk space may be necessary. In response to a question, the Cleveland Clinic team noted that the in vitro testing data will be of a similar size to the in vivo testing data. That data set will contain fewer segments but additional imaging data acquired through CT and MRI.
    • Joy noted that the server is a virtual machine; they can request and add space. She noted that they already monitor these virtual machines (memory and CPU usage) for SimTk and that all of that infrastructure will need to be moved over to this project in the future.
    • Ahmet also recommended that snapshots created to facilitate download of the whole data set be removed regularly, i.e. whenever a new snapshot is created, the older one can be removed. This can be adopted as a policy and may help alleviate the disk space issue. There is already a policy to delete prepared download packages after three days. Ahmet also noted that additional measures may need to be put in place to prevent a large number of requests for preparation of download packages.
    • The Cleveland Clinic team informed the developers that they would be using the feature request and bug tracking system as they encounter potential features or new bugs.
    • Rici asked about the timeline for putting the rest of the in vivo testing data into the system. Mike can quickly update the system to automatically delete old snapshots. Joy will keep the Cleveland Clinic team posted about other updates on the virtual machine.
    • Mike asked Rici if she found the interface for including the accepted trials helpful. Ahmet noted that the Cleveland Clinic team needs to provide documentation to the users to explain what an "accepted" trial means.
    • Rici pointed out that, when querying, none of the ultrasound images were downloading. Mike noted that if derivative data are selected, only derivative data are returned. Nonetheless, when derivative data are not selected, the system should provide the ultrasound images as well. Mike will confirm whether that is the case. Rici will do the same.
  2. Potential uses of the data management system for sister data and for other projects were noted.
    • The in vitro testing from Operation MULTIS will also require the use of a data management and querying system. The current system, where in vivo data is uploaded, can be used to host the in vitro data. Or, an additional system can be launched. The Cleveland Clinic team has not decided on the strategy yet. It will depend on the ease of hosting and querying both data sets together.
    • Ahmet is also interested in using the system for his other projects, e.g. Open Knee(s). This will test the applicability of the data management strategy to projects with potentially different needs. The concepts of raw data, derivative data, and metadata management will likely be equally applicable.
  3. The group discussed promotion venues for the data management and querying system.
    • Ahmet recommended development of a white paper for the data management and querying system. Joy thinks that it may work. She needs to talk to Scott Delp about restructuring it. This may turn into a journal paper and may relate well to the Data Commons initiative of NIH. Another application beyond our laboratory may be needed.
    • Ahmet recommended reaching out to the Osteoarthritis Initiative (OAI), which collected large amounts of longitudinal data (images, biological metrics, clinical metrics). OAI data are currently disseminated by mailing hard drives. The data set may be a good challenge for the data management and querying system and the services that are offered through SimTk.
    • Joy asked about deadlines for meetings with the US Army. Ahmet noted that he did not get an update about the review meeting. He reminded the group that he had submitted an abstract to the Military Healthcare Research Symposium. The date and location of that conference have not been decided yet. Nonetheless, Ahmet believed that the current state of the data management and querying system would provide a good showcase.
    • The Cleveland Clinic team still needs to provide examples of navigation and querying of the data management system to assist potential users.
    • The groups also decided that external users (outside the immediate development and user base) may need to be invited for usability testing.
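
The disk space sizing and cleanup policies discussed under item 1 of the notes can be illustrated with a minimal Python sketch. This is not the system's actual implementation: the function names and the (name, creation time) tuples are hypothetical and used only for illustration. The sketch assumes the 5x sizing rule, the policy of keeping only the newest snapshot, and the three-day lifetime of prepared download packages mentioned in the notes.

```python
from datetime import datetime, timedelta


def required_disk_gb(anticipated_data_gb, factor=5):
    """Disk space to request, using the 'at least 5 times the data size' rule."""
    return anticipated_data_gb * factor


def snapshots_to_delete(snapshots):
    """Return names of all snapshots except the most recently created one.

    snapshots: list of (name, created) tuples (hypothetical record layout).
    """
    ordered = sorted(snapshots, key=lambda s: s[1])
    return [name for name, _ in ordered[:-1]]


def expired_packages(packages, now, max_age=timedelta(days=3)):
    """Return names of download packages older than max_age.

    packages: list of (name, created) tuples (hypothetical record layout).
    """
    return [name for name, created in packages if now - created > max_age]


if __name__ == "__main__":
    # 200 GB of in vivo data at a 5x factor gives 1000 GB, i.e. about 1 TB.
    print(required_disk_gb(200))
```

With an anticipated 200 GB of in vivo data, the sizing function returns 1000 GB, which matches the 1 TB estimate in the notes. If the in vitro data of similar size were hosted on the same system, the same rule would suggest roughly double that.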

RecurringMeetings/2017-04-04-1300 (last edited 2017-04-11 13:41:24 by aerdemir)