SimTK: Snorkel: Project Home

Snorkel



Snorkel is an open-source system that generates training data for information extraction and other predictive systems.

Snorkel is an open-source system that introduces a new approach for rapidly creating, modeling, and managing training data for predictive systems. It currently focuses on accelerating the development of structured or "dark" data extraction applications in domains where large labeled training sets are unavailable or hard to obtain, such as biomedical literature and clinical notes. Initial results show that Snorkel, using weakly labeled, noisy training data, can match the performance of fully supervised learning approaches trained on “gold standard” labeled data.

Snorkel is applicable in many domains. Biomedical domains where it is being used include the microbiome, joint replacements, and cancer. To learn more or to download Snorkel, visit http://snorkel.stanford.edu.

To view recordings, slides, and other materials from the July 2017 workshop, click Downloads.

SimTK is maintained through Grant R01GM124443 01A1 from the National Institutes of Health (NIH). It was initially developed as part of the Simbios project funded by the NIH as part of the NIH Roadmap for Medical Research, Grant U54 GM072970.

