noRNAlize

Alain Laederach, Quentin Vicens
Copyright 2006, Stanford University

Introduction

This SimTK download is designed to facilitate SHAPE data normalization based on the methods described in Vicens et al. 2007, RNA structure mapping in crystals to reveal lattice contacts, RNA, submitted. To use this software, you need Matlab (from The MathWorks, http://www.mathworks.com) as well as SAFA (freely available from http://safa.stanford.edu).

SHAPE (Selective 2'-Hydroxyl Acylation analyzed by Primer Extension) probes RNA backbone flexibility and accessibility. Traditionally, RNA constructs with a 3' extension are used to normalize SHAPE data. noRNAlize performs a statistical analysis of the weak SHAPE signals (generally in helical RNA regions) and uses this information to normalize signals. This eliminates the need for the 3' extension.

noRNAlize takes SAFA output (a tab-delimited .txt file) and renormalizes the raw data based on an analysis of the weak intensities in the file. The procedure is repeated so as to also evaluate the error on the estimates. The resulting normalized data can be used to compare absolute intensities across multiple experimental conditions.

Usage

noRNAlize uses SAFA output as its input.

To run noRNAlize in Matlab:

1.) Change to the directory that contains the noRNAlize.m file
    >> cd 002_normalization-noRNAlize/input-files
2.) Run noRNAlize
    >> [data_vec_merge,data_error_merge]=noRNAlize

Note that the data is returned in two vectors (data_vec_merge and data_error_merge). The final data is written to the f2_231-122.subnormerge.txt file, which averages both gel images.

To normalize your own SHAPE data using noRNAlize:

1.) Open your .gel and .fas files in SAFA. Assign bands for one load as described in Das et al. 2005, SAFA: Semi-automated footprinting analysis software for high-throughput quantification of nucleic acid footprinting experiments, RNA, 11:344-354.
    You should obtain a .txt file containing the integrated intensities across the load (e.g. "f1_207-107.txt" for residues 107-207 assigned on the 1st load). As a tutorial, you may quantify in SAFA the "doneassign_1stload_207-107.mat" and "doneassign-2ndload-231-122.mat" files, which contain the band assignments for two loads of the same shape.gel file for the P4P6deltaC209 sequence.
2.) Repeat for as many loads per gel as needed (e.g. "f2_231-122.txt" for residues 122-231 assigned on the 2nd load).
3.) Edit the header of noRNAlize in Matlab to indicate the names and locations of the .txt files output by SAFA. Edit the lane numbers to specify which lanes you want to normalize.
4.) Run noRNAlize.

noRNAlize outputs the following files:

f1_207-107.norm.txt
f2_231-122.norm.txt
    These files contain the intensities normalized to 10% of the weakest intensities across each load. noRNAlize also prints a list of the corresponding invariant residues in the Matlab window.

f1_207-107.subback.txt
f2_231-122.subback.txt
    These files contain the intensities of the (+)NMIA lanes after subtraction of the intensities of the (-)NMIA lanes.

f1_207-107.subbacknorm.txt
f2_231-122.subbacknorm.txt
    These files contain the intensities of the subback.txt files normalized to 1, assuming that the reactivity of NMIA is the same across all lanes (5% of the intensities are excluded from the normalization and are therefore >1).

f2_231-122.subnormerge.txt
    This file contains the normalized reactivities averaged between the different loads. This file can further be merged with additional independent experiments normalized in the same way.

noRNAlize also automatically outputs Fig1-Fig5, which are graphical representations of the previous files. Fig5-xtal1.jpg shows, for example, the output of expt. #3 described in Vicens et al. 2007.

Contents

This download contains the following files.

001_integration-SAFA/input-files:
total 14760
-rw-r--r--  1 alain      191 Apr 20  2006 P4P6-deltaC209.fas
-rwxrwxrwx  1 alain  7533758 Jun 16 16:02 shape.gel

The .fas and .gel files are inputs for SAFA.

Lane labeling of the shape.gel file, corresponding to expt. #3 in the article (first load, from left to right):

 1. Sequencing lane using ddC
 2. +NMIA (standard solution)
 3. -NMIA (standard solution)
 4. +NMIA (drop solution)
 5. -NMIA (drop solution)
 6. +NMIA (crystal #1)
 7. -NMIA (crystal #1)
 8. Sequencing lane using ddC
 9. Sequencing lane using ddA
10. Sequencing lane using ddT
11. Sequencing lane using ddG
12. +NMIA (crystal #2)
13. -NMIA (crystal #2)
14. n/a
15. n/a

The second load follows the same loading pattern.

Note:
• The presence of a 2'-O NMIA adduct causes the reverse transcriptase to stop exactly one nucleotide prior to the modified residue.
• SAFA adds a column of residue numbers as the leftmost column, which shifts the numbering of the lanes by +1 after integration (e.g. lane #2 on the gel, corresponding to +NMIA (standard solution), is column #3 in the .txt file output from SAFA).

001_integration-SAFA/output-files:
total 0
drwxr-xr-x  4 alain  136 Nov 12 14:55 1st_load
drwxr-xr-x  5 alain  170 Nov 13 10:38 2nd_load

These are the SAFA .txt and .mat files that result from the gel analysis.

002_normalization-noRNAlize/input-files:
total 32
-rw-r--r--  1 alain  7493 Nov 12 16:24 noRNAlize.m

noRNAlize Matlab m-file.

002_normalization-noRNAlize/output-files:
total 440
-rw-r--r--  1 alain  141287 Nov 12 16:32 Fig5-xtal1.jpg
-rw-r--r--  1 alain   11413 Aug  4 16:03 f1_207-107.norm.txt
-rw-r--r--  1 alain    6565 Aug  4 16:03 f1_207-107.subback.txt
-rw-r--r--  1 alain    6565 Aug  4 16:03 f1_207-107.subbacknorm.txt
-rw-r--r--  1 alain   12430 Aug  4 16:03 f2_231-122.norm.txt
-rw-r--r--  1 alain    7150 Aug  4 16:03 f2_231-122.subback.txt
-rw-r--r--  1 alain    7150 Aug  4 16:04 f2_231-122.subbacknorm.txt
-rw-r--r--  1 alain   16385 Aug  4 16:04 f2_231-122.subnormerge.txt

noRNAlize output files.
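
Illustrative sketch

To make the processing steps described in this README more concrete (the +1 lane-to-column shift, the subtraction of the (-)NMIA lane, the scaling to 1, and the averaging of loads), here is a small Matlab sketch on synthetic numbers. It is not taken from noRNAlize.m and may differ from it. The tables, all variable names, and in particular the scaling rule (dividing by the largest value left after setting aside the strongest 5%) are assumptions made for illustration only.

```matlab
% Synthetic stand-ins for two SAFA .txt tables (NOT data from shape.gel).
% Column 1 = residue number added by SAFA; columns 2..4 = gel lanes 1..3.
idx   = (1:20)';
load1 = [(107:126)', zeros(20, 1), 2*idx + 1, ones(20, 1)];
load2 = [(117:136)', zeros(20, 1), 4*idx + 2, 2*ones(20, 1)];

plus_lane  = 2;            % +NMIA lane, numbered as on the gel
minus_lane = 3;            % -NMIA lane, numbered as on the gel
excluded_fraction = 0.05;  % assumption: strongest 5% set aside when scaling

loads      = {load1, load2};
norm_loads = cell(size(loads));
for k = 1:numel(loads)
    tbl = loads{k};
    % Gel lane n is stored in column n+1 because of the residue column.
    subback = tbl(:, plus_lane + 1) - tbl(:, minus_lane + 1);
    % Assumed scaling rule: divide by the largest value that remains once
    % the strongest 5% are set aside, so the excluded values end up > 1.
    sorted = sort(subback);
    n_excl = round(excluded_fraction * numel(sorted));
    n_keep = max(1, numel(sorted) - n_excl);
    norm_loads{k} = [tbl(:, 1), subback / sorted(n_keep)];
end

% Average the two loads residue by residue; residues covered by only one
% load keep the value from that load.
all_res = union(norm_loads{1}(:, 1), norm_loads{2}(:, 1));
total   = zeros(numel(all_res), 1);
count   = zeros(numel(all_res), 1);
for k = 1:numel(norm_loads)
    [~, pos]   = ismember(norm_loads{k}(:, 1), all_res);
    total(pos) = total(pos) + norm_loads{k}(:, 2);
    count(pos) = count(pos) + 1;
end
merged = [all_res, total ./ count];

disp(merged(1:3, :));      % residue number and merged value, first 3 rows
```

With these made-up tables, the strongest band of each load comes out at 20/19, slightly above 1, which mirrors the remark that the excluded intensities are >1. For real data, the lane numbers and file names are set in the header of noRNAlize.m, as described under Usage.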