This tool (rnaDB.py) requires BioPython and RNAView; installation instructions may be found (at the time of writing) here: http://www.biopython.org and here: http://ndbserver.rutgers.edu/services/download/index.html

rnaDB.py builds a database of RNA helix information through automated calls to RNAView. It then supports various queries against that database and, via the extract command, can build new pdb files that contain only the helical portions of the original pdb file, or only the non-helical portions, for any or all of the chains in that file.

A typical session starts by running the script to build the database for a set of pdb files (specified here in a text file containing a list of pdb IDs):

> python rnaDB.py --build -l my_pdbID_list.txt

Next we can ask for an inventory of the helices found by RNAView:

> python rnaDB.py --inventory

Next we can extract the helical portions of all these files into new pdb format files, named after the original pdb files and placed into a subdirectory ('rna_pdbs'):

> python rnaDB.py --extract -l my_pdbID_list.txt

The full set of available commands is as follows:

command format: python rnaDB.py [options] [...]
    MUST include either --audit, --build, --extract, --inventory or --printHelices
    : for --build, a pdb filename, or with the -l option the name of a textfile containing a list of pdb files, or a wildcard-glob for either.
    : for --extract, a pdb ID, or pdbID:chains such as 1A34:BC, or with the -l option the name of a textfile containing a list of such, or a wildcard-glob for such textfiles.

options:
    --audit: checks existing entries for helix patterns known to need special attention
    --build: new pdb entries will be added to the rnaDB
    --extract: helical (default) or non-helical RNA portions of specified pdbIDs and chains will be extracted to separate pdb files
    --db=: specifies rna_db file name, default "rna_db.dat"
    -h, --help: display this help message.
        This tool requires BioPython and RNAView; installation instructions are found here: http://www.biopython.org/docs/install/Installation.html and here: http://ndbserver.rutgers.edu/services/download/index.html
    -l : command line identifiers are textfiles containing a whitespace-separated list of identifiers to be processed
    --tmp=: specifies directory for temporary files, default is "rnaDB_tmp"
    --dest=: specifies directory for extracted pdb files, default is "rna_pdbs"
    --helices: deprecated, replaced with --intra_helices, below, for clarity.
    --intra_helices: only works with --extract, specifies that only single-stranded (intramolecular) helical portions are to be extracted
    --all_helices (default): only works with --extract, specifies that all helical portions are to be extracted
    --inter_helices: only works with --extract, specifies that only intermolecular helical portions are to be extracted
    --invert: only works with --extract, specifies extraction of the portions not meeting helical and/or other requirements
    --noDL: disables downloading of pdb files from pdb.org
    --printHelices: prints helix descriptors for helices in specified entries
    --basePairs: adds individual base-pair details to --printHelices output
    --savexml: saves xml files created during build process
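
The build-then-extract workflow described above could be scripted roughly as follows. This is only a sketch under stated assumptions: the helper names write_id_list, rnadb_command and run_workflow are hypothetical and not part of rnaDB.py; it assumes rnaDB.py sits in the current directory, that options taking a value are written as --name=value, and that options may precede -l on the command line. Only flags listed in the help text above are used.

```python
import subprocess
import sys
from pathlib import Path


def write_id_list(ids, path):
    """Write identifiers (pdb IDs or pdbID:chains) as a whitespace-separated list."""
    Path(path).write_text("\n".join(ids) + "\n")
    return str(path)


def rnadb_command(action, list_file, extra=(), script="rnaDB.py"):
    """Build the argument list for one rnaDB.py run; nothing is executed here."""
    if action not in ("--build", "--extract"):
        raise ValueError("this sketch only covers --build and --extract")
    # Assumed ordering: action first, then any extra options, then -l and the list file.
    return [sys.executable, script, action, *extra, "-l", list_file]


def run_workflow(ids, list_path="my_pdbID_list.txt", dest="rna_pdbs"):
    """Build the database, then extract intramolecular helices into dest."""
    list_file = write_id_list(ids, list_path)
    commands = (
        rnadb_command("--build", list_file),
        rnadb_command("--extract", list_file,
                      extra=["--intra_helices", f"--dest={dest}"]),
    )
    for cmd in commands:
        # Requires rnaDB.py, BioPython and RNAView to be installed and reachable.
        subprocess.run(cmd, check=True)
```

Calling run_workflow(["1A34"]) would write the list file and then run the two commands in order, stopping at the first one that exits with an error. Building the argument list separately from running it makes it easy to print and inspect a command before anything is executed.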