Dynalign from RNAstructure 4.5, released 5/2/07.

This readme accompanies Dynalign, an algorithm for
simultaneously predicting the lowest free energy RNA
secondary structure common to two sequences and the
alignment of the sequences.  It was described in
detail in: Mathews & Turner, Journal of Molecular
Biology, 317:191-203 (2002) and Mathews,
Bioinformatics, 21:2246-2253 (2005).

This version of Dynalign includes suboptimal
structure and alignment prediction.

This version also uses the accelerations described
in Uzilov, Keegan, & Mathews, BMC Bioinformatics, 
7:173 (2006) and those described in Harmanci, 
Sharma, & Mathews, BMC Bioinformatics, 8:130 (2007).

This version is capable of being run on a
multi-processor (SMP) machine for faster processing,
thanks to the work of Chris Connett, Andrew Yohn,
and Paul Tymann of the Rochester Institute of 
Technology.  A POSIX-compliant threading library 
(e.g. Pthreads) is required by this feature.

To compile Dynalign, edit the included Makefile to
indicate your C++ compiler of choice.  The default
is the GNU compiler, gcc.

Dynalign, Dynalign for SMP, and the dot plot
utility, dynalign_dotplot, can then be compiled with
"make dynalign", "make dynalign-smp", and "make
dynalign_dotplot", respectively.

Next, place the Dynalign executables and data files
(.dat files) in their final locations.  Dynalign
reads the thermodynamic data files at each execution
to access the free energy parameters.  The
environment variable $DATAPATH is used to direct the
program to the directory that contains the
parameters.  (If $DATAPATH is undefined, Dynalign
assumes that they are in the current working
directory.)  For data files in /usr/local/dynalign,
define $DATAPATH as follows.

On tcsh or csh:
setenv DATAPATH /usr/local/dynalign/
On bash or sh:
export DATAPATH=/usr/local/dynalign/

Be sure to include the final slash.  This should
probably be placed in your login script so that it
does not need to be specified every time you use
Dynalign.
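
The lookup rule above can be sketched in Python.  This
is only an illustration of the documented behavior, not
Dynalign's own code; the function name resolve_data_dir
is hypothetical, and appending a missing final slash is
a convenience of this sketch rather than something
Dynalign is known to do.

```python
import os


def resolve_data_dir(environ=None):
    """Return the directory expected to hold the .dat files.

    Mirrors the rule in this readme: use $DATAPATH when it is
    defined, otherwise fall back to the current working directory.
    """
    env = os.environ if environ is None else environ
    path = env.get("DATAPATH")
    if path is None:
        # $DATAPATH is undefined: assume the current working directory.
        path = os.getcwd()
    if not path.endswith("/"):
        # Assumption of this sketch: add the final slash if it is missing.
        path += "/"
    return path


if __name__ == "__main__":
    print("Data files expected in:", resolve_data_dir())
```

Running a check like this before starting Dynalign is
one way to confirm that the login script sets the
variable as intended.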

Sequences for input in Dynalign are assumed to be in
the following (.seq) format:

;(first line of file) Comments must start with a 
;semicolon
;There can be any number of comments
A single line title must immediately follow
AAA GCGG UUTGTT UTCUTaaTCTXXXXUCAGG1  

where the terminal 1 is required at the end of the
sequence and all whitespace is ignored in the
sequence.  Lower case nucleotides ARE NOT ALLOWED to
base pair.  X represents a nucleotide that neither
pairs nor stacks.  Two tRNA sequences are included
as examples, RD0260.seq and RA7680.seq.
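
As a rough illustration of the format rules above, the
following Python sketch reads .seq text into comments,
title, and sequence.  It is not the parser used by
Dynalign; the function names parse_seq and can_pair are
hypothetical, and treating everything after the
terminal 1 as ignorable is an assumption of this sketch.

```python
def parse_seq(text):
    """Split .seq text into (comments, title, sequence)."""
    lines = text.splitlines()
    idx = 0
    comments = []
    # Leading lines that start with a semicolon are comments.
    while idx < len(lines) and lines[idx].startswith(";"):
        comments.append(lines[idx][1:].strip())
        idx += 1
    if idx >= len(lines):
        raise ValueError("missing title line")
    # The single-line title immediately follows the comments.
    title = lines[idx].strip()
    # All whitespace in the sequence is ignored.
    body = "".join("".join(lines[idx + 1:]).split())
    end = body.find("1")
    if end == -1:
        raise ValueError("sequence must end with a terminal 1")
    return comments, title, body[:end]


def can_pair(nucleotide):
    """Lower case nucleotides and X are not allowed to base pair."""
    return nucleotide.isupper() and nucleotide != "X"


if __name__ == "__main__":
    example = (
        ";Comments must start with a semicolon\n"
        ";There can be any number of comments\n"
        "A single line title must immediately follow\n"
        "AAA GCGG UUTGTT UTCUTaaTCTXXXXUCAGG1  \n"
    )
    comments, title, seq = parse_seq(example)
    print(title)
    print(seq)
    print("".join("|" if can_pair(c) else "." for c in seq))
```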

Dynalign can be run from the command line using a
configuration file, or in an interactive mode:

dynalign [configfile]

If a single command-line argument is given, it is
read as the configuration filename.  See
ReadMe_configuration.txt for more information on the
configuration file format.

With no command-line arguments, Dynalign will
interactively prompt for the required information.

Here is a summary of required input:
inseq1 and inseq2 are the two input sequence
files. 

outct1 and outct2 are the output files for structures and
are in the connect table format.

alignment is the output file for the alignments.

The following are the optional inputs:
M is the maximum separation parameter.  This is 
largely deprecated.  We advise entering -99 so that
the alignment constraint as described by Harmanci
et al. is used.  
This version of Dynalign uses the following scheme
for M (if M is used).  For nucleotide i from
sequence 1 to align to nucleotide k in sequence 2:
| i * N2/N1 - k | <= M
where N1 is the length of sequence 1 and N2 is the
length of sequence 2.  M = 6 works for 5S rRNA and
tRNA.  A larger M may be required for
longer sequences.  
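
The inequality above can be checked with a short Python
sketch.  The function names may_align and allowed_k are
hypothetical, and 1-based nucleotide numbering is an
assumption of this sketch; it only illustrates the
formula and is not taken from the Dynalign source.

```python
def may_align(i, k, n1, n2, m):
    """True if | i * N2/N1 - k | <= M for nucleotides i and k."""
    return abs(i * n2 / n1 - k) <= m


def allowed_k(i, n1, n2, m):
    """All positions k in sequence 2 that nucleotide i may align to."""
    return [k for k in range(1, n2 + 1) if may_align(i, k, n1, n2, m)]


if __name__ == "__main__":
    # Two sequences of lengths 76 and 77 with M = 6.
    print(allowed_k(38, 76, 77, 6))
```

For nucleotide 38 of a 76-nucleotide sequence aligned
against a 77-nucleotide sequence with M = 6, i * N2/N1
is 38.5, so positions 33 through 44 of sequence 2 are
permitted.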

The gap_cost (fgap) is used to discourage the introduction
of gaps in the alignment.  0.4 kcal/mol/gap is the
recommended value.

max_percent_diff is the maximum percent difference
in free energy in the suboptimal structures.  20
(for 20%) is a good starting value.
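
As a worked example, the following Python sketch shows
one reading of this parameter: a suboptimal structure is
kept if its free energy lies within max_percent_diff
percent of the lowest free energy.  The function names
are hypothetical and the exact rule inside Dynalign may
differ, so treat this only as an illustration of the
arithmetic.

```python
def energy_cutoff(lowest, max_percent_diff):
    """Highest free energy still inside the percent window."""
    return lowest + abs(lowest) * max_percent_diff / 100.0


def within_window(energy, lowest, max_percent_diff):
    """True if energy is no higher than the cutoff."""
    return energy <= energy_cutoff(lowest, max_percent_diff)


if __name__ == "__main__":
    # Lowest free energy of -30.0 kcal/mol with a 20% window.
    print(energy_cutoff(-30.0, 20))
```

With a lowest free energy of -30.0 kcal/mol and
max_percent_diff = 20, the cutoff under this reading is
-24.0 kcal/mol.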

bp_window specifies how different the suboptimal
structures must be from each other.  2 is probably a
good starting point and smaller values will result
in more suboptimal structures.

Similarly, the align_window specifies how different
the suboptimal alignments must be from each other.
1 is a good starting point.

single_bp_inserts specifies whether single base pair
inserts are allowed; 0 = no; 1 = yes.

A savefile can be specified.  Save files are required
to generate dot plot information.

Files can be specified that contain structure and/or
alignment constraints.  These are described in
ReadMe_constraint.txt.  0 = no constraints; 1 = read
constraint file.

Local alignments can be performed by using a 
configuration file and setting local = 1.  Note that
local alignment only applies to the calculation by
Dynalign.  The alignment pre-filter always runs 
in global mode. By default, Dynalign runs in global
mode.

In the SMP version of Dynalign, num_processors
should be set to the total number of processors (or
processor cores on multi-core CPUs) available on the
system.

Dot plots are described in ReadMe_dotplot.txt.

Dynalign is also available as a user-friendly GUI
for Microsoft Windows in RNAstructure.  It can be
downloaded at http://rna.urmc.rochester.edu .
RNAstructure also works using WINE on Linux.

Note that on Macintosh OS X, there is a small
default stack limit.  To run Dynalign, the stack
limit needs to be increased.  On the default shell,
bash, use:
ulimit -s 4096
or, if you are using tcsh, use:
limit stack 4096

If you have any trouble with Dynalign, please 
email David Mathews:
David_Mathews@urmc.rochester.edu