Put an example PDB line in here somewhere to make this more concrete --Chris

Objective:
1) Add an 'R' immediately before each residue name in an RNA crystal file.
2) Append a '5' immediately after each residue name within the first residue of an RNA crystal file.
3) Append a '3' immediately after each residue name within the last residue of an RNA crystal file.

Method:
-The highest-level control structure will be a while loop that runs until the end of the file and reads each line of the crystal file into the variable 'pdbLine', overwriting the previous line. This will be done with a 'gets' statement. Every line that passes through the while loop (some of them altered) will be written to 'outputPdbFile'.
-Given the nature of this problem, it will be easier to make substitutions at the residue level rather than the atom level. This means collecting every 'pdbLine' that shares the same residue number into a higher-level variable called 'pdbBlock'.
-The command "string range $pdbLine 23 25" extracts the residue number. Tcl counts characters from zero, so this returns three characters; in the standard fixed-column PDB layout, where the residue number occupies columns 23-26, that covers residue numbers up to 999. This string command will be placed within the highest-level while/gets loop, and its output will be saved to the variable 'residueNumber'. Every line whose 'residueNumber' matches the 'residueNumber' of the preceding line will be appended to 'pdbBlock'. When a line's 'residueNumber' does NOT match the previous one, 'pdbBlock' will be cleared and the current line will be saved to 'pdbBlock'. The following matching lines will be appended as well, until the next non-matching line occurs.
-How will the program know when the next line's residue number does not match the current line's? A regexp statement containing the variable 'residueNumber' will be used to search each line. However, the regexp statement will be applied BEFORE the output of "string range $pdbLine 23 25" is saved to 'residueNumber'. Therefore, when the regexp runs, 'residueNumber' still holds the value from the line before the current line.
-The important blocks are the first one (which begins with the initial ATOM line) and the last one (which ends with a TER), because they are the only ones that require two substitutions per atom line.
-I suggest placing a do-while loop inside the highest-level while/gets loop, to be started when the first ATOM line is encountered. This loop will be terminated/restarted whenever the regexp statement mentioned above is false. The first time the loop runs (i.e. the DO), it will add an 'R' before the 18th character and a '5' after the 18th character. Eventually the loop will hit a new residue, 'pdbBlock' will be flushed out, and the loop will restart (as a WHILE). For the WHILE component of the loop, an 'R' will simply be added before every 18th character.
-When the loop reaches a TER, it will have reached the last 'pdbBlock'. I will add an "if {[regexp {TER} $pdbBlock]}" statement at the BEGINNING of the WHILE section of the do-while loop (before the subsequent statements can clear 'pdbBlock'). If true, the statement will add an 'R' before the 18th character and a '3' after the 18th character.

Example PDB line:
ATOM 58 P A A 3 -3.334 -9.556 11.282 1.00 20.02 P
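
The method above can be sketched in Tcl roughly as follows. This is an illustration under stated assumptions, not a finished implementation. Tcl has no built-in do-while, so the sketch uses a single loop with an 'isFirst' flag in its place. It also compares residue numbers with a plain string comparison rather than regexp, and it rewrites the three-column residue name field in place (so the coordinate columns do not shift) instead of inserting characters around the 18th character. It assumes the standard fixed-column PDB layout (residue name in columns 18-20, residue number in columns 23-26) and one-letter residue names such as A, C, G and U. The proc names 'renameResidue', 'relabelRna', 'relabelFile' and 'atomLine' are hypothetical.

```tcl
# Rewrite the residue name field of one ATOM line as R<name><suffix>.
# Assumption: residue name sits in columns 18-20 (Tcl indices 17-19)
# and is a single letter, so "RA5" / "RA3" / " RA" still fit in 3 columns.
proc renameResidue {line suffix} {
    set name [string trim [string range $line 17 19]]
    return [string replace $line 17 19 [format %3s "R${name}${suffix}"]]
}

# Relabel a list of PDB lines, one residue block at a time.
proc relabelRna {lines} {
    set out {}
    set pdbBlock {}      ;# ATOM lines of the residue being collected
    set prevNumber ""    ;# residue number of the previous ATOM line
    set isFirst 1        ;# 1 until the first residue of a strand is written
    foreach pdbLine $lines {
        set record [string trim [string range $pdbLine 0 5]]
        if {$record eq "ATOM"} {
            # Assumption: residue number in columns 23-26 (indices 22-25).
            set residueNumber [string trim [string range $pdbLine 22 25]]
            if {[llength $pdbBlock] > 0 && $residueNumber ne $prevNumber} {
                # Residue changed: flush the finished block.
                if {$isFirst} {set suffix 5} else {set suffix ""}
                foreach l $pdbBlock {lappend out [renameResidue $l $suffix]}
                set pdbBlock {}
                set isFirst 0
            }
            lappend pdbBlock $pdbLine
            set prevNumber $residueNumber
        } else {
            # TER closes the strand, so the pending block is the last residue.
            if {$record eq "TER"} {
                set suffix 3
            } elseif {$isFirst} {
                set suffix 5
            } else {
                set suffix ""
            }
            foreach l $pdbBlock {lappend out [renameResidue $l $suffix]}
            if {$record eq "TER"} {
                set isFirst 1
            } elseif {[llength $pdbBlock] > 0} {
                set isFirst 0
            }
            set pdbBlock {}
            lappend out $pdbLine   ;# non-ATOM lines (including TER) pass through unchanged
        }
    }
    # File ended without a TER: write what is left as an ordinary residue.
    if {$isFirst} {set suffix 5} else {set suffix ""}
    foreach l $pdbBlock {lappend out [renameResidue $l $suffix]}
    return $out
}

# File wrapper using the while/gets loop described in the method.
proc relabelFile {inPath outPath} {
    set in [open $inPath r]
    set lines {}
    while {[gets $in pdbLine] >= 0} {lappend lines $pdbLine}
    close $in
    set outputPdbFile [open $outPath w]
    puts $outputPdbFile [join [relabelRna $lines] \n]
    close $outputPdbFile
}

# Build a fixed-column ATOM line for the demo (coordinates are dummy zeros).
proc atomLine {serial atom res num} {
    return [format "ATOM  %5d %-4s %3s A%4d    %8.3f%8.3f%8.3f" \
        $serial $atom $res $num 0.0 0.0 0.0]
}

set sample [list \
    [atomLine 1 P   A 1] \
    [atomLine 2 OP1 A 1] \
    [atomLine 3 P   G 2] \
    [atomLine 4 P   U 3] \
    TER]
set result [relabelRna $sample]
puts [join $result \n]
```

With this three-residue sample, both atoms of residue 1 should come out named RA5, residue 2 as RG (right-justified in the field), and residue 3 as RU3, with the TER line passed through untouched. A strand consisting of a single residue would only receive the '3' suffix in this sketch, and residue names longer than one letter would overflow the three-column field; both cases would need a decision before the script is used on real files.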