parameters.csv file for RNA modeling
- Deepak kumar
- Posts: 47
- Joined: Thu Dec 12, 2013 9:13 am
Re: parameters.csv file for RNA modeling
Hi Sam,
I repaired residue 30 and the other geometric problems with the residues.
last.2.pdb:
https://www.dropbox.com/s/e2zqqq0ecrelzxu/last.2.pdb
This model looks fine now. Please let me know your comments. I would also like to ask some basic questions about this model (they may generalize to RNA modeling itself):
a) Regions 188-219 and 103-137 seem to have tertiary contacts, but not all of these contacts agree with the secondary structure information provided in the literature. However, the contacts present in the current model form proper secondary structure folds, so can this be considered a reliable model, supposing that the native structure is not available?
b) The RMSD between the template and the target is 5.546 (structural core). The RMSD between the target and the model of the target is 13.64 (for all 229 pairs), whereas the RMSD between the target and the model is the same 5.546 (the same as between template and target). Since this is a sanity test performed on an available structure, what is the relevance of RMSD? Does an RMSD of 13.64 mean that the model is not good?
It would be nice to have your comments.
thanks.
cheers!
Deepak
- Samuel Flores
- Posts: 189
- Joined: Mon Apr 30, 2007 1:06 pm
Re: parameters.csv file for RNA modeling
Hi Deepak,
Regarding regions 188-219 and 103-137, I am not sure I understand. Are the tertiary contacts not reported in the literature, or do they actually contradict the literature?
In the former case, that's great! If your model takes limited existing information and infers additional information, then it's predictive. Are the tertiary contacts actually in the crystal, which (as I understand) is available?
In the latter case, you will have to resolve this conflict. If the tertiary contacts come from ModeRNA they may be the result of threading to a fragment from a different molecule, which would not fold this way in the context of the Group I Intron. On the other hand, the error could be in the secondary structure prediction. You should read into how it was determined. The secondary structure of the Group I Intron is well known by now, and there should be many sequences. On the other hand, if you got it from an old paper it may be inaccurate. If you got it from mFold, that is unfortunately quite unreliable. In any case, there is no choice but to dive into the literature and determine which data you believe more.
As for question (b), I suspect you are using inconsistent terminology. Some papers call the structure produced by threading the "target," whereas I and others sometimes call this the "model." The correct structure which you are using as a gold standard but which is not the template, I would call the "experimental structure." I don't know what you mean by "model of the target." However, the RMSD between template and target should be 1Å or less -- the springs should pull the latter onto the former very well. In our PSB paper the RMSD between the model (or "target" if you wish) and the template was about 4.5Å I believe. I would say an RMSD of 13.64 is not good -- I've gotten better RMSDs even without a template (albeit on a smaller molecule). Here I would say you are not finished; you need to figure out which parts of the model are contributing most to the error, and why.
Sam
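One way to see which parts of a model contribute most to the error is to rank residues by their individual deviation from the experimental structure. The following is only a minimal Python sketch, not part of MMB or VMD. It assumes both structures are already superposed and have been read into dictionaries keyed by (residue number, atom name); the function names per_residue_rmsd and worst_residues are made up for illustration.

```python
import math
from collections import defaultdict


def per_residue_rmsd(model, reference):
    """Return {resid: RMSD} over atoms present in both structures.

    model, reference: dict mapping (resid, atom_name) -> (x, y, z).
    Assumption: the two structures have already been superposed.
    """
    squared = defaultdict(list)
    for key, (x1, y1, z1) in model.items():
        if key not in reference:
            continue  # skip atoms that have no partner in the reference
        x2, y2, z2 = reference[key]
        squared[key[0]].append((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
    return {resid: math.sqrt(sum(v) / len(v)) for resid, v in squared.items()}


def worst_residues(per_residue, top=5):
    """Return (resid, RMSD) pairs sorted by decreasing deviation."""
    ranked = sorted(per_residue.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top]
```

Residues at the top of the ranking are the first candidates to inspect by eye.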
- Deepak kumar
- Posts: 47
- Joined: Thu Dec 12, 2013 9:13 am
Re: parameters.csv file for RNA modeling
Thanks again, Sam, for the valuable information.
Here is how the tertiary contacts in the model compare with the crystal structure for regions 103-137 and 188-219:
For region 103-137: most of the tertiary contacts do not agree with the reference (crystal structure).
For region 188-219: some of the residues follow the tertiary contacts of the crystal structure, but not all of them.
Also, by "model of the target" I mean the model generated for the crystal structure that is already available. Let me explain my terminology better:
Template: in this particular case, the chain A crystal structure.
Target: the crystal structure of chain B, which is already available (as mentioned before, this is a sanity test I am performing to benchmark the modeling process).
Model / model of target: the model generated for the crystal structure (target, chain B) using the structural alignment/threading between "template" and "target". I hope this is clear.
Regarding the RMSD: the values I gave you were calculated with the program Chimera by superposing the model and the crystal structure of the target (RMSD 13.64); for "template" and "target" it is 5.546. I also have my own program to do this, but since the model is generated by different programs, its number of atoms differs from that of the crystal structure, which makes it difficult to calculate the RMSD. Do you have any suggestion on how to make a more reliable RMSD calculation? The RMSD calculation may not be accurate on my side.
Thanks
cheers!
Deepak
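When two files contain different numbers of atoms, one option is to compare only the atoms present in both, matched by residue number and atom name. A minimal Python sketch under that assumption follows. It reads standard fixed-column PDB ATOM/HETATM records, assumes the two structures are already superposed and share the same residue numbering, and uses hypothetical helper names (parse_pdb_atoms, common_atom_rmsd).

```python
import math


def parse_pdb_atoms(pdb_text, chain=None):
    """Map (resid, atom_name) -> (x, y, z) from fixed-column PDB records.

    If chain is given, only atoms with that chain ID (column 22) are kept.
    """
    atoms = {}
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")) or len(line) < 54:
            continue
        if chain is not None and line[21] != chain:
            continue
        key = (int(line[22:26]), line[12:16].strip())
        atoms[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return atoms


def common_atom_rmsd(atoms_a, atoms_b):
    """Return (RMSD, number of shared atoms) without any further fitting."""
    shared = sorted(set(atoms_a) & set(atoms_b))
    if not shared:
        raise ValueError("the two structures share no (resid, atom name) keys")
    total = 0.0
    for key in shared:
        (x1, y1, z1), (x2, y2, z2) = atoms_a[key], atoms_b[key]
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(shared)), len(shared)
```

Reporting the number of shared atoms alongside the RMSD makes it obvious when the comparison silently covers only a small part of the molecule.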
- Samuel Flores
- Posts: 189
- Joined: Mon Apr 30, 2007 1:06 pm
Re: parameters.csv file for RNA modeling
Deepak,
You cannot just make up your own terminology! Please follow that in the literature. Some authors choose different terminology from other authors, but in that case you should just pick one author and stick to his/her terminology.
The "template" is always the known structure which is used as a basis for threading. This is usually a homolog -- coming from a different organism or cellular compartment. This is chain B in your MMB runs.
The "model" or "target" is the molecule of (presumed) unknown structure. It is homologous to the template, to a greater or lesser extent. This is chain A in your MMB run.
The "experimental structure" is the experimentally observed, folded structure used as a gold standard. Same molecule as the "model" or "target." This molecule is not involved in your MMB runs. The "experimental structure" is used for validation only, and should not be referred to until the very end, if you are developing a new threading method. If you had this structure in a practical situation, you would never be doing threading.
Please rephrase your question using official terminology. Otherwise it's just too confusing.
RMSD should be calculated very carefully. Don't just use standard push-button programs blindly. If it were me, I would do it in VMD using Tcl commands, e.g. atomselect, measure fit, move, and measure rmsd. You have to be very clear about what it is you are calculating. As I say, if you are comparing only the template vs. model regions that are actually flexibly aligned, you should get RMSD < 1Å. If you align pieces that were rigid throughout your MMB run, that will probably give a higher RMSD. The model vs. experimental structure RMSD should be higher than 1Å, but 13Å sounds way too high.
Sam
I don't think there is any threading paper in which the template is referred to as the "target."
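For readers without VMD, the fit-then-measure idea described above can be approximated in plain Python. The sketch below is an illustration, not what VMD does internally: it computes the RMSD after optimal rigid superposition using Horn's quaternion formulation, with a simple shifted power iteration to find the largest eigenvalue. It assumes the two coordinate lists have the same length and the same atom order; superposed_rmsd is a hypothetical name.

```python
import math


def _centered(coords):
    """Translate coordinates so that their centroid is at the origin."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


def superposed_rmsd(mobile, reference, iterations=500):
    """Minimum RMSD over all rigid-body fits of mobile onto reference."""
    if not mobile or len(mobile) != len(reference):
        raise ValueError("need two non-empty coordinate lists of equal length")
    a, b = _centered(mobile), _centered(reference)
    # Cross-covariance: s[i][j] = sum over atoms of a_i * b_j
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)] for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    # Horn's symmetric 4x4 matrix; its largest eigenvalue gives the best fit
    n_mat = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Gershgorin bound: adding it to the diagonal makes every eigenvalue >= 0,
    # so power iteration converges to the largest eigenvalue of n_mat.
    shift = max(sum(abs(v) for v in row) for row in n_mat)
    lam = 0.0
    if shift > 0.0:
        # Arbitrary start vector; a start exactly orthogonal to the answer
        # would converge slowly, which this sketch does not guard against.
        vec = [1.0, 0.3, 0.2, 0.1]
        for _ in range(iterations):
            nxt = [sum(n_mat[r][c] * vec[c] for c in range(4)) + shift * vec[r]
                   for r in range(4)]
            norm = math.sqrt(sum(v * v for v in nxt))
            vec = [v / norm for v in nxt]
        # Rayleigh quotient of the unshifted matrix
        lam = sum(vec[r] * sum(n_mat[r][c] * vec[c] for c in range(4))
                  for r in range(4))
    ga = sum(x * x + y * y + z * z for x, y, z in a)
    gb = sum(x * x + y * y + z * z for x, y, z in b)
    return math.sqrt(max(0.0, (ga + gb - 2.0 * lam) / len(a)))
```

The function returns only the RMSD value, not the transformation, so it is suited to checking numbers reported by other programs rather than to producing aligned coordinate files.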
- Deepak kumar
- Posts: 47
- Joined: Thu Dec 12, 2013 9:13 am
Re: parameters.csv file for RNA modeling
Hi Sam,
Thank you for giving me the right guidance on this issue too.
I now have the RMSD calculations, following these commands in VMD:
a) set sel1 [atomselect template "backbone and resid 1 to 50"]
b) set sel2 [atomselect model "backbone and resid 1 to 50"]
c) set tmat [measure fit $sel1 $sel2]
d) set movsel [atomselect template "all"]
e) $movsel move $tmat
f) measure rmsd $sel1 $sel2
template and model : 2.84
https://www.dropbox.com/s/blyc475fgmdi1 ... ligned.pdb
model and reference crystal structure : 6.17
https://www.dropbox.com/s/65x9mla5ew7qa ... ligned.pdb
Could you please have a look and give your comments? I think the RMSDs are now correct and reasonable.
thanks!
cheers!
- Samuel Flores
- Posts: 189
- Joined: Mon Apr 30, 2007 1:06 pm
Re: parameters.csv file for RNA modeling
This is exactly the correct way to calculate RMSD.
2.84Å is starting to sound better, for the template vs. model RMSD. I think it's not lower, only because you did the local threading using ModeRNA, which uses many templates. This might be OK; locally non-optimal does not necessarily add up to globally bad.
Now you just need to replace "resid 1 to 50" with the full set of residues for which you want to compute RMSD. You might want to compute them for several sets of residues. This could be three groups: all residues which were flexibly aligned, all residues which were flexibly or rigidly aligned, and all residues which have correspondence with each other, according to your sequence alignment. I expect these would give progressively increasing RMSDs (unless the second and third sets are identical). If there is a section of your model for which you have insufficient information, you might want to calculate separately a fourth RMSD that leaves out this section.
Sam
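The comparison over several residue sets can be scripted. This Python sketch assumes superposed coordinates stored in dictionaries keyed by (residue number, atom name). The function names are hypothetical, and any set names or residue ranges passed in are up to the user; none are taken from this model.

```python
import math


def rmsd_for_residues(model, reference, resids):
    """RMSD over shared atoms whose residue number is in resids.

    Returns None when no atom of the requested residues is shared.
    Assumption: the structures have already been superposed.
    """
    wanted = set(resids)
    squared = []
    for key, (x1, y1, z1) in model.items():
        if key[0] in wanted and key in reference:
            x2, y2, z2 = reference[key]
            squared.append((x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2)
    if not squared:
        return None
    return math.sqrt(sum(squared) / len(squared))


def rmsd_report(model, reference, residue_sets):
    """Compute one RMSD per named residue set, e.g. flexible vs. all aligned."""
    return {name: rmsd_for_residues(model, reference, resids)
            for name, resids in residue_sets.items()}
```

A call such as rmsd_report(model, reference, {"flexible": range(1, 51), "all_aligned": range(1, 230)}) would return one value per set; those ranges are placeholders only.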
- Samuel Flores
- Posts: 189
- Joined: Mon Apr 30, 2007 1:06 pm
Re: parameters.csv file for RNA modeling
By the way, I didn't actually look at your structures. In this capacity I'm just providing technical, not so much scientific, support.
Sam
- Samuel Flores
- Posts: 189
- Joined: Mon Apr 30, 2007 1:06 pm
Re: parameters.csv file for RNA modeling
How is this going?
I took a quick look but the two structures you provide are identical to each other. There is only one RNA chain in each.
- Deepak kumar
- Posts: 47
- Joined: Thu Dec 12, 2013 9:13 am
Re: parameters.csv file for RNA modeling
Hi Sam
At present I am working on other models along with the model whose results I sent you in my last message. Since the chain A model obtained from rnabuilder is very close to good, I have taken this model forward to simulation. I have a program that performs RNA simulation based on secondary structure constraints. For region 103-137 I have applied the secondary structure constraints to the model obtained from rnabuilder (because this is the region causing most of the problems) and started the simulation. It is running now; I will send you the results for your comments once it has finished.
I would like to ask you:
Is it possible to provide the secondary structure constraints (as baseInteractions) for region 103-137 in rnabuilder and let it simulate (keeping the constraints) to see whether it can find the lowest-energy conformation, which could be close to the crystal structure? If that is possible, it would be easy to get an accurate model.
The structures I sent you in my last message:
template and model: RMSD 2.84. This file contains the model aligned with the template (chain A is the model, chain B is the template).
https://www.dropbox.com/s/blyc475fgmdi1 ... ligned.pdb
model and reference crystal structure: RMSD 6.17. This file contains the model aligned with the crystal structure (chain A model and chain A crystal structure). Because both chains had the same ID they appeared to be the same structure, so I have now renamed the chain of the crystal structure:
chain A is the model
chain Z is the crystal structure.
https://www.dropbox.com/s/65x9mla5ew7qa ... ligned.pdb
Please have a look.
Thanks.
cheers!
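Renaming a chain so that two copies of the same molecule can be told apart in one file can be done with a few lines of Python. This is a sketch only, assuming standard fixed-column PDB records in which the chain ID sits in column 22; rename_chain is a hypothetical helper, not a function of rnabuilder or VMD.

```python
def rename_chain(pdb_text, old, new):
    """Return pdb_text with chain ID old replaced by new in coordinate records."""
    if len(old) != 1 or len(new) != 1:
        raise ValueError("PDB chain IDs are single characters")
    records = ("ATOM", "HETATM", "ANISOU", "TER")
    out = []
    for line in pdb_text.splitlines():
        # Column 22 (index 21) holds the chain ID in these record types
        if line.startswith(records) and len(line) > 21 and line[21] == old:
            line = line[:21] + new + line[22:]
        out.append(line)
    return "\n".join(out)
```

Applied to the crystal structure before it is merged with the model, this yields a file in which, for example, chain A is the model and chain Z is the crystal structure.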
- Samuel Flores
- Posts: 189
- Joined: Mon Apr 30, 2007 1:06 pm
Re: parameters.csv file for RNA modeling
Quoting:
"I would like to ask you that : Is there a possibility to provide the secondary structure constraints (as baseInteractions) for the region 103-137 in rnabuilder and let it simulate (keeping the constraints) to see if it could find the conformation with lowest energy that could be close to crystal structure? If that is possible, it would be easy to get an accurate model."
I don't know what you mean by "simulate". An MMB run computes the equations of motion for a given macromolecular system, with given forces, constraints, and flexibility. So yes, you can turn on the baseInteraction's as you have done so far. Yes, you can increase the flexibility if you want to change a certain region, again as you have done. You can even turn on the MD force field in a limited region (see Reference Guide). However I don't think that will automatically lead to a structure which is closer to the reference structure. That depends on the appropriateness of the forces and constraints you impose. Remember, this is not a de novo structure prediction code. It's just a way to combine molecules, forces, and constraints to help you solve a structural problem.
Sam