6.3 Error - Spearate indices parsed into a single index
- George Pantelopulos
- Posts: 64
- Joined: Mon Jun 01, 2015 2:15 pm
6.3 Error - Spearate indices parsed into a single index
Hi all,
I'm trying to use GMX gro and topology (+itp) files generated by CHARMM-GUI for lipid bilayer simulations in Openmm 6.3 and I'm getting a remarkably bizarre error in which all of the separate residues in the gro file are being strung together into a single residue in the topology, with each separate residue becoming a chain, after being read in by Openmm.
Here you can see that, after minimizing the system, the water molecules have seemingly been treated as one long chain: To give a sample of what is contained in the .gro file generated by CHARMM-GUI:
56SSM C18F34101 8.650 1.140 3.797
56SSM H18F34102 8.671 1.244 3.829
56SSM H18G34103 8.660 1.096 3.898
56SSM H18H34104 8.552 1.105 3.764
1TIP3 OH234105 4.591 4.639 5.920
1TIP3 H134106 4.672 4.672 5.959
1TIP3 H234107 4.531 4.713 5.927
And the PDB that openmm writes...
ATOM 34380 C18F SSM T 1 83.291 12.590 35.141 1.00 0.00 C
ATOM 34381 H18F SSM T 1 83.539 13.669 35.233 1.00 0.00 H
ATOM 34382 H18G SSM T 1 83.092 12.192 36.159 1.00 0.00 H
ATOM 34383 H18H SSM T 1 82.350 12.450 34.567 1.00 0.00 H
TER 34384 SSM T 1
ATOM 34385 O HOH U 1 40.593 45.727 62.474 1.00 0.00 O
ATOM 34386 H1 HOH U 1 50.369 53.587 58.493 1.00 0.00 H
ATOM 34387 H2 HOH U 1 50.388 53.569 58.527 1.00 0.00 H
TER 34388 HOH U 1
And despite these TER arguments, all atoms in here seem to be treated as a single molecule regardless, as suggested by all atoms having the same residue index.
The molecules in the GMX topology file:
[ molecules ]
; Compound #mols
CHL1 56
POPC 168
SSM 56
TIP3 9258
SOD 30
CLA 30
Where TIP3 is TIP3P water, SOD is sodium, CLA is chloride.
Thank you for any assistance, this is very odd!
George
I'm trying to use GMX gro and topology (+itp) files generated by CHARMM-GUI for lipid bilayer simulations in Openmm 6.3 and I'm getting a remarkably bizarre error in which all of the separate residues in the gro file are being strung together into a single residue in the topology, with each separate residue becoming a chain, after being read in by Openmm.
Here you can see that, after minimizing the system, the water molecules have seemingly been treated as one long chain: To give a sample of what is contained in the .gro file generated by CHARMM-GUI:
56SSM C18F34101 8.650 1.140 3.797
56SSM H18F34102 8.671 1.244 3.829
56SSM H18G34103 8.660 1.096 3.898
56SSM H18H34104 8.552 1.105 3.764
1TIP3 OH234105 4.591 4.639 5.920
1TIP3 H134106 4.672 4.672 5.959
1TIP3 H234107 4.531 4.713 5.927
And the PDB that openmm writes...
ATOM 34380 C18F SSM T 1 83.291 12.590 35.141 1.00 0.00 C
ATOM 34381 H18F SSM T 1 83.539 13.669 35.233 1.00 0.00 H
ATOM 34382 H18G SSM T 1 83.092 12.192 36.159 1.00 0.00 H
ATOM 34383 H18H SSM T 1 82.350 12.450 34.567 1.00 0.00 H
TER 34384 SSM T 1
ATOM 34385 O HOH U 1 40.593 45.727 62.474 1.00 0.00 O
ATOM 34386 H1 HOH U 1 50.369 53.587 58.493 1.00 0.00 H
ATOM 34387 H2 HOH U 1 50.388 53.569 58.527 1.00 0.00 H
TER 34388 HOH U 1
And despite these TER arguments, all atoms in here seem to be treated as a single molecule regardless, as suggested by all atoms having the same residue index.
The molecules in the GMX topology file:
[ molecules ]
; Compound #mols
CHL1 56
POPC 168
SSM 56
TIP3 9258
SOD 30
CLA 30
Where TIP3 is TIP3P water, SOD is sodium, CLA is chloride.
Thank you for any assistance, this is very odd!
George
- Peter Eastman
- Posts: 2588
- Joined: Thu Aug 09, 2007 1:25 pm
Re: 6.3 Error - Spearate indices parsed into a single index
Hi George,
I'm not sure exactly what problem you're describing. Are you just referring to the fact that the waters all have residue ID 1? Residue IDs are specific to a particular chain, not global. So two different chains can each have a residue 1.
Peter
I'm not sure exactly what problem you're describing. Are you just referring to the fact that the waters all have residue ID 1? Residue IDs are specific to a particular chain, not global. So two different chains can each have a residue 1.
Peter
- George Pantelopulos
- Posts: 64
- Joined: Mon Jun 01, 2015 2:15 pm
Re: 6.3 Error - Spearate indices parsed into a single index
Hi Peter,
The GRO file which is being read by GromacsGroFile contains many molecules, particularly molecules and lipids, of different residue indices.
GromacsGroFile seems to be interpreting all atoms everything as belonging to a single residue index. If everything had a qnique chain identifier this might be OK... But it seems that these chain identifiers are being repeated while all atoms are in the same residue index. And so it appears that many atoms that should be separated are being treated as one molecule despite being separated by a TER argument. At least, this may explain why, during minimization, it appears that atoms within each chain of the same identifier are being linked together to form the odd structure I posted above.
Taking a couple slices out of the starting gro file...
...
17POPC H16X 6420 2.853 4.595 3.731
17POPC H16Y 6421 2.744 4.731 3.698
17POPC H16Z 6422 2.754 4.573 3.612
...
1TIP3 OH234105 4.591 4.639 5.920
1TIP3 H134106 4.672 4.672 5.959
1TIP3 H234107 4.531 4.713 5.927
2TIP3 OH234108 5.567 6.638 6.148
2TIP3 H134109 5.554 6.732 6.165
2TIP3 H234110 5.612 6.635 6.063
3TIP3 OH234111 5.751 6.762 5.957
3TIP3 H134112 5.765 6.838 5.898
3TIP3 H234113 5.766 6.802 6.044
And a few slices out of the PDB file that is written out by Openmm based on the GROMACS topology and simulation coordinates...
...
ATOM 6492 H16X POP U 1 28.979 44.815 37.326 1.00 0.00 H
ATOM 6493 H16Y POP U 1 27.949 45.743 36.227 1.00 0.00 H
ATOM 6494 H16Z POP U 1 28.397 44.072 35.757 1.00 0.00 H
TER 6495 POP U 1
...
ATOM 34385 O HOH U 1 40.593 45.727 62.474 1.00 0.00 O
ATOM 34386 H1 HOH U 1 50.369 53.587 58.493 1.00 0.00 H
ATOM 34387 H2 HOH U 1 50.388 53.569 58.527 1.00 0.00 H
TER 34388 HOH U 1
...
ATOM 34697 O HOH U 1 52.443 81.380 20.454 1.00 0.00 O
ATOM 34698 H1 HOH U 1 59.447 81.146 21.597 1.00 0.00 H
ATOM 34699 H2 HOH U 1 59.554 81.147 21.568 1.00 0.00 H
TER 34700 HOH U 1
Is this clear? Any idea of how to approach this?
The GRO file which is being read by GromacsGroFile contains many molecules, particularly molecules and lipids, of different residue indices.
GromacsGroFile seems to be interpreting all atoms everything as belonging to a single residue index. If everything had a qnique chain identifier this might be OK... But it seems that these chain identifiers are being repeated while all atoms are in the same residue index. And so it appears that many atoms that should be separated are being treated as one molecule despite being separated by a TER argument. At least, this may explain why, during minimization, it appears that atoms within each chain of the same identifier are being linked together to form the odd structure I posted above.
Taking a couple slices out of the starting gro file...
...
17POPC H16X 6420 2.853 4.595 3.731
17POPC H16Y 6421 2.744 4.731 3.698
17POPC H16Z 6422 2.754 4.573 3.612
...
1TIP3 OH234105 4.591 4.639 5.920
1TIP3 H134106 4.672 4.672 5.959
1TIP3 H234107 4.531 4.713 5.927
2TIP3 OH234108 5.567 6.638 6.148
2TIP3 H134109 5.554 6.732 6.165
2TIP3 H234110 5.612 6.635 6.063
3TIP3 OH234111 5.751 6.762 5.957
3TIP3 H134112 5.765 6.838 5.898
3TIP3 H234113 5.766 6.802 6.044
And a few slices out of the PDB file that is written out by Openmm based on the GROMACS topology and simulation coordinates...
...
ATOM 6492 H16X POP U 1 28.979 44.815 37.326 1.00 0.00 H
ATOM 6493 H16Y POP U 1 27.949 45.743 36.227 1.00 0.00 H
ATOM 6494 H16Z POP U 1 28.397 44.072 35.757 1.00 0.00 H
TER 6495 POP U 1
...
ATOM 34385 O HOH U 1 40.593 45.727 62.474 1.00 0.00 O
ATOM 34386 H1 HOH U 1 50.369 53.587 58.493 1.00 0.00 H
ATOM 34387 H2 HOH U 1 50.388 53.569 58.527 1.00 0.00 H
TER 34388 HOH U 1
...
ATOM 34697 O HOH U 1 52.443 81.380 20.454 1.00 0.00 O
ATOM 34698 H1 HOH U 1 59.447 81.146 21.597 1.00 0.00 H
ATOM 34699 H2 HOH U 1 59.554 81.147 21.568 1.00 0.00 H
TER 34700 HOH U 1
Is this clear? Any idea of how to approach this?
- Peter Eastman
- Posts: 2588
- Joined: Thu Aug 09, 2007 1:25 pm
Re: 6.3 Error - Spearate indices parsed into a single index
It's hard to be sure the forum hasn't mangled the formatting, but it looks to me like your GRO file only has four columns for the residue number. Could you check that? There are supposed to be five columns (so an additional space at the start of each line). See http://manual.gromacs.org/current/online/gro.html.
If that's not the problem, could post the actual file so I can take a look at it?
Peter
If that's not the problem, could post the actual file so I can take a look at it?
Peter
- George Pantelopulos
- Posts: 64
- Joined: Mon Jun 01, 2015 2:15 pm
Re: 6.3 Error - Spearate indices parsed into a single index
Ahh, maybe that has something to do with it...
The gro file describes a somewhat large system of 61938 particles.
This GRO file has 5 columns for atom numbers, which is standard according to the gromacs manual. This is not necessarily mangled... but it leaves no space between the atom name and the atom number which I can imagine could open up some sort of can of worms with parsers.
I've attached the gro file to this message. Thanks again for your help and your patience!
The gro file describes a somewhat large system of 61938 particles.
This GRO file has 5 columns for atom numbers, which is standard according to the gromacs manual. This is not necessarily mangled... but it leaves no space between the atom name and the atom number which I can imagine could open up some sort of can of worms with parsers.
I've attached the gro file to this message. Thanks again for your help and your patience!
- Attachments
-
- step5_charmm2gmx.gro.txt
- Input gro file
- (2.66 MiB) Downloaded 27 times
- Peter Eastman
- Posts: 2588
- Joined: Thu Aug 09, 2007 1:25 pm
Re: 6.3 Error - Spearate indices parsed into a single index
Could you post the top file too? That's what it builds the Topology from.
Peter
Peter
- George Pantelopulos
- Posts: 64
- Joined: Mon Jun 01, 2015 2:15 pm
Re: 6.3 Error - Spearate indices parsed into a single index
Yep! The .itp files are within as well.
- Peter Eastman
- Posts: 2588
- Joined: Thu Aug 09, 2007 1:25 pm
Re: 6.3 Error - Spearate indices parsed into a single index
When I try to load it, I get this error:
It looks like that file is missing?
Peter
Code: Select all
ValueError: Could not locate #include file: SOD.itp
Peter
- George Pantelopulos
- Posts: 64
- Joined: Mon Jun 01, 2015 2:15 pm
Re: 6.3 Error - Spearate indices parsed into a single index
Oh my apologies! That itp is with the others now.
- Attachments
-
- topology_files.tar
- (185 KiB) Downloaded 23 times
- Peter Eastman
- Posts: 2588
- Joined: Thu Aug 09, 2007 1:25 pm
Re: 6.3 Error - Spearate indices parsed into a single index
Hi George,
I tested your files, and they seem to be working correctly. In this case, "correctly" means, "As correctly as can be expected under the circumstances."
First, I loaded both files:
The Topology is created based on the TOP file. The GRO file is used only for coordinates. In this case, the Topology includes 9598 chains:
That matches the number of molecules defined in the TOP file:
I then had it print out a PDB file:
Unfortunately, the PDB format is terribly outdated and not at all designed to be used in the ways people use it today. When writing a PDB file, you're supposed to distinguish the chains in two different ways: by giving a unique identifier to each one, and by including a TER record at the end of each one. But there's only a single column for the identifier! When the format was first defined, they were thinking about multiple protein chains in a complex, so the number of them would always be very small. They weren't thinking of including thousands of molecules in a single PDB file. If each chain identifier is only a single letter, there's no way to give a unique identifier to each of thousands of molecules. So OpenMM does the best it can, and just keeps looping through the alphabet:
The much newer PDBx/mmCIF format fixes this problem by allowing an arbitrary length identifier for each chain. So if you write that format instead, OpenMM can give an actual unique identifier to each chain.
Peter
I tested your files, and they seem to be working correctly. In this case, "correctly" means, "As correctly as can be expected under the circumstances."
First, I loaded both files:
Code: Select all
>>> top = GromacsTopFile('topol.top.txt')
>>> gro = GromacsGroFile('step5_charmm2gmx.gro.txt')
Code: Select all
>>> len(list(top.topology.chains()))
9598
Code: Select all
[ molecules ]
; Compound #mols
CHL1 56
POPC 168
SSM 56
TIP3 9258
SOD 30
CLA 30
Code: Select all
>>> PDBFile.writeFile(top.topology, gro.positions)
Code: Select all
ATOM 53017 O HOH Y 1 53.730 79.270 75.080 1.00 0.00 O
ATOM 53018 H1 HOH Y 1 54.320 78.810 75.690 1.00 0.00 H
ATOM 53019 H2 HOH Y 1 52.870 78.820 75.190 1.00 0.00 H
TER 53020 HOH Y 1
ATOM 53021 O HOH Z 1 55.770 78.100 76.610 1.00 0.00 O
ATOM 53022 H1 HOH Z 1 56.180 77.400 76.050 1.00 0.00 H
ATOM 53023 H2 HOH Z 1 56.500 78.750 76.670 1.00 0.00 H
TER 53024 HOH Z 1
ATOM 53025 O HOH A 1 39.340 76.140 77.660 1.00 0.00 O
ATOM 53026 H1 HOH A 1 39.830 76.750 77.080 1.00 0.00 H
ATOM 53027 H2 HOH A 1 38.750 76.750 78.110 1.00 0.00 H
TER 53028 HOH A 1
ATOM 53029 O HOH B 1 41.190 78.090 76.090 1.00 0.00 O
ATOM 53030 H1 HOH B 1 41.700 77.410 75.610 1.00 0.00 H
ATOM 53031 H2 HOH B 1 41.820 78.370 76.780 1.00 0.00 H
TER 53032 HOH B 1
Peter