-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CHARMM files parsing and system building #23
Conversation
PeriodicBoxVectors() is already set set for CHARMM when building the system so it does not need to be set again in line 269.
Hi @MiaoLab20 @dyelar @lvotapka do you have any update on this issue? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameters attribute on line 128 of config.py should probably be kept a list due to the for loop on line 135, which assumes a loop, not a string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is the loop over the parameters attribute being removed? We want the users to be able to submit multiple parameters files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar issue on line 189 of the proposed changes for parser.py: the possibility of multiple parameter files is being disallowed by making this change.
Ideally, I think that if we are going to be setting the value of switchDistance and ewaldErrorTolerance in gamdSimulation.py (lines 165-166), we would want to be able set the default values in config.py, expose an xml value for changing it from the default, and then use that config value in those places. I'm still looking into why when I tried running the simulations with this code and the files provided in issue #22 , the simulations blew up in stage 3. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to make this a bit easier rather than just my reference in my previous comment. I didn't replicate the changes Lane already requested, since I have no additional input to his request.
gamd/gamdSimulation.py
Outdated
return psf | ||
|
||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this code should reside in the parser.py file, since you are reading and parsing these configuration files. They would set the appropriate values within a configuration object, if they existed to override whatever the defaults are. Reading and parsing configuration files doesn't really belong within with gamdSimulation.py. It should read a series of configuration values, which would then determine how to set up simulation. It's how we have the separation of concerns setup to keep the code manageable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. I moved: 1) parameter file reading and parsing and 2) box vector parsing to parser.py file. I left in gamdSimulation.py only the code necessary to apply the box vectors to the psf and build the system.
gamdSimulation.simulation.context.setPositions(positions.positions) | ||
if box_vectors is not None: | ||
if box_vectors is not None and config.input_files.charmm is None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm don't I understand why this change was made yet . The box_vectors variable should either contain the necessary parameters, or it should be None indicating it doesn't need to be set. Can you reply or add a comment why we also need to check that the charm input_files is None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CHARMM systems can only be built by psf.createSystem() if box vectors are assigned beforehand to the psf object (hence the code snippet at line 156) . Thus, this update of box_vectors here is not needed for CHARMM systems and might create issues setting it twice.
gamd/parser.py
Outdated
for xml_params_filename in charmm_tag: | ||
charmm_config.parameters.append( | ||
assign_tag(xml_params_filename, str)) | ||
# parsing list of parameter files like CHARMM-GUI does |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lvotapka note that the parameter parsing used here loops over all parameter files listed in a "toppar.str" file as defined by the user. Multiple parameter files are accepted.
I believe this approach is preferable because: 1) it is commonly used in CHARMM and is compatible with CHARMM-GUI inputs for OpenMM; and 2) CHARMM commonly used +10 parameter files and listing all of them in the XML input seems unnecessary to me.
I updated the code reflect the requests you made. Particularly, parsing of parameter files and box vectors were moved to parser.py file. I am not very well versed in XML so if something can be improved, please let me know. |
…e to parse toppar files like CHARMM-GUI
@dyelar and @lvotapka , I pushed the new code reflecting the changes you requested via email. More specifically:
Users can now point directly to the files they want or they can provide a .str file containing a list of toppar files (the latter approach is used by CHARMM-GUI inputs for OpenMM, so users can take advantage of that).
While box lengths (a, b,c) are required, box angles are optional and their default value is 90 degress (rectangular box). Let me know if anything else should be fixed. |
Are we ready to accept this pull request, @dyelar ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found something I thought was important and another area for a quick clean up change. Let me know what you think.
gamd/parser.py
Outdated
if xml_params_filename.attrib["type"] == "charmm-gui-toppar": | ||
# parsing list of parameter files like CHARMM-GUI does | ||
extlist = ['rtf', 'prm', 'str'] | ||
parFiles = () | ||
for line in open(xml_params_filename.text, 'r'): | ||
if '!' in line: line = line.split('!')[0] | ||
parfile = line.strip() | ||
if len(parfile) != 0: | ||
ext = parfile.lower().split('.')[-1] | ||
if not ext in extlist: continue | ||
charmm_config.parameters.append(parfile) | ||
else: | ||
charmm_config.parameters.append( | ||
assign_tag(xml_params_filename, str)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you extract this into it's own method. Something like parse_and_assign_charmm_gui_toppar_file and call it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify a bit more? I could create a function on parser.py or move this to config.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I was thinking just add another method. Something like the following:
def parse_and_assign_charmm_gui_toppar_file(self, charmm_config, xml_params_filename):
extlist = ['rtf', 'prm', 'str']
parFiles = ()
for line in open(xml_params_filename.text, 'r'):
if '!' in line: line = line.split('!')[0]
parfile = line.strip()
if len(parfile) != 0:
ext = parfile.lower().split('.')[-1]
if not ext in extlist: continue
charmm_config.parameters.append(parfile)
return charmm_config
Then on line 232, just replace the code with
charmm_config = self.parse_and_assign_charmm_gui_toppar_file(charmm_config, xml_params_filename)
The thought being to try and have the methods be smaller and limited to doing a single thing as much as possible, so that they don't get to large.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. I thought it was better to make that parsing function a method inside CharmmConfig class (config.py).
So on line 233 I am just calling:
charmm_config.parse_charmm_gui_toppar(xml_params_filename)
Which calls the method (config.py, line 147):
def parse_charmm_gui_toppar(self, xml_params_filename):
extlist = ['rtf', 'prm', 'str']
# inputfile is the toppar.str with the list of parameter files
for line in open(xml_params_filename.text, 'r'):
if '!' in line: line = line.split('!')[0]
parfile = line.strip()
if len(parfile) != 0:
ext = parfile.lower().split('.')[-1]
if not ext in extlist: continue
# Appending each file listed in inputfile to the existing
# list of parameters files to be read
self.parameters.append(parfile)
return
I wanted to make sure users can use both parsing methods in the same input (directly pointing to parameter files or listing them in a toppar.str file). I think having all methods in config.py is probably cleaner.
gamd/gamdSimulation.py
Outdated
boxinfo = config.input_files.charmm.box_vectors | ||
boxlx = boxinfo["a"] | ||
boxly = boxinfo["b"] | ||
boxlz = boxinfo["c"] | ||
alpha = boxinfo["alpha"] | ||
beta = boxinfo["beta"] | ||
gamma = boxinfo["gamma"] | ||
psf.setBox(boxlx, boxly, boxlz, alpha, beta, gamma) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this code makes an assumption that tags for a, b, and c exist. Based on what I read in the parser, it is quite possible for them to not exist and we are just using a default. I assume that means that setBox wouldn't be called at all?
I was thinking that the best way to handle this might be to add the following pieces of code:
in config.py in the CharmmConfig, add the following methods and variables:
self.config_box_vector_defined = False
def get_box_vectors(self):
errorMessage = "The box vectors were only partially defined. " \
"At least a, b, and c must be present."
if not self.is_whole_box_defined():
raise RuntimeError(errorMessage)
return [self.box_vectors["a"], self.box_vectors["b"],
self.box_vectors["c"],
self.alpha, self.beta, self.gamma]
def is_whole_box_defined(self):
lengths = ["a", "b", "c"]
result = True
for length in lengths:
result = result and length in self.box_vectors
return result
def is_config_box_vector_defined(self):
return self.config_box_vector_defined
in parser.py, add the following once we are in the box-vector tag:
self.config_box_vector_defined = True
Then, right here, we should be able to just change it to something like the following:
if config.input_files.charmm.is_config_box_vector_defined:
psf.setBox(*config.input_files.charmm.get_box_vectors())
This way, if they user doesn't define a box vector, we aren't making the call. Of course, if the box vector is required for psf files, then we should throw an error instead if they didn't define it.
FYI, I have no idea if I got that code exactly correct for the calling convention, but hopefully it gets the idea across. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dyelar the box lengths are required to create the system object with psf files, so they must be defined by the user. Currently, we throw an error if any of the lengths "a", "b" or "c" are not provided (parser.py, line 222). We do, however, make an assumption that the box angles are all 90 degrees if users do not provide alpha, beta or gamma. I can either raise an error or I can raise a warning on that matter. What do you prefer?
If there is some sane default that we can set, then we should do that and
use it. If no sane default exists, it is required, and not provided, then
we should generate an error.
…On Wed, Oct 19, 2022 at 9:08 AM Marcelo D. Polêto ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In gamd/gamdSimulation.py
<#23 (comment)>:
> + boxinfo = config.input_files.charmm.box_vectors
+ boxlx = boxinfo["a"]
+ boxly = boxinfo["b"]
+ boxlz = boxinfo["c"]
+ alpha = boxinfo["alpha"]
+ beta = boxinfo["beta"]
+ gamma = boxinfo["gamma"]
+ psf.setBox(boxlx, boxly, boxlz, alpha, beta, gamma)
@dyelar <https://github.com/dyelar> the box lengths are required to
create the system object with psf files, so they must be defined by the
user. Currently, we throw an error if any of the lengths "a", "b" or "c"
are not provided (parser.py, line 222). We do, however, make an assumption
that the box angles are all 90 degrees if users do not provide alpha, beta
or gamma. I can either raise an error or I can raise a warning on that
matter. What do you prefer?
—
Reply to this email directly, view it on GitHub
<#23 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABALA2NNZEZ3TWAKBANFOOTWD76EFANCNFSM6AAAAAAQCT4CLQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
The primary reason to have it in parser.py is that you are parsing the file
and setting the config values. That's what we are doing in parser.py.
Config is meant to hold values and handle serializing it as output. We do
this, so that in the future, we will be able to handle different input file
formats (not XML, other parameter types, etc.) with the code change being
localized into one area. Having parsing code in config.py means that
someone will have to look in more than one area for getting parsing
information for configuration files.
…On Wed, Oct 19, 2022 at 1:16 PM Marcelo D. Polêto ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In gamd/parser.py
<#23 (comment)>:
> + if xml_params_filename.attrib["type"] == "charmm-gui-toppar":
+ # parsing list of parameter files like CHARMM-GUI does
+ extlist = ['rtf', 'prm', 'str']
+ parFiles = ()
+ for line in open(xml_params_filename.text, 'r'):
+ if '!' in line: line = line.split('!')[0]
+ parfile = line.strip()
+ if len(parfile) != 0:
+ ext = parfile.lower().split('.')[-1]
+ if not ext in extlist: continue
+ charmm_config.parameters.append(parfile)
+ else:
+ charmm_config.parameters.append(
+ assign_tag(xml_params_filename, str))
Thank you. I thought it was better to make that parsing function a method
inside CharmmConfig class (config.py).
So on line 233 I am just calling:
charmm_config.parse_charmm_gui_toppar(xml_params_filename)
Which calls the method (config.py, line 147):
def parse_charmm_gui_toppar(self, xml_params_filename):
extlist = ['rtf', 'prm', 'str']
# inputfile is the toppar.str with the list of parameter files
for line in open(xml_params_filename.text, 'r'):
if '!' in line: line = line.split('!')[0]
parfile = line.strip()
if len(parfile) != 0:
ext = parfile.lower().split('.')[-1]
if not ext in extlist: continue
# Appending each file listed in inputfile to the existing
# list of parameters files to be read
self.parameters.append(parfile)
return
I wanted to make sure users can use both parsing methods in the same input
(directly pointing to parameter files or listing them in a toppar.str
file). I think having all methods in config.py is probably cleaner.
—
Reply to this email directly, view it on GitHub
<#23 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABALA2K27M7Z3RK53E2BGDDWEA3HRANCNFSM6AAAAAAQCT4CLQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@dyelar I just pushed the fixes for your request. A brief explanation:
Note that in the example above, toppar.str file contains:
If you have any other comments or suggestions, feel free to make them. |
2. Looks good to me.
For 1, if the defaults provided by OpenMM for the angles don't really have
any meaning for us, perhaps we should also not set a default. If there
aren't any defaults, then inside the parser box-vector portion, we just
need to make sure that the value of the charmm.box_vectors = [a, b, c,
alpha, beta, gamma] XOR it generates the RuntimeException telling people
that all of the values need to be set for the charmm file type. It does
change some of the stuff back to what you had before
(config_box_vector_defined and associated method goes away, the angle
variables go away, the get_box_vector method goes away, and we just check
to see if it is complete in the parser prior to setting the config value
and throw the exception if not.), but this current iteration was really
based around assuming that some defaults were possible, so it makes total
sense to me to require everything or error if defaults don't make sense.
Would you like me to merge it and change it, or would you like to swap out
the extraneous stuff prior to the merge?
…On Wed, Oct 19, 2022 at 1:44 PM Marcelo D. Polêto ***@***.***> wrote:
@dyelar <https://github.com/dyelar> I just pushed the fixes for your
request. A brief explanation:
1. regarding box vectors: I decided to throw and error if any of the
box properties are not set. OpenMM does accept triclinc and orthorombic
boxes, so assuming a default if not a good idea. Thus, users will have to
define all box lengths ('a', 'b','c') and box angles ('alpha', 'beta',
'gamma'). Hence, expected entries in the XML should look like something
like this:
<box-vectors>
<a>2.9</a> <!-- unit.nanometers -->
<b>2.9</b> <!-- unit.nanometers -->
<c>2.9</c> <!-- unit.nanometers -->
<alpha>90</alpha> <!-- unit.degrees -->
<beta>90</beta> <!-- unit.degrees -->
<gamma>90</gamma> <!-- unit.degrees -->
</box-vectors>
1. regarding parameter file parsing: I just saw your last message and
I left the parameter file parser in parser.py as you requested. Now, users
can define their parameter files in two ways: 1) directly pointing to a
parameter file; or 2) pointing to a file containing a list of parameter
files to be read (like CHARMM-GUI inputs are parsed via toppar.str file).
Users can use both parsing methods by using something like below:
<parameters>
<parameter type="charmm-gui-toppar">toppar.str</parameter>
<parameter>toppar_c36/top_all36_na.rtf</parameter>
<parameter>toppar_c36/par_all36_na.prm</parameter>
<parameter>toppar_c36/toppar_water_ions.str</parameter>
</parameters>
Note that in the example above, toppar.str file contains:
toppar_c36/top_all36_prot.rtf
toppar_c36/par_all36m_prot.prm
If you have any other comments or suggestions, feel free to make them.
—
Reply to this email directly, view it on GitHub
<#23 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ABALA2N77ERDVBSI7MWNQOLWEA6QHANCNFSM6AAAAAAQCT4CLQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@dyelar sorry for the delay. I agree with you. It is safer to require all the inputs and to not define any defaults. Could you merge and make the changes that you seem fit? I won't be able to make the changes in the next few days. |
Description
CHARMM36 systems are not being built and require box_vector assignments before creating the system with psf.CreateSystem().
Todos
Status