Configuration Files for htpolynet run
¶
The htpolynet run
configuration file tells HTPolyNet
what it needs in order to generate a polymerized system, beginning with structures of the individual monomers and a description of the polymerization chemistry. htpolynet run
expects at most ten distinct sections in a configuration file:
Section name |
Role |
---|---|
|
Just a descriptive title |
|
Directives for interacting with Gromacs |
|
Directives for interacting with AmberTools |
|
Directives for handling inconsistencies in the General Amber Force Field |
|
Directives for molecular constituents in the initial system |
|
Directives for how the densification phase is run |
|
Directives for how precure equilibration and annealing is run |
|
Directives for how the CURE is run |
|
Directives for how postcure equilibration and annealing is run |
|
List of all reactions needed to build and cure the system |
Sections can appear in any order (since the whole YAML file is like a nested python dictionary). The gromacs
and ambertools
sections are mainly for specifying system-specific commands for running Gromacs and AmberTools executables; they have defaults that work for simple Linux workstations. densification
, precure
, and postcure
all specify series of MD simulations on the system. constituents
specifies the initial make-up of the system, and reactions
describe both how to build those constituents from input monomers (if necessary) as well as the types of bonds that you want to occur during polymerization. Both constituents
and reactions
sections are where HTPolyNet
extracts the names of molecular species in the system and how they are chemically interrelated. CURE
is the most complicated section, and it desribes how the CURE algorithm is to be run. We consider each of these eight sections (minus Title
) below.
All the details¶
In this section we show all subdirectives for each of the five main directives in the configuration file. Please note that, although default values are quoted for some of these parameters, these default values are not guaranteed to be optimal or even functional in all possible scenarios.
gromacs
: This directive specifies parametersHTPolyNet
uses when invoking the Gromacs executable.gromacs
subdirectiveType
Description (default)
gmx
str
gmx
command (defaultgmx
)gmx_options
quoted string
options to pass to all
gmx
calls (default-quiet -nobackup
)mdrun
quoted string
mdrun
command (defaultgmx (options) mdrun
)mdrun_single_molecule
quoted string
version of
mdrun
to use for any single-molecule Gromacs runsmdrun_options
dict
command-line arguments to pass to
mdrun
(none)If you are running on a supercomputer with a native installation of Gromacs, it is likely you should point the parameter
gmx
to the fully resolved pathname ofgmx_mpi
(or load the appropriate module), and use themdrun
parameters to specify thempirun
ormpiexec
syntax needed to launchgmx_mpi mdrun
. Thegromacs_single_molecule
subdirective allows you to specify a particular form ofmdrun
appropriate for single-molecule simulations. These are most often used as part of parameterization or conformer generation. Typically, it’s best to run these on a single processor without domain decomposition.The
gromacs
directive is optional; if none is specified the default values are used.ambertools
: This directive specifies parametersHTPolyNet
uses when working with the AmberTools suite.ambertools
subdirectiveType
Description (default)
charge_method
string
charge model used by
antechamber
(defaultgas
)For now, you can choose any charging method compatible with
antechamber
. Theantechamber
directive is optional.GAFF
In very rare instances, AmberTools will generate GAFF atom types and parameters that are internally inconsistent, or at least are not understandable by the
parmed
package that translates them into Gromacs topology files. Directives in this section instructHTPolyNet
how to resolve these inconsistencies.GAFF
subdirectiveType
Description (default)
resolve_type_discrepancies
resolve record
Directives for resolving atom type discrepancies
GAFF.resolve_type_discrepancies
directivesDescription (default)
typename
Name of interaction type in
top
file (dihedraltypes
)funcidx
Function index assigned to this type in the
top
filerule
What rule to apply to resolve discrepancies (
stiffest
)Currently, the only type of discrepancy that can be handled is one in which four atom types define one dihedral type in one
top
file but a different one in another. If these to topologies are merged to make a composite system, Gromacs will flag this as an error in the topology file. This directive allows you to check for all such discrepancies before Gromacs does, and keep only the one that follows the stipulatedrule
. For dihedrals, thestiffest
rule means that of the conflicting types, the one with the largest energetic parameters is kept.constituents
This directive is a set of one or more “key”:”record” pairs in which each key is the name of a molecule (here, “STY”) and the record is a dictionary of keyword:value pairs. The allowable keywords in a
constituent
record are as follows.constituents
record keywordType
Description (default)
count
int
(optional) number of these molecules in the system
stereocenters
list
(optional) list of names of chiral carbon atoms if any
symmetry_equivalent_atoms
list
(optional) list of sets of symmetry equivalent atom names, if any
conformers
conformers record
(optional) parameters specifying if and how initial conformers are generated
In the example below, we are requesting a system of 100 styrene molecules. The key
STY
signals toHTPolyNet
that it should look for eitherSTY.mol2
orSTY.pdb
in./lib/molecules/inputs
or it should look forSTY.gro
,STY.itp
,STY.top
, andSTY.grx
in./lib/molecules/parameterized
. The latter is the case if eitherhtpolynet run
orhtpolynet parameterized
has already been run withSTY.mol2
orSTY.pdb
. Multiple records inconstituents
should all have the “key”:”record” syntax and be separated by commas.Any entry in
constituents
for which thecount
is 0 or missing is treated as an intermediate for which stereocenters or symmetry-equivalent atoms are specified.HTPolyNet
allows you the option to use multiple conformers of flexible molecules to build the initial liquid system. It can use eitherobabel
’sconfomers
capability or an MD simulation viagromacs
to generate these. Theconformers
record has two subdirectives:conformers
record keywordType
Description (default)
count
int
(required) number of unique conformers to generate (per stereoisomer)
generator
conformer generator record
(optional) parameters specifying how conformers are generated
The
conformers.generator
record has several subdirectives:generator
record keywordType
Description (default)
name
str
(required)
obabel
orgromacs
params
generator parameters record
(optional) parameters specifying the generator’s operation (only relevant for
gromacs
)The
conformers.generator.params
record has several subdirectives:params
record keywordType
Description (default)
ensemble
str
nvt
is the only option that makes sensetemperature
float
(optional) Temperature of the conformer-generating MD simulation
ps
float
(optional) Duration of the conformer-generating MD simulation
pad
float
(optional) Box-size padding for the vacuum MD simulation
If an entry in
constituents
has noconfomers
member directive, then the confomer used for it in building the system is whatever one is in either the inputmol2
file (after AmberTools andparmed
convert it to agro
file) or thegro
file of the constructed molecule.densification
This directive instructs
HTPolyNet
how to run the initial densification of the fresh simulation system. It has two subdirectives:densification
subdirectiveType
Description (default)
initial_density
float
density in kg/m^3 at which molecules are placed randomly into a box to make the initial coordinates (default 300.0)
equilibration
list
list of MD records
The
equilibration
subdirective should contain one or more MD records. An MD record is a dictionary of keyword:value pairs:MD record keyword
Type
Description
ensemble
string
(required) min (minimization), npt, or nvt
temperature
float
(required if
ensemble
is nvt or npt) Temperature in K assigned toref_t
in Gromacsmdp
filepressure
float
(required if
ensemble
is npt) Pressure in bar assigned toref_p
in Gromacsmdp
filensteps
int
(optional; required if
ps
not provided) Duration of MD simulation in number of time stepsps
float
(optional; required if
nsteps
not set) Duration of MD simulation in picosecondsrepeat
int
(optional) number of times to repeat this simulation in series; default is 0 (i.e., run once)
The
repeat
subdirective is especially useful for densifications that start at very low initial densities. It is better to run several short NPT simulations than a single long one so that the box size shrinkage doesn’t overwhelm Gromacs’ domain decomposition algorithm.precure
The
precure
directive instructsHTPolyNet
on running a series of MD simulations after densification but before the cure. There are three allowable subdirectives forprecure
:precure
subdirectiveType
Description (default)
preequilibration
MD record
optional MD simulation
anneal
Anneal record
Description of an annealing simulation after the optional
preequilibration
postequilibration
MD record
optional MD simulation
Both the
preequilibration
andpostequilibration
directives contain MD records described above. The Anneal record has the following subdirectives:Anneal record subdirective
Type
Description (default)
ncycles
int
number of annealing cycles
initial_temperature
float
(optional) Initial temperaure in K, really only sets the
gen-temp
mdp
parametercycle_segments
list
list of cycle records
A cycle record corresponds to an “annealing-point” in the Gromacs
mdp
file.Cycle record subdirective
Type
Description (default)
T
float
Targe temperature in K
ps
float
cycle duration; if prior
T
is different, simulation is brought to thisT
in this amount of time; if priorT
is the same, simulation is held at thisT
for this amount of time.Each cycle consists of one pass through the cycle segments. In the example here, one cycle consists of Gromacs taking the system from 300 to 600 K in the first 20 ps, then holding at 600 for 20 pm, then reducing to 300 K over 20 ps and holding it there for 20 ps.
CURE
This directive contains all instructions governing the CURE algorithm. There are five possible subdirectives:
CURE
subdirectiveType
Description (default)
controls
list
Control parameter values
drag
list
Dragging parameter values
relax
list
Bond relaxation parameter values
equilibrate
MD record
CURE iteration equilibration parameters
gromacs
list
any
mdp
keyword:value pairs to include in allmdp
files in theCURE
sequenceCURE.controls
parametersCURE.controls
parameterType
Description (default)
initial_search_radius
float
initial search radius in nm (default 0.5)
radial_increment
float
increment in nm by which search radius is increased if no bonds are found at current radius (default 0.25)
max_iterations
int
absolute maximum number of allowed iterations (default 150),
desired_conversion
float [0-1]
target conversion between 0 and 1.0 (default 0.95)
late_threshhold
float [0-1]
conversion above which bond probabilities are ignored
min_allowable_bondcycle_length
int
minimum number of C atoms allowed in a cycle of C-C bonds that form via polymerization (default 0)
The
min_allowable_bondcycle_length
refers to the fact that in systems that polymerize via activation of carbon-carbon double bonds, it is possible in the HTPolyNet implementation that the “head” of a chain of C-C bonds can attack the “tail” and form a cycle, because those represent atom types that can react. It is unclear whether such cycles actually form; if a monomer remains bound to a radical initiator it is hard to see how the head of the growing chain could attack it, but maybe it could. Settingmin_allowable_bondcycle_length
to zero (the default) disallows any bonds that would form cycles involving only atoms that were once part of C=C double bonds. (Think about the backbone of polystyrene, for example.) In a given CURE iteration, HTPolyNet tests the full set of suggested bonds to see if together they result in any cycles, and for each nascent cycle longer thanmin_allowable_bondcycle_length
, HTPolyNet will disallow the nascent bond that has the longest initial length.
CURE.drag
parameters: Dragging refers to a series of MD simulations (called “stages”) in which harmonic restraints are applied to each pair of atoms assigned to form a bond, but before the bonds actually form. Dragging is useful to reduce 1-4 distances that ultimately arise when bonds form. Each stage in the series uses a specially modified topology file in which “new” bonds of type 6 are added, one for each pair of to-be-bonded atoms. Each of these bonds has a parameterkb
, the spring constant, andb0
, the equilibrium length. Thedrag
directive governs how thoseb0
parameters are linearly decreased through the set of stages to slowly bring the atoms closer together. Thelimit
parameter is the target distance of dragging, andincrement
determines the number of stages it will take to get there.
CURE.drag
parameterType
Description (default)
increment
float
minimum amount by which target
drag
distance is decreased in steps (default 0.08)
limit
float
distance in nm to which all bonds are dragged (default 0.3)
equilibration
MD record
describes the MD simulations used to equilibrate at each stage
CURE.relax
parameters: Relaxation refers to a series of MD simulations (also called “stages”) in which thekb
andb0
parameters of each new bond are “attenuated” from a weak (lowkb
), long (largeb0
) state to the state dictated by the force field. Theincrement
determines the number of stages are performed.
CURE.relax
parameterType
Description (default)
increment
float
minimum amount by which
b0
parameters are decreased in steps (default 0.08)
equilibration
MD record
describes the MD simulations used to equilibrate at each stage
gromacs
parameters: These parameters govern modification tomdp
files used in the dragging and relaxation MD simulations.HTPolyNet
adjusts the cutoff distances to conform to the longest unrelaxed bond in the system, and therdefault
parameter provides the floor below which it will not go any lower.
CURE.gromacs
parameterType
Description (default)
rdefault
float
minimum cutoff radius (default 0.9)
postcure
The
postcure
directive instructsHTPolyNet
on running a series of MD simulations after cure. Its form is identical to that ofprecure
, namely with optionalpreequilibration
,anneal
, andpostequilibration
subdirectives.
reactions
The
reactions
directive contains a list of reaction records that specify the chemisty of any bonds that form to either build molecular constituents or polymers/crosslinks. A reaction is defined by the precise pairs of atom types that become new covalent bonds. To precisely define each such pair, the reaction must also identify one or more reactant molecules. Each reaction also names a single product molecule. HTPolyNet will build oligomer templates using these reactions and then GAFF-parameterize them. The parameterizations are used during CURE to re-type atoms and reset charges after each new bond is formed.reaction
record directivesType
Description
name
str
descriptive name
stage
str
one of
cure
,cap
,build
, orparam
probability
float
probability that bond will form in one iteration if identified (1.0)
reactants
dict
keyword: reactant key, value: reactant molecule name
product
str
name of product molecule
atoms
dict
keyword: atom key, value: atom record
bonds
list
list of bond records, one item per bond formed in reaction
The
stage
value signifies howHTPolyNet
uses the reaction. It will generate GAFF parameters and topologies for any product of a reaction with stagecure
,cap
, orparam
.cure
reactions are those assigned to take place during CURE.cap
reactions are optional and take place once the CURE has finished; these can be used to revert the active form of any unreacted monomers back to their proper forms.param
reactions are only performed in the beginning when molecular constituents are being built. If you want to build the molecular constituents out of simpler monomers, you will likely want to useparam
reactions.The
atoms
directive is a dictionary of atom records where the key is an atom “key”, which is referenced in bond record.Atom records uniquely identify atoms in reactants, assigning them a shorthand key that is used in subsequent bond records.
Atom record subdirective
type
Description
reactant
arb.
Reactant key that references the
reactants
directive of the reactionresid
int
Residue number in the reactant containing this atom
atom
str
Atom name (originates in monomer
mol2
orpdb
file)z
int
Number of possible bonds atom can participate in
Bond records specify the bond(s) that form during this reaction.
Bond record subdirective
type
Description
atoms
list
The two atom keys that define the atoms that form the bond
order
int
Order (1=single, 2=double) of resulting bond
In the example here, we define two unique reactions. One is the C1-C2 bond that links two styrene monomers, and the other is the intramolecular C1-C2 double bond that “reverts” the active form of a monomer back to its “proper” form. Since that reaction’s
stage
iscap
, this signifies that it is formed only after CURE has finished.
A Simple Configuration Example: Polymerizing styrene¶
Consider the following methodological description:
A system of 100 molecules of styrene was prepared by initially placing 100 styrene molecules in a box at a density of 0.3 g/cc. This system was minimized and subjected to 10 ps of NVT MD at 300 K and then 200 ps of NPT MD at 300 K and 10 bar. The pressure was reduced to 1 bar and NPT MD was run for an additional 200 ps. The system was then put through two cycles of constant-pressure annealing at 1 bar in which a cycle is 20 ps at 300 K, 20 ps of linear temperature increase to 600 K, 20 ps at 600 K, and 20 ps of linear temperature decrease back to 300 K. This was followed by 100 ps of equilibration at 1 bar and 300 K. Then the CURE algorithm was used with an initial search radius of 0.5 nm and increment of 0.25 nm to create new intermonomer bonds until a conversion of 95% was reached. At each CURE iteration, the system was driven using a series of steered MD simulations to first force all pairs of atoms designated to be bonded but are more than 0.6 nm apart to approach one another to within 0.3 nm. When all such designated pairs were within 0.3 nm of each other, any sacrificial H’s were deleted and new bonds and accompanying angles and dihedrals were introduced into the topology. The bond lengths were then linearly reduced to their equilibrium values through a series of MD simulations. Once all bonds were at their equilibrium lengths, the iteration was completed by a 100 ps NPT MD simulation at 1 bar and 300 K, before beginning the next CURE iteration. Following CURE, the system was again annealed following the previous protocol and then NPT equilibrated at 1 bar and 300 K for 200 ps.
A simple YAML htpolynet run
configuration file that encodes this description might look like this.
Title: polystyrene
constituents:
STY:
count: 100
densification:
initial_density: 300.0 # kg/m3
equilibration:
- ensemble: min
- ensemble: nvt
temperature: 300
ps: 10
- ensemble: npt
temperature: 300
pressure: 10
ps: 200
precure:
preequilibration:
ensemble: npt
temperature: 300 # K
pressure: 1 # bar
ps: 200
anneal:
ncycles: 2
initial_temperature: 300
cycle_segments:
- T: 300
ps: 0
- T: 600
ps: 20
- T: 600
ps: 20
- T: 300
ps: 20
- T: 300
ps: 20
postequilibration:
ensemble: npt
temperature: 300 # K
pressure: 1 # bar
ps: 100
CURE:
controls:
initial_search_radius: 0.5 # nm
radial_increment: 0.25 # nm
max_iterations: 150
desired_conversion: 0.95
drag:
trigger_distance: 0.6 # nm
increment: 0.08 # nm
limit: 0.3 # nm
equilibration:
- ensemble: min
- ensemble: nvt
temperature: 600
nsteps: 1000
- ensemble: npt
temperature: 600
pressure: 1
nsteps: 2000
relax:
increment: 0.08 # nm
equilibration:
- ensemble: min
- ensemble: nvt
temperature: 600
nsteps: 1000
- ensemble: npt
temperature: 600
pressure: 1
nsteps: 2000
equilibrate:
ensemble: npt
temperature: 300 # K
pressure: 1 # bar
ps: 100
gromacs:
rdefault: 0.9 # nm
postcure:
anneal:
ncycles: 2
initial_temperature: 300
cycle_segments:
- T: 300
ps: 0
- T: 600
ps: 20
- T: 600
ps: 20
- T: 300
ps: 20
- T: 300
ps: 20
postequilibration:
ensemble: npt
temperature: 300 # K
pressure: 1 # bar
ps: 100
reactions:
- name: sty1_1
stage: cure
reactants:
1: STY
2: STY
product: STY~C1-C2~STY,
probability: 1.0
atoms:
A:
reactant: 1
resid: 1
atom: C1
z: 1
B:
reactant: 2
resid: 1
atom: C2
z: 1
bonds:
- atoms: [A, B]
order: 1
- name: styCC,
stage: cap,
reactants:
1: STY
product: STYCC,
probability: 1.0,
atoms:
A:
reactant: 1
resid: 1
atom: C1
z: 1
B:
reactant: 1
resid: 1
atom: C2
z: 1
bonds:
- atoms: [A, B]
order: 2
A note on YAML format: notice here that the left-most indents are all words; each of these is a key that refers to a specification for the run. Here is what this configuration specifies. First, we are starting with 100 styrene molecules. So, HTPolyNet
expects to find an input file name STY.mol2
or STY.pdb
in the system or user molecules/inputs
library. A liquid system of these 100 styrenes is densified starting from an initial density of 300 kg/m3using first an energy minimzation, then an NVT MD simulation at 300 K for 10 picoseconds, then an NPT MD simulation at 300 K and 10 bar for 200 ps. (The 10 bar helps to ensure rapid densification, but don’t feel pressured to use it.) To prepare for cure, the system is brought to 1 bar in the precure stage, where it is also annealed by raising the temprature from 300 to 600 K and bringing it back down, for two cycles. The CURE is run such that
The search radius begins at 0.5 nm and goes up in increments of 0.25 nm;
The algorithm bails out after 150 iterations; and
The desired cure conversion is 95%;
Prebond dragging is permitted if any newly identified bond is more than 0.6 nm in length, and the dragging happens in increments of 0.08 nm and each increment involves an energy minimization, an NVT MD simulation, and an NPT MD simulation, all at 600 K. (I find curing at elevated temperature keeps the system from jamming up, but don’t feel forced to use this temperature.) Bond relaxation takes place using a similar series of MD stages. Remember that dragging is performed on the system before bonds are formed and atoms deleted, while bond relaxation occurs after the bonds are formed and the sacrificial, valence-conserving H atoms are deleted. Finally, when all new bonds are relaxed, a single NPT MD simulation is performed to end an iteration. Postcure involves an annealing simulation much like the precure stage, followed by an NPT MD simulation.
Finally, we stipulate the reactions. In this system, there is really only one reaction: the one in which the C1 of one styrene bonds to the C2 of another. The reaction named sty1_1
specifies this reaction, and causes HTPolyNet
to parameterize the dimeric product named STY~C1-C2~STY
. This molecule provides a template for atom types, charges, and new bonded interactions that must be merged into a system if such a bond forms. The other reaction, styCC
, specifies a cap
reaction that reverts any unreacted styrene back to its proper form (with the C-C double bond). Capping reactions are 100% optional; don’t feel forced to use them.