Configuration Files for htpolynet run¶
The htpolynet run configuration file tells HTPolyNet what it needs in order to generate a polymerized system, beginning with structures of the individual monomers and a description of the polymerization chemistry. htpolynet run expects at most ten distinct sections in a configuration file:
Section name |
Role |
|---|---|
|
Just a descriptive title |
|
Directives for interacting with Gromacs |
|
Directives for interacting with AmberTools |
|
Directives for handling inconsistencies in the General Amber Force Field |
|
Directives for molecular constituents in the initial system |
|
Directives for how the densification phase is run |
|
Directives for how precure equilibration and annealing is run |
|
Directives for how the CURE is run |
|
Directives for how postcure equilibration and annealing is run |
|
List of all reactions needed to build and cure the system |
Sections can appear in any order (since the whole YAML file is like a nested python dictionary). The gromacs and ambertools sections are mainly for specifying system-specific commands for running Gromacs and AmberTools executables; they have defaults that work for simple Linux workstations. densification, precure, and postcure all specify series of MD simulations on the system. constituents specifies the initial make-up of the system, and reactions describe both how to build those constituents from input monomers (if necessary) as well as the types of bonds that you want to occur during polymerization. Both constituents and reactions sections are where HTPolyNet extracts the names of molecular species in the system and how they are chemically interrelated. CURE is the most complicated section, and it desribes how the CURE algorithm is to be run. We consider each of these eight sections (minus Title) below.
All the details¶
In this section we show all subdirectives for each of the five main directives in the configuration file. Please note that, although default values are quoted for some of these parameters, these default values are not guaranteed to be optimal or even functional in all possible scenarios.
gromacs: This directive specifies parametersHTPolyNetuses when invoking the Gromacs executable.gromacssubdirectiveType
Description (default)
gmxstr
gmxcommand (defaultgmx)gmx_optionsquoted string
options to pass to all
gmxcalls (default-quiet -nobackup)mdrunquoted string
mdruncommand (defaultgmx (options) mdrun)mdrun_single_moleculequoted string
version of
mdrunto use for any single-molecule Gromacs runsmdrun_optionsdict
command-line arguments to pass to
mdrun(none)If you are running on a supercomputer with a native installation of Gromacs, it is likely you should point the parameter
gmxto the fully resolved pathname ofgmx_mpi(or load the appropriate module), and use themdrunparameters to specify thempirunormpiexecsyntax needed to launchgmx_mpi mdrun. Thegromacs_single_moleculesubdirective allows you to specify a particular form ofmdrunappropriate for single-molecule simulations. These are most often used as part of parameterization or conformer generation. Typically, it’s best to run these on a single processor without domain decomposition.The
gromacsdirective is optional; if none is specified the default values are used.ambertools: This directive specifies parametersHTPolyNetuses when working with the AmberTools suite.ambertoolssubdirectiveType
Description (default)
charge_methodstring
charge model used by
antechamber(defaultgas)For now, you can choose any charging method compatible with
antechamber. Theantechamberdirective is optional.GAFFIn very rare instances, AmberTools will generate GAFF atom types and parameters that are internally inconsistent, or at least are not understandable by the
parmedpackage that translates them into Gromacs topology files. Directives in this section instructHTPolyNethow to resolve these inconsistencies.GAFFsubdirectiveType
Description (default)
resolve_type_discrepanciesresolve record
Directives for resolving atom type discrepancies
GAFF.resolve_type_discrepanciesdirectivesDescription (default)
typenameName of interaction type in
topfile (dihedraltypes)funcidxFunction index assigned to this type in the
topfileruleWhat rule to apply to resolve discrepancies (
stiffest)Currently, the only type of discrepancy that can be handled is one in which four atom types define one dihedral type in one
topfile but a different one in another. If these to topologies are merged to make a composite system, Gromacs will flag this as an error in the topology file. This directive allows you to check for all such discrepancies before Gromacs does, and keep only the one that follows the stipulatedrule. For dihedrals, thestiffestrule means that of the conflicting types, the one with the largest energetic parameters is kept.constituentsThis directive is a set of one or more “key”:”record” pairs in which each key is the name of a molecule (here, “STY”) and the record is a dictionary of keyword:value pairs. The allowable keywords in a
constituentrecord are as follows.constituentsrecord keywordType
Description (default)
countint
(optional) number of these molecules in the system
stereocenterslist
(optional) list of names of chiral carbon atoms if any
symmetry_equivalent_atomslist
(optional) list of sets of symmetry equivalent atom names, if any
conformersconformers record
(optional) parameters specifying if and how initial conformers are generated
In the example below, we are requesting a system of 100 styrene molecules. The key
STYsignals toHTPolyNetthat it should look for eitherSTY.mol2orSTY.pdbin./lib/molecules/inputsor it should look forSTY.gro,STY.itp,STY.top, andSTY.grxin./lib/molecules/parameterized. The latter is the case if eitherhtpolynet runorhtpolynet parameterizedhas already been run withSTY.mol2orSTY.pdb. Multiple records inconstituentsshould all have the “key”:”record” syntax and be separated by commas.Any entry in
constituentsfor which thecountis 0 or missing is treated as an intermediate for which stereocenters or symmetry-equivalent atoms are specified.HTPolyNetallows you the option to use multiple conformers of flexible molecules to build the initial liquid system. It can use eitherobabel’sconfomerscapability or an MD simulation viagromacsto generate these. Theconformersrecord has two subdirectives:conformersrecord keywordType
Description (default)
countint
(required) number of unique conformers to generate (per stereoisomer)
generatorconformer generator record
(optional) parameters specifying how conformers are generated
The
conformers.generatorrecord has several subdirectives:generatorrecord keywordType
Description (default)
namestr
(required)
obabelorgromacsparamsgenerator parameters record
(optional) parameters specifying the generator’s operation (only relevant for
gromacs)The
conformers.generator.paramsrecord has several subdirectives:paramsrecord keywordType
Description (default)
ensemblestr
nvtis the only option that makes sensetemperaturefloat
(optional) Temperature of the conformer-generating MD simulation
psfloat
(optional) Duration of the conformer-generating MD simulation
padfloat
(optional) Box-size padding for the vacuum MD simulation
If an entry in
constituentshas noconfomersmember directive, then the confomer used for it in building the system is whatever one is in either the inputmol2file (after AmberTools andparmedconvert it to agrofile) or thegrofile of the constructed molecule.densificationThis directive instructs
HTPolyNethow to run the initial densification of the fresh simulation system. It has two subdirectives:densificationsubdirectiveType
Description (default)
initial_densityfloat
density in kg/m^3 at which molecules are placed randomly into a box to make the initial coordinates (default 300.0)
equilibrationlist
list of MD records
The
equilibrationsubdirective should contain one or more MD records. An MD record is a dictionary of keyword:value pairs:MD record keyword
Type
Description
ensemblestring
(required) min (minimization), npt, or nvt
temperaturefloat
(required if
ensembleis nvt or npt) Temperature in K assigned toref_tin Gromacsmdpfilepressurefloat
(required if
ensembleis npt) Pressure in bar assigned toref_pin Gromacsmdpfilenstepsint
(optional; required if
psnot provided) Duration of MD simulation in number of time stepspsfloat
(optional; required if
nstepsnot set) Duration of MD simulation in picosecondsrepeatint
(optional) number of times to repeat this simulation in series; default is 0 (i.e., run once)
The
repeatsubdirective is especially useful for densifications that start at very low initial densities. It is better to run several short NPT simulations than a single long one so that the box size shrinkage doesn’t overwhelm Gromacs’ domain decomposition algorithm.precureThe
precuredirective instructsHTPolyNeton running a series of MD simulations after densification but before the cure. There are three allowable subdirectives forprecure:precuresubdirectiveType
Description (default)
preequilibrationMD record
optional MD simulation
annealAnneal record
Description of an annealing simulation after the optional
preequilibrationpostequilibrationMD record
optional MD simulation
Both the
preequilibrationandpostequilibrationdirectives contain MD records described above. The Anneal record has the following subdirectives:Anneal record subdirective
Type
Description (default)
ncyclesint
number of annealing cycles
initial_temperaturefloat
(optional) Initial temperaure in K, really only sets the
gen-tempmdpparametercycle_segmentslist
list of cycle records
A cycle record corresponds to an “annealing-point” in the Gromacs
mdpfile.Cycle record subdirective
Type
Description (default)
Tfloat
Targe temperature in K
psfloat
cycle duration; if prior
Tis different, simulation is brought to thisTin this amount of time; if priorTis the same, simulation is held at thisTfor this amount of time.Each cycle consists of one pass through the cycle segments. In the example here, one cycle consists of Gromacs taking the system from 300 to 600 K in the first 20 ps, then holding at 600 for 20 pm, then reducing to 300 K over 20 ps and holding it there for 20 ps.
CUREThis directive contains all instructions governing the CURE algorithm. There are five possible subdirectives:
CUREsubdirectiveType
Description (default)
controlslist
Control parameter values
draglist
Dragging parameter values
relaxlist
Bond relaxation parameter values
equilibrateMD record
CURE iteration equilibration parameters
gromacslist
any
mdpkeyword:value pairs to include in allmdpfiles in theCUREsequenceCURE.controlsparametersCURE.controlsparameterType
Description (default)
initial_search_radiusfloat
initial search radius in nm (default 0.5)
radial_incrementfloat
increment in nm by which search radius is increased if no bonds are found at current radius (default 0.25)
max_iterationsint
absolute maximum number of allowed iterations (default 150),
desired_conversionfloat [0-1]
target conversion between 0 and 1.0 (default 0.95)
late_threshholdfloat [0-1]
conversion above which bond probabilities are ignored
min_allowable_bondcycle_lengthint
minimum number of C atoms allowed in a cycle of C-C bonds that form via polymerization (default 0)
The
min_allowable_bondcycle_lengthrefers to the fact that in systems that polymerize via activation of carbon-carbon double bonds, it is possible in the HTPolyNet implementation that the “head” of a chain of C-C bonds can attack the “tail” and form a cycle, because those represent atom types that can react. It is unclear whether such cycles actually form; if a monomer remains bound to a radical initiator it is hard to see how the head of the growing chain could attack it, but maybe it could. Settingmin_allowable_bondcycle_lengthto zero (the default) disallows any bonds that would form cycles involving only atoms that were once part of C=C double bonds. (Think about the backbone of polystyrene, for example.) In a given CURE iteration, HTPolyNet tests the full set of suggested bonds to see if together they result in any cycles, and for each nascent cycle longer thanmin_allowable_bondcycle_length, HTPolyNet will disallow the nascent bond that has the longest initial length.
CURE.dragparameters: Dragging refers to a series of MD simulations (called “stages”) in which harmonic restraints are applied to each pair of atoms assigned to form a bond, but before the bonds actually form. Dragging is useful to reduce 1-4 distances that ultimately arise when bonds form. Each stage in the series uses a specially modified topology file in which “new” bonds of type 6 are added, one for each pair of to-be-bonded atoms. Each of these bonds has a parameterkb, the spring constant, andb0, the equilibrium length. Thedragdirective governs how thoseb0parameters are linearly decreased through the set of stages to slowly bring the atoms closer together. Thelimitparameter is the target distance of dragging, andincrementdetermines the number of stages it will take to get there.
CURE.dragparameterType
Description (default)
incrementfloat
minimum amount by which target
dragdistance is decreased in steps (default 0.08)
limitfloat
distance in nm to which all bonds are dragged (default 0.3)
equilibrationMD record
describes the MD simulations used to equilibrate at each stage
CURE.relaxparameters: Relaxation refers to a series of MD simulations (also called “stages”) in which thekbandb0parameters of each new bond are “attenuated” from a weak (lowkb), long (largeb0) state to the state dictated by the force field. Theincrementdetermines the number of stages are performed.
CURE.relaxparameterType
Description (default)
incrementfloat
minimum amount by which
b0parameters are decreased in steps (default 0.08)
equilibrationMD record
describes the MD simulations used to equilibrate at each stage
gromacsparameters: These parameters govern modification tomdpfiles used in the dragging and relaxation MD simulations.HTPolyNetadjusts the cutoff distances to conform to the longest unrelaxed bond in the system, and therdefaultparameter provides the floor below which it will not go any lower.
CURE.gromacsparameterType
Description (default)
rdefaultfloat
minimum cutoff radius (default 0.9)
postcureThe
postcuredirective instructsHTPolyNeton running a series of MD simulations after cure. Its form is identical to that ofprecure, namely with optionalpreequilibration,anneal, andpostequilibrationsubdirectives.
reactionsThe
reactionsdirective contains a list of reaction records that specify the chemisty of any bonds that form to either build molecular constituents or polymers/crosslinks. A reaction is defined by the precise pairs of atom types that become new covalent bonds. To precisely define each such pair, the reaction must also identify one or more reactant molecules. Each reaction also names a single product molecule. HTPolyNet will build oligomer templates using these reactions and then GAFF-parameterize them. The parameterizations are used during CURE to re-type atoms and reset charges after each new bond is formed.reactionrecord directivesType
Description
namestr
descriptive name
stagestr
one of
cure,cap,build, orparamprobabilityfloat
probability that bond will form in one iteration if identified (1.0)
reactantsdict
keyword: reactant key, value: reactant molecule name
productstr
name of product molecule
atomsdict
keyword: atom key, value: atom record
bondslist
list of bond records, one item per bond formed in reaction
The
stagevalue signifies howHTPolyNetuses the reaction. It will generate GAFF parameters and topologies for any product of a reaction with stagecure,cap, orparam.curereactions are those assigned to take place during CURE.capreactions are optional and take place once the CURE has finished; these can be used to revert the active form of any unreacted monomers back to their proper forms.paramreactions are only performed in the beginning when molecular constituents are being built. If you want to build the molecular constituents out of simpler monomers, you will likely want to useparamreactions.The
atomsdirective is a dictionary of atom records where the key is an atom “key”, which is referenced in bond record.Atom records uniquely identify atoms in reactants, assigning them a shorthand key that is used in subsequent bond records.
Atom record subdirective
type
Description
reactantarb.
Reactant key that references the
reactantsdirective of the reactionresidint
Residue number in the reactant containing this atom
atomstr
Atom name (originates in monomer
mol2orpdbfile)zint
Number of possible bonds atom can participate in
Bond records specify the bond(s) that form during this reaction.
Bond record subdirective
type
Description
atomslist
The two atom keys that define the atoms that form the bond
orderint
Order (1=single, 2=double) of resulting bond
In the example here, we define two unique reactions. One is the C1-C2 bond that links two styrene monomers, and the other is the intramolecular C1-C2 double bond that “reverts” the active form of a monomer back to its “proper” form. Since that reaction’s
stageiscap, this signifies that it is formed only after CURE has finished.
A Simple Configuration Example: Polymerizing styrene¶
Consider the following methodological description:
A system of 100 molecules of styrene was prepared by initially placing 100 styrene molecules in a box at a density of 0.3 g/cc. This system was minimized and subjected to 10 ps of NVT MD at 300 K and then 200 ps of NPT MD at 300 K and 10 bar. The pressure was reduced to 1 bar and NPT MD was run for an additional 200 ps. The system was then put through two cycles of constant-pressure annealing at 1 bar in which a cycle is 20 ps at 300 K, 20 ps of linear temperature increase to 600 K, 20 ps at 600 K, and 20 ps of linear temperature decrease back to 300 K. This was followed by 100 ps of equilibration at 1 bar and 300 K. Then the CURE algorithm was used with an initial search radius of 0.5 nm and increment of 0.25 nm to create new intermonomer bonds until a conversion of 95% was reached. At each CURE iteration, the system was driven using a series of steered MD simulations to first force all pairs of atoms designated to be bonded but are more than 0.6 nm apart to approach one another to within 0.3 nm. When all such designated pairs were within 0.3 nm of each other, any sacrificial H’s were deleted and new bonds and accompanying angles and dihedrals were introduced into the topology. The bond lengths were then linearly reduced to their equilibrium values through a series of MD simulations. Once all bonds were at their equilibrium lengths, the iteration was completed by a 100 ps NPT MD simulation at 1 bar and 300 K, before beginning the next CURE iteration. Following CURE, the system was again annealed following the previous protocol and then NPT equilibrated at 1 bar and 300 K for 200 ps.
A simple YAML htpolynet run configuration file that encodes this description might look like this.
Title: polystyrene
constituents:
STY:
count: 100
densification:
initial_density: 300.0 # kg/m3
equilibration:
- ensemble: min
- ensemble: nvt
temperature: 300
ps: 10
- ensemble: npt
temperature: 300
pressure: 10
ps: 200
precure:
preequilibration:
ensemble: npt
temperature: 300 # K
pressure: 1 # bar
ps: 200
anneal:
ncycles: 2
initial_temperature: 300
cycle_segments:
- T: 300
ps: 0
- T: 600
ps: 20
- T: 600
ps: 20
- T: 300
ps: 20
- T: 300
ps: 20
postequilibration:
ensemble: npt
temperature: 300 # K
pressure: 1 # bar
ps: 100
CURE:
controls:
initial_search_radius: 0.5 # nm
radial_increment: 0.25 # nm
max_iterations: 150
desired_conversion: 0.95
drag:
trigger_distance: 0.6 # nm
increment: 0.08 # nm
limit: 0.3 # nm
equilibration:
- ensemble: min
- ensemble: nvt
temperature: 600
nsteps: 1000
- ensemble: npt
temperature: 600
pressure: 1
nsteps: 2000
relax:
increment: 0.08 # nm
equilibration:
- ensemble: min
- ensemble: nvt
temperature: 600
nsteps: 1000
- ensemble: npt
temperature: 600
pressure: 1
nsteps: 2000
equilibrate:
ensemble: npt
temperature: 300 # K
pressure: 1 # bar
ps: 100
gromacs:
rdefault: 0.9 # nm
postcure:
anneal:
ncycles: 2
initial_temperature: 300
cycle_segments:
- T: 300
ps: 0
- T: 600
ps: 20
- T: 600
ps: 20
- T: 300
ps: 20
- T: 300
ps: 20
postequilibration:
ensemble: npt
temperature: 300 # K
pressure: 1 # bar
ps: 100
reactions:
- name: sty1_1
stage: cure
reactants:
1: STY
2: STY
product: STY~C1-C2~STY,
probability: 1.0
atoms:
A:
reactant: 1
resid: 1
atom: C1
z: 1
B:
reactant: 2
resid: 1
atom: C2
z: 1
bonds:
- atoms: [A, B]
order: 1
- name: styCC,
stage: cap,
reactants:
1: STY
product: STYCC,
probability: 1.0,
atoms:
A:
reactant: 1
resid: 1
atom: C1
z: 1
B:
reactant: 1
resid: 1
atom: C2
z: 1
bonds:
- atoms: [A, B]
order: 2
A note on YAML format: notice here that the left-most indents are all words; each of these is a key that refers to a specification for the run. Here is what this configuration specifies. First, we are starting with 100 styrene molecules. So, HTPolyNet expects to find an input file name STY.mol2 or STY.pdb in the system or user molecules/inputs library. A liquid system of these 100 styrenes is densified starting from an initial density of 300 kg/m3using first an energy minimzation, then an NVT MD simulation at 300 K for 10 picoseconds, then an NPT MD simulation at 300 K and 10 bar for 200 ps. (The 10 bar helps to ensure rapid densification, but don’t feel pressured to use it.) To prepare for cure, the system is brought to 1 bar in the precure stage, where it is also annealed by raising the temprature from 300 to 600 K and bringing it back down, for two cycles. The CURE is run such that
The search radius begins at 0.5 nm and goes up in increments of 0.25 nm;
The algorithm bails out after 150 iterations; and
The desired cure conversion is 95%;
Prebond dragging is permitted if any newly identified bond is more than 0.6 nm in length, and the dragging happens in increments of 0.08 nm and each increment involves an energy minimization, an NVT MD simulation, and an NPT MD simulation, all at 600 K. (I find curing at elevated temperature keeps the system from jamming up, but don’t feel forced to use this temperature.) Bond relaxation takes place using a similar series of MD stages. Remember that dragging is performed on the system before bonds are formed and atoms deleted, while bond relaxation occurs after the bonds are formed and the sacrificial, valence-conserving H atoms are deleted. Finally, when all new bonds are relaxed, a single NPT MD simulation is performed to end an iteration. Postcure involves an annealing simulation much like the precure stage, followed by an NPT MD simulation.
Finally, we stipulate the reactions. In this system, there is really only one reaction: the one in which the C1 of one styrene bonds to the C2 of another. The reaction named sty1_1 specifies this reaction, and causes HTPolyNet to parameterize the dimeric product named STY~C1-C2~STY. This molecule provides a template for atom types, charges, and new bonded interactions that must be merged into a system if such a bond forms. The other reaction, styCC, specifies a cap reaction that reverts any unreacted styrene back to its proper form (with the C-C double bond). Capping reactions are 100% optional; don’t feel forced to use them.