Interaction between molecular poses - ProkopHapala/FireCore GitHub Wiki

Introduction

Multiple Global Optimization algorithms for molecules, as well as other algorithms for exploration of configuration space require to (1) calculate distance $\Delta(A,B)$ between two molecular configurations $A$, $B$ in configuration-space and (2) calculate generalized spring-forces $F(\Delta(A,B))=K(\Delta(A,B)-L)$ between these two configurations which keep them at certain fixed distance $L$ (not to close or not too far). Examples of these algorithms are:

  1. Minima Hopping and Basin Hopping - Distance between two configurations is needed to estimate if we found new configuration or the same
  2. Dimer method - Dimer methods finds softest degree of freedom (e.g. softest vibration mode) by relaxing two molecular configurations which are kept at fixed distance in configuration space.
  3. Nudged Elastic Band(NEB) - Chain of configurations is relaxed. We keep them at certain distance by springs. We also calculate force component tangential ($f^\parallel$) and perpendicur ($f^\perp$) to the chain.
  4. Metadynamics - We run molecular dynamics and remember history of previous states in configuration-space (${\vec R}_i$) the molecular trajectory. We add artificial energy term $E_i( {\vec R}_i )$ of Gaussian shape to the potential energy surface (PES) at positions of previous samples. During the molecular dynamics simulation motion of molecule is affected by forces (i.e. derivatives of PES) from all these additional Gaussians from history which push the molecule out from the area which it already explored.

$$V_{Hist}({\vec R})= \sum_i{V_i( \Delta({\vec R},{\vec R_i}) )}= \sum_i{\exp(- \frac{ \Delta^2({\vec R},{\vec R_i})}{\sigma_i^2})} $$

Choice of coordinates and metric $\Delta(,)$

The distance between molecular configuration is rather ambiguous term which strongly depend on our goals and point-of-interest. For example, we may study molecule of DNA. We are interested if it forms single-strand, double helix, A-,B- or Z- form, how are the basepairs aligned and oriented with respect to each other. We may be not interested in which exact orientation are the -OH groups on the ribose, or in which exact vibration state are the hydrogen atoms in C-H groups, since these degrees of freedom will average-out every few picoseconds.

The simplest definition of distance between two configurations $A$, $B$ is simply euclidean distance calculated form all cartesian atomic positions ${\vec r_i}$.

$$ \Delta_e(A,B) = \sqrt{ \sum_i{ | {\vec r}^A_{i} - {\vec r}^B_{i} |^2 }} $$

However this distance is problematic for many purposes.

  1. It is not invariant with respect to rigid translation and rotation of the whole molecule,
    • This is problem for molecule in vacuum or in solvent because rotated and translated molecule is physically still the same object.
    • However, if symmetry of space is broken, that this is actually useful feature. For example if we have molecule in external electric field or other force-field, such as molecule deposited on surface of solid material or molecules packed in crystal, energy and other physical properties of the molecule depend on its rotation and translation.
  2. It is not invariant with respect to permutation of equvalent atoms.
    • This is generally a problem if we study chemical reactions at high temperature where chemical bonds can be broken and re-established. For example in melted iron. In that case we can find $N!$ equivalent local energy minima (where $N$ is number of iron atoms), which differ just by permutation of atomic indexed $i$ and our cartesian metric would not know they are actually the same. Already for 20 atoms we will have to explore 2.432902e+18 configurations. We will quickly run out of computer memory and time.
    • This is actually not such a big problem for low temperature chemistry when we assume than bonding-topology of the molecule (molecular graph) is preserved in all processes. Especially if we use classical force-field with fixed bonding topology. This means that we do not have to worry that e.g. in the molecule of adenine nitrogen atom N1 in the -NH2 group swap its place with nitrogen atom N2 in the pyridine group.
      • However, even it this case permutation symmetry may be problem if we have multiple equivalent groups. Trivial example may be swap of two hydrogen atoms in the -NH2 group, which can rather easily rotate.
  3. Too sensitive to irrelevant atomic details. As we described on example of DNA, cartesian distance of atoms may be large for two molecular configurations just because all atoms are a little displaced from their position due to vibration or due to sigma-bond rotation of -OH,-NH2,-CH3 or other aliphatic groups. But we may be not interested in these degrees of freedom, and consider DNA to be in the same configuration whenever basepairs are in the same relative orientation (position & rotation) with respect to each other.
  4. For large systems with many configurations it may be relatively costly to store in memory positions of all atoms, and relatively costly to evaluate distance between all atoms. This is problem only when using fast forcefields, where we can easily explore thousands or even milions of configurations per second (using paralelized GPU hardware).

Distance between Group poses $\Delta_G(,)$

For our purposes will be more useful molecular group configuration metric $\Delta_G(,)$ defined as sum of distances between poses (i.e. position and rotation) for selected chemical groups $g$ (i.e. sub-selection of atoms). Mathematically, for two molecular configurations $A,B$ we define metric $\Delta_G(A,B)$ as follows:

$$ \Delta_G(A,B) = \sqrt{ \sum_g{ \Delta^2(a_g,b_g) }} $$

where $a_g$ is pose of group $g$ in configuration $A$, and $b_g$ is pose of group $g$ in configuration $B$. We further define metric for two poses of group as follows:

$$ \Delta_g(a_g,b_g) = \sqrt{ K_c^g \Delta^2({\vec c}^A_g,{\vec c}^B_g) + \Delta^2({\vec O}^A_g,{\vec O}^B_g) } $$

Where ${\vec c}^A_g$ is centroid (e.g. center of mass) of group $g$ in configuration $A$, and ${\vec O}^A_g$ is orientation (rotation) of group $g$ in configuration $A$. We also introduce weighting constants $K_c^g$, $K_u^g$, $K_v^g$ using which we tune how important is position and orientation of each group for the total metric. We call these weighting constant $K$ to denote that in fact they define interaction stiffness of these degrees of freedom.

For centroids we calculate simple euclidean distance $\Delta_e(,)$ between cartesian coordinates ${\vec c}^A_g = (x_A,y_A,z_A)$:

$$ \Delta_e({\vec c}^A_g,{\vec c}^B_g) = \sqrt{ (x_A-x_B)^2 + (y_A-y_B)^2 + (z_A-z_B)^2 } $$

For rotations our definition of distance depend on choice of representation of the rotation. There are multiple ways how to represent rotation of objects in 3D space.

  1. Euler angles: The most intutitive (naieve?), but computationally very inconvenient is representation by [Euler angles]. This representation have several bad properties, most importaint of which are (1) the transformation between Euler angles and cartesian coordinates of atoms is computationally costly (require evaluation of goniometric functions sin(),cos(), and its inverse arcsin(),arccos(),atan2()) (2) The representation is discontinuous (e.g. $\phi$ jumps to 0 after you pass $2\pi$, $\theta$ jumps to 0 after you pass $\pi$) which cause that two configuration may look oriented very differently in angular representation, while in reality they are oriented very similarly. (3) There are poles where angular coordinates become degenerate and Jacobian of transformation diverges, which is connected to well known problem of gimbal lock. Handling problems (2),(3) in computer code requires hadling many special conditions, and generally complicated and slow code. => Therefore, we avoid using Euler angles.

  2. Quaternions: Perhaps most efficient and elegant representation of 3D rotation are unitary quaternions also called versors. We will probably consider them later, but discussion of non-intutitive quaternion algebra now can obscure the essence of this article. Transformation from cartesian coordinates and back is rather simple and computationally efficient, but it involves intermediate step of using 3x3 rotation matrix.

  3. Orthonormal 3D vectors: Therefore for simplicity of discussion, coding and numerical efficiency we will first represent rotation using symmetric orthonormal 3x3 matrices $({\vec u},{\vec v},{\vec w})$, despite this representation is rather redudant (9 numbers to describe 3 degrees of freedom). Immediately we can reduce this redudancy by considering only two normalized direction vectors ${\vec u},{\vec v}$ while the 3rd vector is evaluated implicitly as ${\vec w}={\vec u}\times{\vec v}$.

Now we can express the distance between two orientations ${\vec O}^A$ and ${\vec O}^B$ simply as distance between direction of normalized vectors.

$$ \Delta^2_O({\vec O}^g_A,{\vec O}^g_B) = K_u^g \Delta^2({\vec u}^g_A,{\vec u}^g_B) + K_v^g \Delta^2({\vec v}^g_A,{\vec v}^g_B) $$

However, to measure distance between direction vectors it is preferable to use cosine distnace rather than euclidean distance. It is both numerically faster and also it better represents problem at hand. Considering we work with normalized vectors ( ${\vec u}={\vec v}=1$ ) cosine distance reduce to dot product:

$$\Delta^2_{cos}({\vec u}_A,{\vec u}B) = 1 - u_x^A u_x^B + u_y^A u_y^B + u_z^A u_z^B = 1 - cos( \theta{uv} )$$

NOTE: Notice that for small angles between vectors $\vec u_A$ and $\vec u_B$ the metric $\Delta^2_{cos}( \vec u_A, \vec u_B )$ approximate angular distance $\theta_{ab}^2$ , this is because $1 - cos(\theta) \approx \theta^2$ for small angles.

Calculation of group poses

As described in previous section each group $g$ is represented by 3 vectors $\vec c_g$, $\vec u_g$, $\vec v_g$, where $\vec c_g$ is position of centroid, $\vec u_g$ and $\vec v_g$ are normalized direction vector. For optimal performance $\vec u_g$ and $\vec v_g$ should be roughly orthogonal to each other (but not necessarily exactly). For group composed of $n$ atoms with cartesian coordinates $r_i$ ( $i \in [1..n]$ ) we can evaluate all these vectors rather easily by weighted sum of atomic positions.

$$ {\vec c}_g = \sum_i{ c_i {\vec r}_i } $$

$$ {\vec u}_g = \sum_i{ u_i ({\vec r}_i - {\vec c}_g) } $$

$$ {\vec v}_g = \sum_i{ v_i ({\vec r}_i - {\vec c}_g) } $$

The only question is, what are the coefficients $c_i$, $u_i$, $v_i$ and how we can obtain them? Generally we pre-compute these weights from some reference pose of the group.

weighting coeficients for centroid (Group positon)

Evaluation of centroid ${\vec c}_g$ is very straightforward. Simplest option is just to average over all atomic coordinates by choosing same weight cooeficient for all atoms $c_i = 1/n$. If we choose $c_i = m_i /\sum_i{m_i}$ than ${\vec c}_g$ will be center of mass which focus preferentially on heavy atoms. But we do not necessarily care about heavy atoms, we may choose weights arbitrarily to filter those atoms which we consider most important (e.g. atoms involved in hydrogen bonds on DNA bases, or anchoring point of the group) and ascribe them higher weight. We can also set weight of some atoms to zero ($c_i=0$) to ignore them, and to spare some computing power if the group is large. We only need to full the condition that the weighted sum is normalized $\sum_i c_i = 1$ which we can easily do by renormalization $c_i = {\tilde c}_i /\sum_i{{\tilde c}_i}$.

weighting coeficients for direction vecors (Group rotation)

For orientation vectors ${\vec u}_g$ this is just slightly more complicated. For simplicity consider that in the reference configuration we rotate and translate the molecule (resp. the group) so that the direction we care-about are alined with x,y axis (e.g. $\vec u=\hat x$,$\vec v=\hat y$), and the centroid $\vec c_i$ is in the origin $\vec c_i=(x=0,y=0)$. In that case we can simply take the we simply take the $u_i=x_i$ and $v_i=y_i$. Therefore

$${\tilde u} = \sum{ x_i {\vec r}_i }$$

which is similar to calculation of dipole moment along x-direction (or Mathematically speaking we calculate the first moment of the atomic positions along $x$ direction). Similarly to centroid, we may want to pay more attention to some atoms, so we can introduce additional weighting $w_i$, which may or may not be same as in case of centroid ($w_i=c_i$). In that case $\tilde u_i = w_i x_i$ and

$${\tilde u} = \sum{ w_i x_i {\vec r}_i }$$

Finally, it would be convenient (but not necessary) if the direction vectors are already roughly normalized (so that we do not have to re-normalize them), at least if the group is only rigidly transformed (not deformed). We can easily guarantee this by renormalizing the coefficient before we save them.

$$ u_i = \frac{\tilde u_i}{|\tilde u|} = \frac{w_i x_i}{|\sum{ w_i x_i {\vec r}_i }|}$$

NOTE: This calculation of group orientation works well not only for rigid groups, but also for slightly flexible groups (such as DNA nucleobases deformed in classical forcefield). For centroid ${\vec c}_g$ it works always without problems. For orientation vectors $u_i$ the most common problem is that these vectors become non-unitary (not normalized) and non-orthogonal to each other, despite they are in the reference configuration. This can be easily corrected by re-noralization and re-orthogonalization as long as deformation of the group is not large. For small deformations we can even used efficinet approximative renormlization and re-orthogonalization using taylor expansion to avoid computational cost of sqrt() function.

Forces

For arbitrary configuration metric $\Delta_M$ and two configurations $A$, $B$ we can calculate fictitious spring force acting on atoms of both configurations which keeps the configuration in certain distance $L$ by stiffness constant $K$ as derivative of fictitious spring energy $E=K(\Delta_M(A,B) - L)^2/2$ by derivative according to all coordinates:

$$ {\vec F}({\vec r}^A_1,.. {\vec r}^A_n,{\vec r}^B_1,.. {\vec r}^B_n) = K (\Delta_M(A,B) -L) \frac{ \partial \Delta_M(A,B) }{\partial({\vec r}^A_1,.. {\vec r}^A_n,{\vec r}^B_1,.. {\vec r}^B_n) } $$

Now we find inter-configuration interaction forces ${\vec f}^{AB}_i$ acting on individual atom $i$ using our group-pose metric $\Delta_M(A,B)$ by subtituting all the previous definitions and using chain-derivative-rule.

$$ \Delta_G(A,B) = \sqrt{ \sum_g{ \Delta^2(a_g,b_g) }} $$

$$ {\vec f}^A_i({\vec r}^A_i) = K (\Delta_G(A,B) -L) \frac{ \partial \sqrt{ \sum_g{ \Delta^2(a_g,b_g) }}}{\partial {\vec r}^A_i} = K \frac{\Delta_G(A,B) -L}{\Delta_G(A,B)} \frac{ \partial \Delta^2_G(A,B) }{\partial {\vec r}^A_i} $$

by subtituting all the previous definitions of Group pose distance $\Delta_G(,)$ we can now easily evaluate this by chain-derivative-rule. We assume that each atom $i$ belong to max one group $g$ (i.e. the groups does not overlap). This makes evaluation of the forces more efficient especially on parallel hardware. Than we can write:

$$\frac{ \partial \Delta^2_G(A,B) }{\partial {\vec r}^A_i} = \frac{ \partial \sum_g{ \Delta^2(a_g,b_g) } }{\partial {\vec r}^A_i} = K_c^g \frac{ \partial \Delta^2_e ({\vec c}_A, {\vec c}B) }{\partial {\vec r}^A_i} + K_u^g \frac{ \partial \Delta^2{cos}({\vec u}_A, {\vec u}B) }{\partial {\vec r}^A_i} + K_v^g \frac{ \partial \Delta^2{cos}({\vec v}_A, {\vec v}_B) }{\partial {\vec r}^A_i} $$

derivative of centroid distances position is simple:

$$\frac{ \partial \Delta^2_e({\vec c}_A,{\vec c}_B) }{\partial {\vec r}^A_i} = c_i ({\vec r}^A_i- {\vec r}^B_i) $$

derivative of direction distances is slightly more involved:

$$ \frac{ \partial \Delta^2_{cos}({\vec u}_A,{\vec u}_B) }{\partial {\vec r}^A_i} = \frac{ \partial \sum_i \sum_j{u_i u_j (x_i^A x_j^B + y_i^A y_j^B + z_i^A z_j^B) } }{\partial {\vec r}^A_i} = u_i \sum_j{ u_j {\vec r}^B_j } = u_i {\vec u}_B $$

But we have to conserve unitary norm of vector $\vec u_A$ therefore we need to take only component of $\vec u_B$ orthogonal to $\vec u_A$, therefore

$$ \frac{ \partial \Delta^2_{cos}({\vec u}_A,{\vec u}_B) }{\partial {\vec r}^A_i} = u_i ( {\vec u}_B - \langle{\vec u}_B|{\vec u}_A \rangle {\vec u}_A ) $$

So final expression for fictitious force on atom $i$ of configuration $A$ is:

$$ {\vec f}_i^A = K \frac{\Delta_G(A,B) -L}{\Delta_G(A,B)} [ {K_c^g c_i ( {\vec r}^A_i - {\vec r}^B_i ) + K_u^g u_i ( {\vec u}_B - \langle{\vec u}_B|{\vec u}_A \rangle {\vec u}_A ) } + K_v^g v_i ( {\vec v}_B - \langle{\vec v}_B|{\vec v}_A \rangle {\vec v}_A ) ] $$

Implementation

GPU Implementation

Relevant Codes