Tutorial04 - ccdc-opensource/dash GitHub Wiki

Tutorial 4: Handling a Structure in Which There is a Space Group Ambiguity

Introduction

The object of this tutorial is to guide you through the structure solution of decafluoroquaterphenyl (DFQP). It assumes that you have completed the previous tutorials. In this tutorial, you will learn how to:

  • Handle a structure solution in which there is a space group ambiguity.

  • Deal with a structure that has a potential centre of symmetry.

  • Deal with a more prominent background than you have encountered so far.

  • Handle a difficult Pawley fitting problem.

Data

The data set Tutorial_4.xye is a laboratory X-ray diffraction data set collected by Dr. Lubo Smrcok. The incident wavelength was 1.789 Å.

Stage 1: Reading the Data

  • Open DASH and select the directory where the data resides.

  • Select View data / determine peak positions and click Next >.

  • Select the file Tutorial_4.xye using the Browse... button.

  • Click Next >.

  • Check that the wavelength and radiation source have been set correctly and click Next >.

Stage 2: Examining the Data and Removing the Background

The data spans 4 to 61° 2θ. Remember that this data has been collected at a relatively long wavelength and so the real-space resolution of the data is only ~1.8 Å. Truncate the data to 2.0 Å resolution.
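The quoted resolution follows from Bragg's law, d = λ / (2 sin θ). As a minimal sketch (the function names are illustrative and are not part of DASH), the following converts between 2θ and real-space resolution for the wavelength used here:

```python
import math

WAVELENGTH = 1.789  # incident wavelength in Å, as given in the Data section


def d_spacing(two_theta_deg: float, wavelength: float = WAVELENGTH) -> float:
    """Real-space resolution d (Å) from Bragg's law, d = λ / (2 sin θ)."""
    theta = math.radians(two_theta_deg / 2.0)
    return wavelength / (2.0 * math.sin(theta))


def two_theta(d: float, wavelength: float = WAVELENGTH) -> float:
    """2θ (degrees) at which a given resolution d (Å) is reached."""
    return 2.0 * math.degrees(math.asin(wavelength / (2.0 * d)))


if __name__ == "__main__":
    print(f"Resolution at 61 deg 2-theta: {d_spacing(61.0):.3f} Å")
    print(f"2-theta for 2.0 Å resolution: {two_theta(2.0):.2f} deg")
```

On this basis the upper limit of 61° 2θ corresponds to roughly 1.76 Å, and a 2.0 Å cut-off falls near 53° 2θ.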

Having examined the data, we really want to strip out the background. This is because the data, whilst good, are nowhere near as good as the synchrotron data sets that you examined in Tutorials 1 and 2. If we defer background modelling until the Pawley fit stage, we have an additional set of parameters to worry about at that stage. With the weaker data at higher angles, there is always a chance that correlations between the weak peaks and the background parameters may cause instabilities in the fit. These can be avoided by removing the background at this stage.

Select Preview using the default value of 100 for the filter window size. Examine the background fit carefully at all points in the pattern, but especially at the low and high angle regions. The fit is excellent, with only a marginal underestimation at low angle. Try altering the window size to 50 and select Preview. Note the better fit at low angle, and also the increased flexibility in the background shape that decreasing the window size has brought about. Although you could proceed with either value (as both give an excellent fit), return to the less structured background by changing the window size back to 100 and click Apply to strip off the background. The background-subtracted pattern is displayed. Examine it closely before proceeding to the next stage.

Stage 3. Fitting the Peaks to Determine the Exact Peak Positions

Select the first 22 peaks using the method described in Tutorial 1.

Here is a guide to the positions (° 2θ) of the first 22 peaks:

8.7661 17.3011 18.7498 20.7870 21.3166
21.5902 23.6311 24.4517 24.9854 25.4535
27.1252 27.7534 28.1742 29.1876 30.7017
33.5830 33.7804 34.1816 34.7082 34.9715
35.2415 35.4262

  • Click Next >.

  • Select Run> to run DICVOL or use another indexing program as described in Tutorial 1.

Stage 4. Indexing

Your indexing program may reveal a number of possible unit cells. The unit cell with the highest figures of merit should be monoclinic with volume ~1794 Å³. The DICVOL program returns a monoclinic cell with a = 24.04678 Å, b = 6.15668 Å, c = 12.42973 Å and β = 102.753°, volume = 1794.80 Å³, with figures of merit M(22) = 16.1 and F(22) = 29.3. Whilst these are not fantastic figures of merit, we can note that there are nearly 100 calculated peaks for this cell, as against the 22 that were input. This might indicate that there are a lot of systematic absences, or it may indicate that the cell is wrong.
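As a quick consistency check on the indexing output, the volume of a monoclinic cell is V = a b c sin β. The sketch below (the helper name is illustrative, not a DASH or DICVOL function) recomputes the volume from the cell constants quoted above:

```python
import math


def monoclinic_volume(a: float, b: float, c: float, beta_deg: float) -> float:
    """Volume (Å^3) of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


if __name__ == "__main__":
    # Cell constants quoted above for the DICVOL solution.
    volume = monoclinic_volume(24.04678, 6.15668, 12.42973, 102.753)
    print(f"V = {volume:.2f} Å^3")
```

The result should agree with the volume reported by DICVOL to within rounding.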

Stage 5. Stop and Think

Does the cell make sense? In this case, given that the molecule may well adopt a planar configuration, it is difficult to estimate the likely molecular volume. Assuming 4 molecules per cell and dividing 1800 Å³ by 4, we get 450 Å³ per molecule. This is close to a rough estimate for the molecule's backbone of 24 carbon atoms (15 Å³ each) + 10 fluorine atoms (10 Å³ each) = 460 Å³. So the cell is worth checking.
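The arithmetic behind this check can be written out as a short sketch. The per-atom volumes are the rough values used in the text (hydrogen atoms are ignored, as they are there), and the names are illustrative:

```python
# Rough per-atom volumes in Å^3, taken from the estimate in the text.
ATOM_VOLUMES = {"C": 15.0, "F": 10.0}
# Non-hydrogen atom counts for decafluoroquaterphenyl, as used in the text.
ATOM_COUNTS = {"C": 24, "F": 10}


def molecular_volume(counts: dict, volumes: dict) -> float:
    """Sum of (number of atoms) x (volume per atom) over all elements."""
    return sum(n * volumes[element] for element, n in counts.items())


def volume_per_molecule(cell_volume: float, z: int) -> float:
    """Cell volume available to each of the z molecules in the cell."""
    return cell_volume / z


if __name__ == "__main__":
    print("Estimated molecular volume:", molecular_volume(ATOM_COUNTS, ATOM_VOLUMES))
    print("Available volume for Z = 4:", volume_per_molecule(1800.0, 4))
```

Repeating the second call with other values of Z shows quickly which numbers of molecules per cell are plausible.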

Stage 6. Checking the Cell and Determining the Space Group

It is clear that there are a great many excess tick marks, indicating probable systematic absences. This means that the space group must be of substantially higher symmetry than P2. Zoom into the 10 - 16° region of the pattern and watch the correspondence between the tick marks and the observed reflections as you scroll through some of the possible space groups. You will see that many of the space groups can be ruled out immediately; for example, the primitive cells predict many peaks that are not observed. A centred cell is therefore likely, and so C2/c is a likely choice (see Appendix D: Definitions of DASH Figures of Merit). Select this group and examine the pattern closely. Things look good at low angle but the peak at ~24.5° is misplaced.

Altering the setting to I2/a results in excellent agreement throughout the pattern, so this appears to be the best choice. Note, however, that Ia has the same systematic absences as I2/a and therefore gives exactly the same level of agreement. Using the table in Appendix D of the DASH User Guide, the centrosymmetric space group I2/a (C2/c) is about 7 times more common than the non-centrosymmetric space group Ia (Cc). As the molecule possesses a molecular centre of symmetry in the middle of the bond between the two central rings, I2/a is certainly the more likely choice.

Stage 7. Extracting Intensities

Pawley fitting this pattern in either I2/a or Ia will give identical results (the absences are the same) and so we will fit in Ia. First delete the last group of 3 peaks, in the region of 35°, as they are highly overlapped (sweep this range and press the Delete key). The program now detects that it has peaks available for unit cell refinement and so the Pawley Refinement Status window appears automatically, as the peak widths for all the indexing peaks that you fitted earlier are still available to DASH.

Select Refine. The initial 3 cycles of least squares refinement only involve the two terms corresponding to the linear background and to the individual reflection intensities; accept these three cycles. Using the cell constants listed in Stage 4, the Pawley χ² is about 2.7. Proceed and refine the unit cell and zero-point along with the background and the intensities. You may get a warning from DASH stating that errors have been detected due to instabilities in the Pawley fit. If so, reject the fit and increase the Overlap criterion to 2.0. Select Refine and you should get a stable refinement with a Pawley χ² of about 1.9.

Accept your best Pawley fit and save it as Tutorial_4.sdi.

Stage 8. Molecule Construction

Construct a 3D molecular description of the molecule using your favourite modelling software and save it in pdb, mol or mol2 format. Save this as Tutorial_4-full.pdb, Tutorial_4-full.mol or Tutorial_4-full.mol2. If you do not have a model builder to hand there are files provided with the tutorial: Tutorial_4-full.mol2 and Tutorial_4-half.mol2.

Stage 9. Setting up the Structure Solution Run

  • Continue on from the Pawley fitting stage by pressing Solve >.

  • Click on the icon and select Tutorial_4-full.mol2 (the file that you created in Stage 8); a Z-matrix file called Tutorial_4-full_1.zmatrix will be generated automatically.

  • Read in the Tutorial_4-full_1.zmatrix file, which has three moveable torsion angles.

At this point DASH will confirm that there are 9 independent parameters. These parameters are listed when you click Next >. There are 3 parameters describing the positional co-ordinates, 4 (3 of which are independent) describing the molecular orientation within the unit cell and 3 variable torsion angles. All F boxes are unticked by default, indicating that all 10 listed parameters are allowed to vary during structure solution. Click Next >, leave the parameters set at their default values, click Next > again, then Solve >; the simulated annealing process begins.
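The difference between the 10 listed parameters and the 9 independent ones can be illustrated with a short sketch. It assumes that the four orientation parameters are the components of a unit quaternion, so that the normalisation condition removes one degree of freedom; this is an assumption made for illustration, and the names below are not DASH identifiers:

```python
import math

# Parameter groups for the full-molecule Z-matrix, as listed in the text.
PARAMETERS = {"translation": 3, "orientation": 4, "torsion": 3}
# Assumption: the orientation is a unit quaternion, which imposes one constraint.
CONSTRAINTS = {"orientation": 1}


def independent_parameters(parameters: dict, constraints: dict) -> int:
    """Number of listed parameters minus the number of constraints on them."""
    return sum(parameters.values()) - sum(constraints.values())


def normalise(q: tuple) -> tuple:
    """Scale a 4-component quaternion to unit length (q0^2 + q1^2 + q2^2 + q3^2 = 1)."""
    norm = math.sqrt(sum(component * component for component in q))
    return tuple(component / norm for component in q)


if __name__ == "__main__":
    print("Listed parameters:     ", sum(PARAMETERS.values()))
    print("Independent parameters:", independent_parameters(PARAMETERS, CONSTRAINTS))
    print("Unit quaternion:       ", normalise((1.0, 2.0, 3.0, 4.0)))
```

Under this assumption, any three of the four orientation components fix the magnitude of the fourth, which is why only three of them count as independent.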

Stage 10. Monitoring Structure Solution Progress

The progress of the structure solution can be followed by monitoring the profile χ² and the difference plot.

The profile χ² should fall fairly quickly to below 20 and the fit to the data should look reasonable, with the residual misfits distributed throughout the pattern.

Stage 11. Examining the Output Structure

View the structure using the View button in the Results from Simulated Annealing window. The molecular conformation and the packing look reasonable. However, we have still only explored Ia.

Stage 12. Exploring the Possibility of I2/a

A: Space Group Ia

There is a quick and easy way to explore whether the true space group is Ia or I2/a whilst performing all SA runs in Ia. In the previous run, the centre of mass of the molecule was allowed to roam freely throughout the unit cell. If the space group truly is I2/a, then the centre should lie either on the origin or on the 2-fold axis. Accordingly we can:

  • Constrain the centre of mass to lie at 0,0,0.

  • Constrain the centre of mass to lie on 0.25,y,0.

and repeat the structure solution runs in Ia to see the fits that are obtained.
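The two trial positions are the points that the I2/a symmetry elements leave unmoved. As a sketch, assuming the usual symmetry operators for the I 1 2/a 1 setting (2-fold axis: x,y,z → 1/2-x, y, -z; inversion: x,y,z → -x,-y,-z), which are taken from standard space group tables rather than from this tutorial, the check can be written as:

```python
from fractions import Fraction as F


def two_fold(p):
    """Assumed 2-fold axis of I 1 2/a 1: (x, y, z) -> (1/2 - x, y, -z)."""
    x, y, z = p
    return (F(1, 2) - x, y, -z)


def inversion(p):
    """Centre of symmetry at the origin: (x, y, z) -> (-x, -y, -z)."""
    x, y, z = p
    return (-x, -y, -z)


def same_site(p, q):
    """True if two fractional positions differ only by whole-cell translations."""
    return all((a - b) % 1 == 0 for a, b in zip(p, q))


if __name__ == "__main__":
    y = F(37, 100)  # arbitrary y value, chosen only for illustration
    on_axis = (F(1, 4), y, F(0))
    origin = (F(0), F(0), F(0))
    general = (F(1, 10), y, F(1, 5))
    print("0.25,y,0 fixed by 2-fold: ", same_site(two_fold(on_axis), on_axis))
    print("0,0,0 fixed by inversion: ", same_site(inversion(origin), origin))
    print("general point fixed:      ", same_site(two_fold(general), general))
```

A molecule centred on either point can be mapped onto itself by the corresponding symmetry element, which is what allows half a molecule to describe the whole.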

How to constrain the molecule to lie on special positions

  • To constrain the centre of the molecule to lie on the origin of the cell, stop the current annealing run by pressing Stop, and return to the Parameter Bounds window.

  • Enter values of 0.0 for the initial values of x(frag1), y(frag1) and z(frag1) and then click the F check box for each of these variables in order to fix the x,y,z position of the molecule within the unit cell at the fractional co-ordinates 0,0,0. Note that by default, DASH uses the centre-of-mass of the molecule as the x,y,z reference point, and for the DFQP molecule, this corresponds to the midpoint of the central bond. You can now proceed with the simulated annealing run knowing that the centre-of-mass of the molecule will always be constrained to lie at 0,0,0.

  • Similarly, for the SA in which we wish to hold the centre-of-mass on the 2-fold axis, return to the Parameter Bounds window. Following the same procedure as just outlined, fix x(frag1) at 0.25, leave y(frag1) to vary and fix z(frag1) at 0.0.

Fixing the centre of mass at 0,0,0 causes the structure solution to stick at a very high profile χ², around 130. In contrast, after constraining the centre of mass of the molecule to lie on the 2-fold axis, the profile χ² falls rapidly to around 20 and local minimisation reduces this still further to around 18. The packing motif is identical to that obtained in Ia.

Note that a 2-fold rotation of the molecule about the b-axis does not give an exact mapping from one half of the molecule to the other because, in Ia, there is no constraint upon the torsion angles to produce this. However, it is so close to doing so that it is safe to conclude that the molecule crystallises in space group I2/a and that a Rietveld refinement with only half a molecule in the asymmetric unit will be successful.

B: Space Group I2/a

You can (if you want) re-fit the diffraction data using the same unit cell and selecting space group I2/a but there is in fact an easier way of running the SA in I2/a. As stated previously, Ia and I2/a have the same systematic absences and so Pawley fitting in either space group will give the same result. Accordingly, we can use the Pawley fit files already created and simply modify the Tutorial_4.sdi file to inform the SA that we now wish to solve in I2/a. Copy Tutorial_4.sdi to Tutorial_4-half.sdi and open the file in a text editor. Look for the following line:

SpaceGroup 52 9:b3 I 1 a 1

and change it to:

SpaceGroup 69 15:b3 I 1 2/a 1

and then save the file. You have now told the program that the SA must be performed in space group I2/a (consult Appendix D: Definitions of DASH Figures of Merit for an explanation of the format of the SpaceGroup line), whilst leaving the pointers to the existing Pawley fit files.
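The copy-and-edit step can also be scripted. The following is a minimal sketch that relies only on the two SpaceGroup lines shown above and treats the rest of the .sdi file as opaque text; the function name is illustrative and is not part of DASH:

```python
from pathlib import Path

OLD_LINE = "SpaceGroup 52 9:b3 I 1 a 1"
NEW_LINE = "SpaceGroup 69 15:b3 I 1 2/a 1"


def copy_with_space_group(src, dst, old=OLD_LINE, new=NEW_LINE):
    """Copy an .sdi file, replacing its space group line and leaving other lines unchanged."""
    src, dst = Path(src), Path(dst)
    lines = src.read_text().splitlines()
    # Compare whitespace-separated fields so that extra spacing does not matter.
    matches = [i for i, line in enumerate(lines) if line.split() == old.split()]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one matching space group line, found {len(matches)}")
    lines[matches[0]] = new
    dst.write_text("\n".join(lines) + "\n")
    return dst
```

Calling `copy_with_space_group("Tutorial_4.sdi", "Tutorial_4-half.sdi")` in the directory containing the Pawley fit files leaves the original file untouched and raises an error if the expected Ia line is not found, so a file that has already been edited is not silently rewritten.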

Next, as we are now in I2/a, we only require half a molecule to fill the asymmetric unit.

Construct a Tutorial_4-half.mol file based on the above diagram and read it into DASH. The resultant Tutorial_4-half_1.zmatrix file only has a single torsion angle - the torsion angle between the two ‘halves’ of the full molecule is automatically determined by the orientation of the molecule within the unit cell.

Using the Wizard, load the new Tutorial_4-half.sdi file and the Tutorial_4-half_1.zmatrix file. Do not fix any of the variable parameters (i.e. allow the molecule to roam the unit cell) and start an SA run. The profile χ² should fall rapidly to around 20. Viewing the molecule shows that the space group symmetry is indeed constructing the whole molecule (although the central bond is not displayed on-screen, the C-C distance can be measured to be about 1.6 Å).

There is therefore no doubt that this molecule crystallises in space group I2/a. Note that the molecule is sitting on a 2-fold rotation axis and not the centre of symmetry.

Stage 13. Conclusion

This tutorial has shown that there are several ways to solve the crystal structure of DFQP using global optimisation, all of them equally valid. Structure solutions of this complexity using DASH take so little time to execute that it is worth investigating the various possibilities in order to be certain that you have the correct answer.

The final fit to the data is not that great, but the chemical sense of the structure is such that there is no doubt that the structure is correct. The published Rietveld refined structure (Smrcok et al.) for this molecule confirms this. Accordingly, note that it is entirely possible to obtain a profile χ² that is a factor of 10 higher than the Pawley χ² and still have the correct structure.

Some remarks on Rietveld refinement are in order. The published structure reported the results of an unrestrained Rietveld refinement, which shows quite severe distortion of the benzene rings. This is a natural consequence of allowing too many variables to be optimised against rather limited data, especially laboratory data such as these, which are of lower accuracy than synchrotron data. A tradition has grown up of allowing unrestrained refinement of all atomic positions in order to prove that the crystal structure is correct. This certainly proves that the atoms all fit well with the low-resolution electron density, represented here by only 174 reflections extracted by the Pawley fit over the complete 2θ range of the data set, corresponding to 1.763 Å resolution. However, a more realistic model for the real crystal structure is obtained if one uses the DASH Rigid-group Rietveld refinement.

It has been seen in Stage 12 that the low-resolution data give an unreasonably long value for the central C-C bond, 1.60 Å, when we refine with a half-molecule in I2/a. A better model for the full crystal structure is to use the constrained full molecule placed with its centre of mass on the crystallographic 2-fold axis at (0.0, y, 0.25). If the DASH Rietveld refinement is applied to the data available to 1.763 Å resolution, in space group Ia, we obtain typical solutions with χ² of about 86 and profile χ² of 11.3. An example refinement of the global isotropic temperature factor scale, followed by refinement of translations (y only) and rotations, gave a value of 0.6717, with χ² of 85.87 and profile χ² of 11.30. A check of the shortest intermolecular contacts shows C...H 2.60 Å, F...F 2.56 Å and H41...F16 2.09 Å. This last value is rather shorter than expected, being 0.5 Å less than the sum of the van der Waals radii, but the other short-contact values can be seen in CSD single crystal structures.

References

DICVOL program:
D. Louer & M. Louer (1972) J. Appl. Crystallogr. 5, 271-275.
A. Boultif & D. Louer (1991) J. Appl. Crystallogr. 24, 987-993.

Crystal structure of decafluoroquaterphenyl:
L. Smrcok, B. Koppelhuber-Bitschnau, K. Shankland, W. I. F. David, D. Tunega and R. Resel (2001) Z. Kristallogr. 216, 63-66.
