===============================================
How to choose the right point group for merging
===============================================

A common question from our users is how to choose the correct symmetry for
merging, i.e. the correct ``-y`` option.  It's actually not that difficult, but
it does touch on several areas of crystallography theory. This document aims
to be a gentle introduction to the process, introducing the concepts step by
step. For a somewhat terse explanation, see section 6 of the following paper:

T. A. White, A. Barty, F. Stellato et al.
"Crystallographic data processing for free-electron laser sources"
Acta Cryst. D69 (2013), p1231–1240.
`doi:10.1107/S0907444913013620 <http://dx.doi.org/10.1107/S0907444913013620>`_

Another useful article is the following:

M. Nespolo, M. I. Aroyo and B. Souvignier
"Crystallographic shelves: space-group hierarchy explained"
J. Appl. Cryst. 51 (2018), p1481–1491.
`doi:10.1107/S1600576718012724 <https://doi.org/10.1107/S1600576718012724>`_

Step 1: Temporarily forget about space groups
=============================================

To merge your reflection data, CrystFEL needs to know which reflections are
symmetrically equivalent.  This information is given by the *point group*.
The 230 space groups can be classified into 32 categories, each corresponding
to a single point group.  If you know the *space* group for your crystals in
advance, that's a big advantage.  However, for now you only need to know the
*point* group.

If you're working on an unknown structure, don't get ahead of yourself!
Many crystallographic data processing programs start suggesting possible space
groups very early in the process, such as when the patterns are indexed.
The space group is **only a hypothesis until the structure is solved**, so you
always need to take these early suggestions with a pinch of salt.  CrystFEL's
design philosophy is not to deal with space group determination at all.
CrystFEL will never ask you to tell it the space group of your structure ahead
of time, nor will it suggest one automatically for your structure [#f1]_.

The following table shows the point group corresponding to each of the space
groups.  To keep things simple, the table only contains the `Sohncke space
groups <https://dictionary.iucr.org/Sohncke_groups>`_, which are the ones
relevant to biological structures.  The point groups are given in exactly the
form you will type them into CrystFEL:

===========    ============
Point group    Space groups
===========    ============
``1``          P1
``2``          P2, P2\ :sub:`1`, C2 (pay special attention to step 3 below)
``222``        P222, P222\ :sub:`1`, P2\ :sub:`1`\ 2\ :sub:`1`\ 2, P2\ :sub:`1`\ 2\ :sub:`1`\ 2\ :sub:`1`, C222\ :sub:`1`, C222, F222, I222, I2\ :sub:`1`\ 2\ :sub:`1`\ 2\ :sub:`1`
``4``          P4, P4\ :sub:`1`, P4\ :sub:`2`, P4\ :sub:`3`, I4, I4\ :sub:`1`
``422``        P422, P42\ :sub:`1`\ 2, P4\ :sub:`1`\ 22, P4\ :sub:`1`\ 2\ :sub:`1`\ 2, P4\ :sub:`2`\ 22, P4\ :sub:`2`\ 2\ :sub:`1`\ 2, P4\ :sub:`3`\ 22, P4\ :sub:`3`\ 2\ :sub:`1`\ 2, I422, I4\ :sub:`1`\ 22
``32_R``       R32 (rhombohedral axes, pay special attention to step 6)
``3_R``        R3 (rhombohedral axes, pay special attention to step 6)
``3_H``        H3 (hexagonal axes, pay special attention to step 6), P3, P3\ :sub:`1`, P3\ :sub:`2`
``321_H``      H32 (hexagonal axes, pay special attention to step 6), P321, P3\ :sub:`1`\ 21, P3\ :sub:`2`\ 21
``312_H``      P312, P3\ :sub:`1`\ 12, P3\ :sub:`2`\ 12
``6``          P6, P6\ :sub:`1`, P6\ :sub:`2`, P6\ :sub:`3`, P6\ :sub:`4`, P6\ :sub:`5`
``622``        P622, P6\ :sub:`1`\ 22, P6\ :sub:`2`\ 22, P6\ :sub:`3`\ 22, P6\ :sub:`4`\ 22, P6\ :sub:`5`\ 22
``23``         P23, F23, I23, P2\ :sub:`1`\ 3, I2\ :sub:`1`\ 3
``432``        P432, P4\ :sub:`2`\ 32, F432, F4\ :sub:`1`\ 32, I432, P4\ :sub:`3`\ 32, P4\ :sub:`1`\ 32, I4\ :sub:`1`\ 32
===========    ============

Notice that, in most cases, the correct point group can easily be recognised
from the space group, without memorizing the entire table.
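
The pattern behind the table can be sketched in a few lines of Python.  This
is only an illustration, not part of CrystFEL.  The function name and the
plain-text convention of writing screw-axis subscripts after an underscore
(``P2_12_12_1`` for P2\ :sub:`1`\ 2\ :sub:`1`\ 2\ :sub:`1`) are assumptions
made for this example.  It drops the centering letter and the screw-axis
subscripts, then adds the ``_R`` or ``_H`` suffix for the trigonal cases,
following the table above:

.. code-block:: python

  import re

  def point_group_from_space_group(symbol):
      """Map a Sohncke space group symbol to the point group from the table.

      Hypothetical helper: screw-axis subscripts are written as "_n",
      e.g. "P4_32_12" for P4(3)2(1)2.
      """
      centering, rest = symbol[0], symbol[1:]
      # Screw axes act like plain rotation axes as far as the point group goes
      rest = re.sub(r"_\d", "", rest)
      if rest.startswith("3"):
          # Trigonal: the axis choice must be made explicit
          if centering == "R":
              return rest + "_R"
          if rest == "32":
              rest = "321"      # H32 on hexagonal axes (see step 6)
          return rest + "_H"
      return rest

  for sg in ("P2_12_12_1", "I4_122", "P6_3", "R32", "H32", "P3_112"):
      print(sg, "->", point_group_from_space_group(sg))

The sketch only covers the space groups listed in the table, and it does not
know anything about unique axes, which are the subject of step 3.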

If you are in the fortunate situation of knowing the space group for your
sample before processing the data, look up the point group in the table above
and keep it in mind as you read the next sections.  If you can't find your
space group in the table (for example, *A112*), your source of information is
using a non-standard setting.  Everything should become clear in step 3.

If you don't know the space group, no problem: we will work everything out in
the steps below.


Step 2: Determine the apparent symmetry
=======================================

The orientation of each crystal in your dataset was determined by the indexing
procedure inside ``indexamajig``.  There's a choice of indexing algorithms
which work in many different ways, but they all share one thing in common: they
only look at the positions of the Bragg peaks, not the intensities.

As you should know from basic diffraction theory, the positions of the Bragg
peaks are determined by the translational symmetry of the structure (the
*lattice*), whereas the intensities are determined by the contents of the
unit cell.

This leads to a problem for some symmetry classes.  If the overall crystal
structure, taking into account the unit cell contents, has lower symmetry than
the lattice, there will be an *indexing ambiguity*.  In these cases, the Bragg
peak positions don't provide enough information to correctly determine the
orientation of the crystal.  The results will be an equal mixture of correctly
indexed patterns, and ones where the Miller indices for the reflections are
wrong.  But, we're getting ahead of ourselves...

Just by looking at the parameters of the lattice (the unit cell parameters), we
can determine the symmetry that the merged dataset will exhibit.  This is the
symmetry that the indexing algorithm is able to discern (by looking at the
Bragg positions only), and it therefore determines which reflections should be
considered symmetrically equivalent.  This is the point group that we will give
to ``process_hkl`` or ``partialator``.

The following table shows the possible cases and the point group to use in
each case.  Use the row furthest down the table that is compatible with your
data.  For example, if the axis lengths are all equal (*a=b=c*) and the angles
are all 90°, you should use ``432``, even though ``32_R``, ``422``, ``222``,
``2`` and ``1`` would all fit.

For this step, what matters are the *approximate* symmetries of the lattice.
You should consider an angle to be equal to 90° if it's within about 1° of that
value, and axis lengths to be equal if they're within about 1% of the same
length.  If ``indexamajig`` gets confused between the axes (shown by double
peaks in the ``cell_explorer`` histograms), then they should be considered
equal.  Conversely, if ``indexamajig`` was able to tell the axes apart (clear,
single peaks for each axis length, with significantly different lengths and
angles), then you can consider them distinct.

The centering of the cell (P, A, B, C, I, F, R or H) is irrelevant at this
step, unless you have "H-centering", which is a special case that we will come
to later.

=================================== =======================
Unit cell parameters                Point group for merging
=================================== =======================
No restrictions                     ``1``
alpha=beta=90°                      ``2``
H-centering and a=b and gamma=120°  ``321_H``
a=b and gamma=120°                  ``622``
a=b=c and alpha=beta=gamma          ``32_R``
All angles 90°                      ``222``
All angles 90° and a=b              ``422``
All angles 90° and a=b=c            ``432``
=================================== =======================

Perhaps your cell parameters resemble one of the cases, but with the axes
"re-named".  For example, you might have beta=gamma=90°, alpha≠90°, and all
axes different.  This is the same as point group *2* above, but with the axes
*a,b,c* re-labelled as *b,c,a*.  We can deal with that, as described in the
next step.
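
The table lookup in this step, including the approximate tolerances, can be
sketched as follows.  This is a hedged illustration rather than anything that
CrystFEL itself does: the function name and arguments are invented here, the
default tolerances are the "about 1%" and "about 1°" figures from the text, and
the hexagonal and rhombohedral tests additionally assume alpha=beta=90° and
alpha=beta=gamma respectively.  It counts right angles and equal axes
regardless of how the axes are labelled, so it returns the symbol *without* any
unique axis suffix:

.. code-block:: python

  def apparent_point_group(a, b, c, al, be, ga, centering="P",
                           len_tol=0.01, ang_tol=1.0):
      """Return the apparent point group for a set of cell parameters.

      Lengths in any consistent unit, angles in degrees.  Hypothetical
      helper; tolerances are the rough values suggested in the text.
      """
      def same_len(x, y):
          return abs(x - y) <= len_tol * max(x, y)

      def near(angle, target):
          return abs(angle - target) <= ang_tol

      n_right = sum(near(x, 90.0) for x in (al, be, ga))
      equal_pairs = sum(same_len(x, y)
                        for x, y in ((a, b), (b, c), (a, c)))

      # Work upwards from the bottom row of the table
      if n_right == 3:
          if equal_pairs == 3:
              return "432"
          if equal_pairs >= 1:
              return "422"
          return "222"
      if equal_pairs == 3 and near(al, be) and near(be, ga):
          return "32_R"
      if (same_len(a, b) and near(al, 90.0) and near(be, 90.0)
              and near(ga, 120.0)):
          return "321_H" if centering == "H" else "622"
      if n_right == 2:
          return "2"
      return "1"

  print(apparent_point_group(123, 123, 44, 90, 90, 90))
  print(apparent_point_group(34, 123, 44, 90, 131, 90))

The hexagonal test assumes unique axis *c*.  Any result should still be
checked against the ``cell_explorer`` histograms, because the tolerances are
only rules of thumb.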


Step 3: Make sure the "unique axis" is correct
==============================================

Let's say your point group is *2*.  In this case, there is a single twofold
axis of rotational symmetry.  The symmetry axis can be along the *a*, *b* or
*c* direction of the lattice - these letters are just the names we use to refer
to the axes.  In theory, you can define the unit cell any way you like, and
CrystFEL will be able to cope (with one exception, mentioned below).  However,
some possibilities are more "conventional" than others, and it can help to
avoid problems if you follow the established conventions.  For example, not all
software can handle all of the possibilities smoothly.  It's also easier to
compare structures when they're described in the same way.

You can tell the direction of the twofold rotation axis because it has to lie
along the axis that is perpendicular to the other two, i.e. the axis that is
*not* one of the pair enclosing the non-90° angle.  For example, the following
cell parameters show that the twofold rotation axis is along *b*, because beta
is the angle between *a* and *c*.  We refer to *b* as the *unique axis*:

a=34 Å, b=123 Å, c=44 Å, alpha=gamma=90°, beta=131°

The following cell has *unique axis a*:

a=92 Å, b=74 Å, c=34 Å, alpha=128°, beta=gamma=90°

However, *a* as the unique axis is a very unconventional situation.  You would
make things easier for yourself by changing your target unit cell to make *b*
or *c* the unique axis, and re-running ``indexamajig`` [#f2]_.

**If you've been told that the space group is simply "P2", check carefully to
make sure which convention is being used, because unique axis b or c are
considered equally acceptable.**

If your non-90° angle is very close to 90°, then you should instead be using
point group *222*.  As mentioned above, what matters are the *approximate*
symmetries that can be discerned by the indexing algorithm.

Other types of unit cell have a 'unique' axis, as well.  For example, a
tetragonal cell has all angles 90°, two axes with the same length and one
different.  The different length axis could be labelled as *a*, *b* or *c*.
However, in this case, anything other than unique axis *c* is highly
unconventional.  Nevertheless, check carefully here as well.

When you tell ``process_hkl`` or ``partialator`` the symmetry, you'll need to
tell it the unique axis.  By default, CrystFEL programs assume that the unique
axis is *c*.  If you have anything else, append ``_uaa``, ``_uab`` or ``_uac``
to the point group symbol (from the tables above) to indicate which is the
'unique' axis.  For the first example from above, we would use ``2_uab``:

a=34 Å, b=123 Å, c=44 Å, alpha=gamma=90°, beta=131°

For the tetragonal unit cell parameters shown below, we would use ``422``,
which is a synonym for ``422_uac`` since the unique axis is assumed to be *c*:

a=123 Å, b=123 Å, c=44 Å, alpha=beta=gamma=90°
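
The rule for choosing the suffix can be written down as a short sketch.  The
function below is hypothetical (it is not a CrystFEL API) and only covers the
two cases discussed in this step: for point group ``2`` it picks the axis
opposite the non-90° angle, and for ``422`` it picks the axis whose length
differs from the other two.  The tolerances are the rough values from step 2:

.. code-block:: python

  def unique_axis_symbol(point_group, a, b, c, al, be, ga,
                         len_tol=0.01, ang_tol=1.0):
      """Append _uaa, _uab or _uac to point group "2" or "422"."""
      def same_len(x, y):
          return abs(x - y) <= len_tol * max(x, y)

      if point_group == "2":
          # alpha lies between b and c, so a non-90 alpha means unique axis a
          odd = [ax for ax, ang in zip("abc", (al, be, ga))
                 if abs(ang - 90.0) > ang_tol]
      elif point_group == "422":
          lengths = {"a": a, "b": b, "c": c}
          odd = [ax for ax in "abc"
                 if all(not same_len(lengths[ax], lengths[other])
                        for other in "abc" if other != ax)]
      else:
          raise ValueError("only '2' and '422' are handled in this sketch")

      if len(odd) != 1:
          raise ValueError("could not identify a single unique axis")
      return f"{point_group}_ua{odd[0]}"

  print(unique_axis_symbol("2", 34, 123, 44, 90, 131, 90))
  print(unique_axis_symbol("422", 123, 123, 44, 90, 90, 90))

The sketch always writes the suffix explicitly, so it returns ``422_uac`` where
you could equally well type ``422``.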


Step 4: Add an inversion center to merge Friedel pairs
======================================================

Remember that the point group tells CrystFEL which reflections to consider
as symmetrically equivalent.  The point group you have, at this point, will
*not* include an inversion center, i.e. it will *not* consider reflections
h,k,l and -h,-k,-l as equivalent.  This means that the merging process will
preserve any anomalous signal present in your data.

If you don't expect (or want) an anomalous signal, you can get better results
by merging Friedel pairs of reflections.  This doubles the number of
measurements per symmetrically unique reflection, which can make a large
improvement!  To do this, simply add the missing inversion center to the point
group.  This will change the point group symbol in a way that's not immediately
logical.  The following table shows the results of adding an inversion symmetry
to each of the point groups, so you just have to look up your case.

===========    =================================
Point group    Point group with inversion center
===========    =================================
``1``          ``-1``
``2``          ``2/m``
``222``        ``mmm``
``422``        ``4/mmm``
``32_R``       ``-3m_R``
``321_H``      ``-3m1_H``
``622``        ``6/mmm``
``432``        ``m-3m``
===========    =================================

The point group symbols in the table above look quite strange.  If you need to
look up one of these symbols in a crystallographic textbook, you just need to
know that the minus signs are supposed to indicate a "bar" over the following
digit.  However, there's usually no need to worry about that.

If you've added a unique axis suffix, add the same suffix to your new point
group.  For example, ``622_uab`` goes to ``6/mmm_uab`` (although, either of
these cases would be considered very unconventional).
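
If you prefer to script the lookup, the table and the suffix rule fit in a few
lines.  This is a minimal sketch with invented names, containing only the
entries from the table above:

.. code-block:: python

  # Point group -> point group with inversion center (from the table above)
  CENTROSYMMETRIC = {
      "1": "-1",
      "2": "2/m",
      "222": "mmm",
      "422": "4/mmm",
      "32_R": "-3m_R",
      "321_H": "-3m1_H",
      "622": "6/mmm",
      "432": "m-3m",
  }

  def add_inversion(symbol):
      """Add an inversion center, keeping any unique axis suffix."""
      base, sep, axis = symbol.partition("_ua")
      return CENTROSYMMETRIC[base] + sep + axis

  print(add_inversion("622_uab"))
  print(add_inversion("32_R"))

A symbol that is not in the table raises ``KeyError``, which is deliberate: the
sketch should not guess.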


Step 5: Worry about indexing ambiguities
========================================

At this point, you're in a position to merge your data.  If your prior
information about the point group from step 1 agreed with what you determined
in step 2, then everything is OK and you're finished already!  Simply give the
point group symbol to ``partialator`` or ``process_hkl`` with the ``-y``
argument (or via the CrystFEL GUI).  For example: ``-y 4/mmm``.

However, maybe something is still not right.  Perhaps the structure solution
software is complaining about "twinned data", strange statistics or "poor"
L-test results.  Or, perhaps your prior information about the structure doesn't
match the point group you determined in the previous steps.  In this case, you
may be facing an indexing ambiguity, where the true symmetry is lower than what
can be distinguished by the indexing algorithm.

An *indexing ambiguity* is when the positions of the Bragg peaks do *not* give
sufficient information to uniquely identify the orientation of the crystal.
Instead, there are a small number (usually 2) of possible orientations which
give the *same Bragg peak positions*.  The correct orientation can be
determined by looking at the peak intensities, so it requires a separate
processing step after indexing and integration.

Indexing ambiguities can be resolved in CrystFEL using ``ambigator``.  This
program takes a stream (from ``indexamajig``), works out the correct indexing
assignments, and writes a new stream with the incorrectly assigned reflections
re-labelled with their correct indices.  Here, "correct" means "consistent with
the other patterns in the dataset" - you should keep in mind that the indexing
ambiguity allows separately-processed datasets to have inconsistent labels.

The mechanics of running ``ambigator`` will be described in a separate
document.  However, you will need to know the "real" and "apparent" point
groups.  The apparent point group is the one we already determined.  The real
point group is so far unknown (unless you have prior information!), but there
are a small number of possibilities.  Here they are:

============================  ======================================================
Apparent point group          Real point group
============================  ======================================================
``422``                       ``4``
``32_R`` (rhombohedral axes)  ``3_R`` (rhombohedral axes)
``432``                       ``23``
``622``                       ``3_H`` (hexagonal axes) - double ambiguity, see below
``622``                       ``6``
``622``                       ``312_H`` (hexagonal axes)
``622``                       ``321_H`` (hexagonal axes)
============================  ======================================================

Notice that structures with hexagonal lattices (apparent point group *622*) are
particularly problematic, with quite a large number of real point groups giving
the apparent *622* symmetry.  One of those cases, point group ``3_H``, exhibits
a *double ambiguity* where there are four indexing possibilities for each
pattern, not just the usual two.
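
One way to see where the "two" and "four" come from is to compare the number of
symmetry operations in the apparent and real point groups: the ratio gives the
number of indexing possibilities per pattern.  The sketch below encodes the
table above together with the orders of the rotation groups.  The orders are
standard textbook values rather than something stated in this document, and
the names are invented for illustration:

.. code-block:: python

  # Apparent point group -> possible real point groups (from the table above)
  REAL_POINT_GROUPS = {
      "422": ["4"],
      "32_R": ["3_R"],
      "432": ["23"],
      "622": ["3_H", "6", "312_H", "321_H"],
  }

  # Number of rotation operations in each point group
  ORDER = {
      "4": 4, "422": 8,
      "3_R": 3, "32_R": 6,
      "23": 12, "432": 24,
      "3_H": 3, "6": 6, "312_H": 6, "321_H": 6, "622": 12,
  }

  def indexing_choices(apparent, real):
      """Number of indexing possibilities for each pattern."""
      if real not in REAL_POINT_GROUPS.get(apparent, []):
          raise ValueError(f"{real} is not listed under {apparent}")
      return ORDER[apparent] // ORDER[real]

  for real in REAL_POINT_GROUPS["622"]:
      print(real, indexing_choices("622", real))

For ``622`` this gives four possibilities for ``3_H`` and two for each of the
other real point groups, matching the description above.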


Step 6: Extra information about "H cells"
=========================================

A rhombohedral unit cell (all axes the same length, all angles the same but not
equal to 90°) can be represented in two ways.  The first way is using the axes
exactly as just described.  In this case, we talk about "rhombohedral axes" and
use space group symbols *R3* and *R32*.  The second way is to embed the
rhombohedral cell inside a hexagonal unit cell (a=b≠c, alpha=beta=90°,
gamma=120°) while having multiple lattice points (i.e. extra copies of the
crystal structure) within the unit cell.  In this case, we talk about
"hexagonal axes" and use space group symbols *H3* and *H32*.

You will find both representations in space group tables - for example
`here, in the International Tables Volume A <https://it.iucr.org/Ac/ch2o3v0001/sgtable2o3o155/>`_.
Rhombohedral axes are easier to think about, but hexagonal axes are commonly
used for protein structures.  If you've downloaded a rhombohedral structure
from the PDB, it's probably (but not always!) using hexagonal axes.

Different software packages use different conventions for labelling these
cells.  For example, you might also encounter *R3:h* and *R3:r* for hexagonal
and rhombohedral axes respectively.  Unfortunately, sometimes you might even
encounter programs which use *R3* to refer to *hexagonal* axes, and *H3* for
*rhombohedral* axes!  However, you can always tell the difference by looking
at the unit cell parameters.  For some more discussion, including a useful
diagram, see `this classic article
<http://www.phenix-online.org/phenixwebsite_static/mainsite/files/newsletter/CCN_2011_01.pdf#page=12>`_.

The most important thing to keep in mind is that representing the unit cell in
a different way will never change any of the physical properties.  If the
symmetry is *R3* or *H3*, there's an indexing ambiguity, and if it's *R32* or
*H32* then there's no ambiguity. The *R3* and *H3* cases are the same thing, as
are the *R32* and *H32* cases. In both cases, the number of symmetry
equivalents for each reflection is the same.  If there's a strange accidental
indexing ambiguity for one version (see step 7), the same accidental indexing
ambiguity applies to the other version as well.

However, you need to tell CrystFEL which representation you're using.  For all
trigonal point groups - that is, anything with a rhombohedral lattice, or a
hexagonal lattice but no sixfold symmetry - you will need to append either
``_H`` or ``_R`` to the point group symbol.  For example, for point group
*3* on rhombohedral axes, use ``3_R``.  For hexagonal axes, use ``3_H``.

You *cannot* use the unique axis and axis definition suffixes together: for
example, ``321_H_uab`` is not allowed.  Always use unique axis *c* for trigonal
cells on hexagonal axes.

There's a further complication.  There are actually two ways that the
rhombohedral cell can be "embedded" into the hexagonal cell.  The two ways are
called *obverse* and *reverse*.  The *International Tables* uses the *obverse*
representation [#f3]_, and so does all the software that I know about.
This complication affects the point group symbol that you must use for space
group *R32*/*H32* (it makes no difference for *R3*/*H3*).  Here are all the
cases for *R32*/*H32*:

============   =========  ================================  ==================
Axes           Setting    Point group as given to CrystFEL  Comment
============   =========  ================================  ==================
Rhombohedral   n/a        ``32_R``
Hexagonal      Obverse    ``321_H``
Hexagonal      Reverse    ``312_H``                         Don't use this one
============   =========  ================================  ==================

Just "for fun", here's the same table for *R3*/*H3*:

============   =========  ================================  ==================
Axes           Setting    Point group as given to CrystFEL  Comment
============   =========  ================================  ==================
Rhombohedral   n/a        ``3_R``
Hexagonal      Obverse    ``3_H``
Hexagonal      Reverse    ``3_H``                           Same as for obverse
============   =========  ================================  ==================

As you can see, your life will be much easier if you just use rhombohedral axes
all the time.  However, due to the prevalence of hexagonal axes in deposited
structures, this is likely to mean that you have to convert from one
representation to the other.  Converting atomic locations (i.e. a structural
model) is outside the scope of CrystFEL, but CrystFEL *can* convert just the
unit cell parameters.  For example, given an "H-centered" unit cell file::

  CrystFEL unit cell file version 1.0

  lattice_type = hexagonal
  centering = H
  unique_axis = c

  a = 66.2 A
  b = 66.2 A
  c = 150.2 A

  al = 90.0 deg
  be = 90.0 deg
  ga = 120.0 deg

CrystFEL's ``cell_tool`` can calculate the rhombohedral representation::

  $ cell_tool -p example.cell --uncenter
  Input unit cell: cell-example.cell
  ------------------> The input unit cell:
  hexagonal H, unique axis c, right handed.
  a      b      c            alpha   beta  gamma
   66.20  66.20 150.20 A     90.00  90.00 120.00 deg
  ------------------> The primitive unit cell:
  rhombohedral R, right handed.                                <<-----------
  a      b      c            alpha   beta  gamma               <<-----------  Look here!
   62.99  62.99  62.99 A     63.40  63.40  63.40 deg           <<-----------
  ------------------> The centering transformation:
  [    1    0    1 ]
  [   -1    1    1 ]
  [    0   -1    1 ]
  ------------------> The un-centering transformation:
  [  2/3 -1/3 -1/3 ]
  [  1/3  1/3 -2/3 ]
  [  1/3  1/3  1/3 ]
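
If you want to check the numbers by hand, the rhombohedral parameters follow
from the hexagonal ones by simple geometry.  The sketch below is an independent
calculation, not how ``cell_tool`` is implemented.  It assumes the standard
relations a\ :sub:`R` = sqrt(3a² + c²)/3 and sin(alpha\ :sub:`R`/2) =
3/(2 sqrt(3 + (c/a)²)), where *a* and *c* are the hexagonal axis lengths:

.. code-block:: python

  import math

  def hexagonal_to_rhombohedral(a_hex, c_hex):
      """Convert an H-centered hexagonal cell to rhombohedral axes.

      Returns the rhombohedral axis length (same unit as the input)
      and the rhombohedral angle in degrees.
      """
      a_rh = math.sqrt(3.0 * a_hex**2 + c_hex**2) / 3.0
      half_angle = math.asin(1.5 / math.sqrt(3.0 + (c_hex / a_hex)**2))
      return a_rh, math.degrees(2.0 * half_angle)

  a_rh, alpha_rh = hexagonal_to_rhombohedral(66.2, 150.2)
  print(f"{a_rh:.2f} A, {alpha_rh:.2f} deg")   # 62.99 A, 63.40 deg

The result agrees with the primitive cell reported by ``cell_tool`` above.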



Step 7: "It still isn't working!"
=================================

The ambiguities described in step 5 are the most common cases, but there are
more possibilities.  Sometimes, the lattice parameters "accidentally" give rise
to indexing ambiguities.  As noted above, it's the *apparent* symmetries of the
lattice that matter here.  For example, unless the indexing is *very* accurate
(within 1/20 of a degree), the following unit cell will need to be merged with
point group *222* (or *mmm* to merge Friedel pairs), even though it is
technically monoclinic:

a=63 Å, b=82 Å, c=95 Å, alpha=gamma=90°, beta=90.04°

In this case, there will be an indexing ambiguity, because the true symmetry
is *2* (unique axis *b*), but the apparent symmetry is *222*.

Things can get even more complicated than this, and some very "interesting"
ambiguities have turned up over the years.  CrystFEL's ``cell_tool`` utility
can analyse your unit cell and spot possible ambiguities.  See `the manual
<https://desy.de/~twhite/crystfel/manual-cell_tool.html>`_ for details.

Crystal structures seem to have a way of finding new ways to cause trouble.
So, if things are still not working, or if you're just confused, we're happy to
help.  Just send an email!  See the `contact <https://desy.de/~twhite/crystfel/contact.html>`_
page on the CrystFEL website for details.

**Good luck, and may all your indexing be unambiguous!**


.. rubric:: Footnotes

.. [#f1] There are a couple of small exceptions here, when the data is exported
   to XScale or MTZ format.  These formats *require* a space group to be
   nominated, because of the aforementioned reliance on early space group
   nomination.  Here, CrystFEL chooses the lowest-symmetry space group that
   reflects the point symmetry according to which the merging was performed.
   The "downstream" structure solution software should be clever enough to
   assign the correct space group, regardless of what's in the data file.

.. [#f2] It's also possible to change the indexing assignments in the stream
   without re-running indexing, but this could be considered "advanced" usage.
   As mentioned above, it's also possible to continue using the non-standard
   setting, at least as far as CrystFEL is concerned.  However, in that case
   you can expect to have difficulty with other software or when depositing the
   structure.

.. [#f3] If you're interested, this is made explicit in section 2.1.3.6.6 of
   International Tables Volume A (2016 edition), which you can read
   `here <https://it.iucr.org/Ac/ch2o1v0001/sec2o1o3/>`_ (subscription
   required).