author Thomas White <taw@physics.org> 2022-11-29 13:53:14 +0100
committer Thomas White <taw@physics.org> 2022-11-29 14:23:24 +0100
commit e62fd471478f3501aea5174cbdb13514ed037a5d
tree e7aa81dddd123ed42d798723319e87d394ba5dfd /doc
parent c474eb204732cfb26cb659f9f03de9a712215720
Add doc/articles/pointgroup.rst
Diffstat (limited to 'doc')
-rw-r--r-- doc/articles/pointgroup.rst 463
1 files changed, 463 insertions, 0 deletions
diff --git a/doc/articles/pointgroup.rst b/doc/articles/pointgroup.rst
new file mode 100644
index 00000000..0128a105
--- /dev/null
+++ b/doc/articles/pointgroup.rst
@@ -0,0 +1,463 @@
+===============================================
+How to choose the right point group for merging
+===============================================
+
+A common question from our users is how to choose the correct symmetry for
+merging, i.e. the correct ``-y`` option. It's actually not that difficult, but
+it does touch on several areas of crystallography theory. This document aims
+to be a gentle introduction to the process, introducing the concepts step by
+step. For a somewhat terse explanation, see section 6 of the following paper:
+
+T. A. White, A. Barty, F. Stellato et al
+"Crystallographic data processing for free-electron laser sources"
+Acta Cryst. D69 (2013), p1231–1240.
+`doi:10.1107/S0907444913013620 <http://dx.doi.org/10.1107/S0907444913013620>`_
+
+Another useful article is the following:
+
+M. Nespolo, M. I. Aroyo and B. Souvignier
+"Crystallographic shelves: space-group hierarchy explained"
+J. Appl. Cryst. 51 (2018), p1481–1491.
+`doi:10.1107/S1600576718012724 <https://doi.org/10.1107/S1600576718012724>`_
+
+Step 1: Temporarily forget about space groups
+=============================================
+
+To merge your reflection data, CrystFEL needs to know which reflections are
+symmetrically equivalent. This information is given by the *point group*.
+The 230 space groups can be classified into 32 categories, each corresponding
+to a single point group. If you know the *space* group for your crystals in
+advance, that's a big advantage. However, for now you only need to know the
+*point* group.
+
+If you're working on an unknown structure, don't get ahead of yourself!
+Many crystallographic data processing programs start suggesting possible space
+groups very early in the process, such as when the patterns are indexed.
+The space group is **only a hypothesis until the structure is solved**, so you
+always need to take these early suggestions with a pinch of salt. CrystFEL's
+design philosophy is not to deal with space group determination at all.
+CrystFEL will never ask you to tell it the space group of your structure ahead
+of time, nor will it suggest one automatically for your structure [#f1]_.
+
+The following table shows the point group corresponding to each of the space
+groups. To keep things simple, the table only contains the `Sohncke space
+groups <https://dictionary.iucr.org/Sohncke_groups>`_, which are the ones
+relevant to biological structures. The point groups are given in exactly the
+form you will type them into CrystFEL:
+
+=========== ============
+Point group Space groups
+=========== ============
+``1``       P1
+``2``       P2, P2\ :sub:`1`, C2 (pay special attention to step 3 below)
+``222``     P222, P222\ :sub:`1`, P2\ :sub:`1`\ 2\ :sub:`1`\ 2, P2\ :sub:`1`\ 2\ :sub:`1`\ 2\ :sub:`1`, C222\ :sub:`1`, C222, F222, I222, I2\ :sub:`1`\ 2\ :sub:`1`\ 2\ :sub:`1`
+``4``       P4, P4\ :sub:`1`, P4\ :sub:`2`, P4\ :sub:`3`, I4, I4\ :sub:`1`
+``422``     P422, P42\ :sub:`1`\ 2, P4\ :sub:`1`\ 22, P4\ :sub:`1`\ 2\ :sub:`1`\ 2, P4\ :sub:`2`\ 22, P4\ :sub:`2`\ 2\ :sub:`1`\ 2, P4\ :sub:`3`\ 22, P4\ :sub:`3`\ 2\ :sub:`1`\ 2, I422, I4\ :sub:`1`\ 22
+``32_R``    R32 (rhombohedral axes, pay special attention to step 6)
+``3_R``     R3 (rhombohedral axes, pay special attention to step 6)
+``3_H``     H3 (hexagonal axes, pay special attention to step 6), P3, P3\ :sub:`1`, P3\ :sub:`2`
+``321_H``   H32 (hexagonal axes, pay special attention to step 6), P321, P3\ :sub:`1`\ 21, P3\ :sub:`2`\ 21
+``312_H``   P312, P3\ :sub:`1`\ 12, P3\ :sub:`2`\ 12
+``6``       P6, P6\ :sub:`1`, P6\ :sub:`2`, P6\ :sub:`3`, P6\ :sub:`4`, P6\ :sub:`5`
+``622``     P622, P6\ :sub:`1`\ 22, P6\ :sub:`2`\ 22, P6\ :sub:`3`\ 22, P6\ :sub:`4`\ 22, P6\ :sub:`5`\ 22
+``23``      P23, F23, I23, P2\ :sub:`1`\ 3, I2\ :sub:`1`\ 3
+``432``     P432, P4\ :sub:`2`\ 32, F432, F4\ :sub:`1`\ 32, I432, P4\ :sub:`3`\ 32, P4\ :sub:`1`\ 32, I4\ :sub:`1`\ 32
+=========== ============
+
+Notice that, in most cases, the correct point group can easily be recognised
+from the space group, without memorizing the entire table.
+
+If you are in the fortunate situation of knowing the space group for your
+sample before processing the data, look up the point group in the table above
+and keep it in mind as you read the next sections. If you can't find your
+space group in the table (for example, *A112*), your source of information is
+using a non-standard setting. Everything should become clear in step 3.
+
+If you don't know the space group, no problem: we will work everything out in
+the steps below.
+
+
+Step 2: Determine the apparent symmetry
+=======================================
+
+The orientation of each crystal in your dataset was determined by the indexing
+procedure inside ``indexamajig``. There's a choice of indexing algorithms
+which work in many different ways, but they all share one thing in common: they
+only look at the positions of the Bragg peaks, not the intensities.
+
+As you should know from basic diffraction theory, the positions of the Bragg
+peaks are determined by the translational symmetry of the structure (the
+*lattice*), whereas the intensities are determined by the contents of the
+unit cell.
+
+This leads to a problem for some symmetry classes. If the overall crystal
+structure, taking into account the unit cell contents, has lower symmetry than
+the lattice, there will be an *indexing ambiguity*. In these cases, the Bragg
+peak positions don't provide enough information to correctly determine the
+orientation of the crystal. The results will be an equal mixture of correctly
+indexed patterns, and ones where the Miller indices for the reflections are
+wrong. But, we're getting ahead of ourselves...
+
+Just by looking at the parameters of the lattice (the unit cell parameters), we
+can determine the symmetry that the merged dataset will exhibit. This is the
+symmetry that the indexing algorithm is able to discern (by looking at the
+Bragg positions only), and therefore which reflections should be considered
+symmetrically equivalent. This is the point group that we will give to
+``process_hkl`` or ``partialator``.
+
+The following table shows the possible cases and the point group to use in
+each case. Use the lowest row in the table that is compatible with your data.
+For example, if the axis lengths are all equal (*a=b=c*) and the angles are
+all 90°, you should use ``432``, even though ``32_R``, ``422``, ``222``, ``2``
+and ``1`` would all fit.
+
+For this step, what matters are the *approximate* symmetries of the lattice.
+You should consider an angle to be equal to 90° if it's within about 1° of that
+value, and axis lengths to be equal if they're within about 1% of the same
+length. If ``indexamajig`` gets confused between the axes (shown by double
+peaks in the ``cell_explorer`` histograms), then they should be considered
+equal. Conversely, if ``indexamajig`` was able to tell the axes apart (clear,
+single peaks for each axis length, with significantly different lengths and
+angles), then you can consider them distinct.
+
+The centering of the cell (P, A, B, C, I, F, R or H) is irrelevant at this
+step, unless you have "H-centering", which is a special case that we will come
+to later.
+
+=================================== =======================
+Unit cell parameters                Point group for merging
+=================================== =======================
+No restrictions                     ``1``
+alpha=beta=90°                      ``2``
+H-centering and a=b and gamma=120°  ``321_H``
+a=b and gamma=120°                  ``622``
+a=b=c                               ``32_R``
+All angles 90°                      ``222``
+All angles 90° and a=b              ``422``
+All angles 90° and a=b=c            ``432``
+=================================== =======================
+
+Perhaps your cell parameters resemble one of the cases, but with the axes
+"re-named". For example, you might have beta=gamma=90°, alpha≠90°, and all
+axes different. This is the same as point group *2* above, but with the axes
+*a,b,c* re-labelled as *b,c,a*. We can deal with that, as described in the
+next step.
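+
+The table lookup can also be written down as a short function. The following
+is only a sketch in Python, not part of CrystFEL: the function name, the
+argument names and the tolerance defaults (taken from the "about 1°" and
+"about 1%" guidance above) are assumptions made for illustration. It assumes
+the axes are labelled as in the table, treats H-centering as the special case
+it is, and additionally requires alpha=beta=90° for the hexagonal rows and
+equal angles for the ``32_R`` row.
+

```python
def apparent_point_group(a, b, c, al, be, ga, centering="P",
                         len_tol=0.01, ang_tol=1.0):
    """Return the merging point group for a cell, following the table above.

    Lengths may be in any consistent unit, angles are in degrees.
    Hypothetical helper for illustration only; not a CrystFEL function.
    """
    def same(x, y):
        # Axis lengths count as equal if they are within about 1%
        return abs(x - y) <= len_tol * max(x, y)

    def near(angle, target):
        # Angles count as equal if they are within about 1 degree
        return abs(angle - target) <= ang_tol

    all_90 = near(al, 90) and near(be, 90) and near(ga, 90)
    hexagonal = (same(a, b) and near(al, 90) and near(be, 90)
                 and near(ga, 120))
    equal_axes = same(a, b) and same(b, c) and same(a, c)
    equal_angles = near(al, be) and near(be, ga) and near(al, ga)

    # Rows in the same order as the table; the last matching row wins
    rows = [
        (True, "1"),
        (near(al, 90) and near(be, 90), "2"),
        (hexagonal and centering == "H", "321_H"),
        (hexagonal and centering != "H", "622"),
        (equal_axes and equal_angles, "32_R"),
        (all_90, "222"),
        (all_90 and same(a, b), "422"),
        (all_90 and equal_axes, "432"),
    ]
    return [group for matches, group in rows if matches][-1]
```

+
+For example, ``apparent_point_group(123, 123, 44, 90, 90, 90)`` returns
+``422``. A cell whose axes are labelled differently from the table will fall
+through to a lower-symmetry row, so the labelling needs to be sorted out
+first.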
+
+
+Step 3: Make sure the "unique axis" is correct
+==============================================
+
+Let's say your point group is *2*. In this case, there is a single twofold
+axis of rotational symmetry. The symmetry axis can be along the *a*, *b* or
+*c* direction of the lattice - these letters are just the names we use to refer
+to the axes. In theory, you can define the unit cell any way you like, and
+CrystFEL will be able to cope (with one exception, mentioned below). However,
+some possibilities are more "conventional" than others, and it can help to
+avoid problems if you follow the established conventions. For example, not all
+software can handle all of the possibilities smoothly. It's also easier to
+compare structures when they're described in the same way.
+
+You can tell the direction of the twofold rotation axis, because it has to be
+along the axis perpendicular to the angle that isn't 90°. For example, the
+following cell parameters show that the twofold rotation axis is along *b*.
+We refer to *b* as the *unique axis*:
+
+a=34 Å, b=123 Å, c=44 Å, alpha=gamma=90°, beta=131°
+
+The following cell has *unique axis a*:
+
+a=92 Å, b=74 Å, c=34 Å, alpha=128°, beta=gamma=90°
+
+However, *a* as the unique axis is a very unconventional situation. It would
+make things easier for you to change your target unit cell to make *b* or
+*c* the unique axis, and re-run ``indexamajig`` [#f2]_.
+
+**If you've been told that the space group is simply "P2", check carefully to
+make sure which convention is being used, because unique axis b or c are
+considered equally acceptable.**
+
+If your non-90° angle is very close to 90°, then you should instead be using
+point group *222*. As mentioned above, what matters are the *approximate*
+symmetries that can be discerned by the indexing algorithm.
+
+Other types of unit cell have a 'unique' axis, as well. For example, a
+tetragonal cell has all angles 90°, two axes with the same length and one
+different. The different length axis could be labelled as *a*, *b* or *c*.
+However, in this case, anything other than unique axis *c* is highly
+unconventional. Nevertheless, check carefully here as well.
+
+When you tell ``process_hkl`` or ``partialator`` the symmetry, you'll need to
+tell it the unique axis. By default, CrystFEL programs assume that the unique
+axis is *c*. If you have anything else, append ``_uaa``, ``_uab`` or ``_uac``
+to the point group symbol (from the tables above) to indicate which is the
+'unique' axis. For the first example from above, we would use ``2_uab``:
+
+a=34 Å, b=123 Å, c=44 Å, alpha=gamma=90°, beta=131°
+
+For the tetragonal unit cell parameters shown below, we would use ``422``,
+which is a synonym for ``422_uac`` since the unique axis is assumed to be *c*:
+
+a=123 Å, b=123 Å, c=44 Å, alpha=beta=gamma=90°
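+
+The rule for finding the unique axis of a monoclinic cell can be sketched in
+a few lines of Python. This is an illustration only: the function name and
+the 1° tolerance are assumptions, and the function always writes the suffix
+explicitly, even for unique axis *c* where it could be left off.
+

```python
def monoclinic_symbol(al, be, ga, ang_tol=1.0):
    """Return point group 2 with the matching unique-axis suffix.

    Hypothetical helper for illustration only; angles are in degrees.
    """
    # alpha lies between b and c, beta between a and c, gamma between a
    # and b, so the non-90 degree angle names the unique axis directly.
    odd = [axis for axis, angle in zip("abc", (al, be, ga))
           if abs(angle - 90.0) > ang_tol]
    if len(odd) != 1:
        raise ValueError("expected exactly one angle different from 90 degrees")
    return "2_ua" + odd[0]
```

+
+With the two monoclinic examples from this step, ``monoclinic_symbol(90, 131,
+90)`` gives ``2_uab`` and ``monoclinic_symbol(128, 90, 90)`` gives ``2_uaa``.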
+
+
+Step 4: Add an inversion center to merge Friedel pairs
+======================================================
+
+Remember that the point group tells CrystFEL which reflections to consider
+as symmetrically equivalent. The point group you have, at this point, will
+*not* include an inversion center, i.e. it will *not* consider reflections
+h,k,l and -h,-k,-l as equivalent. This means that the merging process will
+preserve any anomalous signal present in your data.
+
+If you don't expect (or want) an anomalous signal, you can get better results
+by merging Friedel pairs of reflections. This doubles the number of
+measurements per symmetrically unique reflection, which can make a large
+improvement! To do this, simply add the missing inversion center to the point
+group. This will change the point group symbol in a way that's not immediately
+logical. The following table shows the results of adding an inversion symmetry
+to each of the point groups, so you just have to look up your case.
+
+=========== =================================
+Point group Point group with inversion center
+=========== =================================
+``1``       ``-1``
+``2``       ``2/m``
+``222``     ``mmm``
+``422``     ``4/mmm``
+``32_R``    ``-3m_R``
+``321_H``   ``-3m1_H``
+``622``     ``6/mmm``
+``432``     ``m-3m``
+=========== =================================
+
+The point group symbols in the table above look quite strange. If you need to
+look up one of these symbols in a crystallographic textbook, you just need to
+know that the minus signs are supposed to indicate a "bar" over the following
+digit. However, there's usually no need to worry about that.
+
+If you've added a unique axis suffix, add the same suffix to your new point
+group. For example, ``622_uab`` goes to ``6/mmm_uab`` (although, either of
+these cases would be considered very unconventional).
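+
+Since this step is a pure lookup, it is easy to script. The sketch below
+(Python, with hypothetical names, not a CrystFEL interface) contains only the
+rows of the table above and carries any unique axis suffix over to the new
+symbol.
+

```python
# Rows of the table above: point group -> point group with inversion center
WITH_INVERSION = {
    "1": "-1",
    "2": "2/m",
    "222": "mmm",
    "422": "4/mmm",
    "32_R": "-3m_R",
    "321_H": "-3m1_H",
    "622": "6/mmm",
    "432": "m-3m",
}


def add_inversion(symbol):
    """Add an inversion center, keeping a unique-axis suffix if present.

    Hypothetical helper for illustration only.  Raises KeyError for
    symbols that are not in the table.
    """
    # Split off a unique-axis suffix such as "_uab"; "_R" and "_H" are
    # part of the table keys and stay attached to the base symbol.
    base, sep, axis = symbol.partition("_ua")
    return WITH_INVERSION[base] + sep + axis
```

+
+For example, ``add_inversion("622_uab")`` gives ``6/mmm_uab`` and
+``add_inversion("32_R")`` gives ``-3m_R``.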
+
+
+Step 5: Worry about indexing ambiguities
+========================================
+
+At this point, you're in a position to merge your data. If your prior
+information about the point group from step 1 agreed with what you determined
+in step 2, then everything is OK and you're finished already! Simply give the
+point group symbol to ``partialator`` or ``process_hkl`` with the ``-y``
+argument (or via the CrystFEL GUI). For example: ``-y 4/mmm``.
+
+However, maybe something is still not right. Perhaps the structure solution
+software is complaining about "twinned data", strange statistics or "poor"
+L-test results. Or, perhaps your prior information about the structure doesn't
+match the point group you determined in the previous steps. In this case, you
+may be facing an indexing ambiguity, where the true symmetry is lower than what
+can be distinguished by the indexing algorithm.
+
+An *indexing ambiguity* is when the positions of the Bragg peaks do *not* give
+sufficient information to uniquely identify the orientation of the crystal.
+Instead, there are a small number (usually 2) of possible orientations which
+give the *same Bragg peak positions*. The correct orientation can be
+determined by looking at the peak intensities, so it requires a separate
+processing step after indexing and integration.
+
+Indexing ambiguities can be resolved in CrystFEL using ``ambigator``. This
+program takes a stream (from ``indexamajig``), works out the correct indexing
+assignments, and writes a new stream with the incorrectly assigned reflections
+re-labelled with their correct indices. Here, "correct" means "consistent with
+the other patterns in the dataset" - you should keep in mind that the indexing
+ambiguity allows separately-processed datasets to have inconsistent labels.
+
+The mechanics of running ``ambigator`` will be described in a separate
+document. However, you will need to know the "real" and "apparent" point
+groups. The apparent point group is the one we already determined. The real
+point group is so far unknown (unless you have prior information!), but there
+are a small number of possibilities. Here they are:
+
+============================ ======================================================
+Apparent point group         Real point group
+============================ ======================================================
+``422``                      ``4``
+``32_R`` (rhombohedral axes) ``3_R`` (rhombohedral axes)
+``432``                      ``23``
+``622``                      ``3_H`` (hexagonal axes) - double ambiguity, see below
+``622``                      ``6``
+``622``                      ``312_H`` (hexagonal axes)
+``622``                      ``321_H`` (hexagonal axes)
+============================ ======================================================
+
+Notice that structures with hexagonal lattices (apparent point group *622*) are
+particularly problematic, with quite a large number of real point groups giving
+the apparent *622* symmetry. One of those cases, point group ``3_H``, exhibits
+a *double ambiguity* where there are four indexing possibilities for each
+pattern, not just the usual two.
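+
+To keep track of the possibilities while testing them, the table can be held
+in a small data structure. The following Python sketch is illustrative only;
+the names are made up, and it covers only the cases listed in the table above
+(without inversion centers).
+

```python
# Apparent point group -> real point groups that could be hiding behind it
REAL_CANDIDATES = {
    "422": ["4"],
    "32_R": ["3_R"],
    "432": ["23"],
    "622": ["3_H", "6", "312_H", "321_H"],
}


def candidate_real_point_groups(apparent):
    """List the real point groups to consider for an apparent point group."""
    return REAL_CANDIDATES.get(apparent, [])


def indexing_possibilities(apparent, real):
    """Number of indexing choices per pattern for a listed combination."""
    if real not in candidate_real_point_groups(apparent):
        raise ValueError("combination not listed in the table")
    # 3_H under apparent 622 is the double ambiguity: four choices, not two
    return 4 if real == "3_H" else 2
```

+
+A hexagonal lattice therefore has four candidates to consider, and only the
+``3_H`` case needs four indexing possibilities per pattern.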
+
+
+Step 6: Extra information about "H cells"
+=========================================
+
+A rhombohedral unit cell (all axes the same length, all angles the same but not
+equal to 90°) can be represented in two ways. The first way is using the axes
+exactly as just described. In this case, we talk about "rhombohedral axes" and
+use space group symbols *R3* and *R32*. The second way is to embed the
+rhombohedral cell inside a hexagonal unit cell (a=b≠c, alpha=beta=90°,
+gamma=120°) while having multiple lattice points (i.e. extra copies of the
+crystal structure) within the unit cell. In this case, we talk about
+"hexagonal axes" and use space group symbols *H3* and *H32*.
+
+You will find both representations in space group tables - for example
+`here, in the International Tables Volume A <https://it.iucr.org/Ac/ch2o3v0001/sgtable2o3o155/>`_.
+Rhombohedral axes are easier to think about, but hexagonal axes are commonly
+used for protein structures. If you've downloaded a rhombohedral structure
+from the PDB, it's probably (but not always!) using hexagonal axes.
+
+Different software packages use different conventions for labelling these
+cells. For example, you might also encounter *R3:h* and *R3:r* for hexagonal
+and rhombohedral axes respectively. Unfortunately, sometimes you might even
+encounter programs which use *R3* to refer to *hexagonal* axes, and *H3* for
+*rhombohedral* axes! However, you can always tell the difference by looking
+at the unit cell parameters. For some more discussion, including a useful
+diagram, see `this classic article
+<http://www.phenix-online.org/phenixwebsite_static/mainsite/files/newsletter/CCN_2011_01.pdf#page=12>`_.
+
+The most important thing to keep in mind is that representing the unit cell in
+a different way will never change any of the physical properties. If the
+symmetry is *R3* or *H3*, there's an indexing ambiguity, and if it's *R32* or
+*H32* then there's no ambiguity. The *R3* and *H3* cases are the same thing, as
+are the *R32* and *H32* cases. In both cases, the number of symmetry
+equivalents for each reflection is the same. If there's a strange accidental
+indexing ambiguity for one version (see step 7), the same accidental indexing
+ambiguity applies to the other version as well.
+
+However, you need to tell CrystFEL which representation you're using. For all
+trigonal point groups - that is, anything with a rhombohedral lattice, or a
+hexagonal lattice but no sixfold symmetry - you will need to append either
+``_H`` or ``_R`` to the point group symbol. For example, for point group
+*3* on rhombohedral axes, use ``3_R``. For hexagonal axes, use ``3_H``.
+
+You *cannot* combine a unique axis suffix with an axis definition suffix; for
+example, ``321_H_uab`` is not valid. Always use unique axis *c* for trigonal
+cells on hexagonal axes.
+
+There's a further complication. There are actually two ways that the
+rhombohedral cell can be "embedded" into the hexagonal cell. The two ways are
+called *obverse* and *reverse*. The *International Tables* uses the *obverse*
+representation [#f3]_, and so does all the software that I know about.
+This complication affects the point group symbol that you must use for space
+group *R32*/*H32* (it makes no difference for *R3*/*H3*). Here are all the
+cases for *R32*/*H32*:
+
+============ ========= ================================ ==================
+Axes         Setting   Point group as given to CrystFEL Comment
+============ ========= ================================ ==================
+Rhombohedral n/a       ``32_R``
+Hexagonal    Obverse   ``321_H``
+Hexagonal    Reverse   ``312_H``                        Don't use this one
+============ ========= ================================ ==================
+
+Just "for fun", here's the same table for *R3*/*H3*:
+
+============ ========= ================================ ==================
+Axes         Setting   Point group as given to CrystFEL Comment
+============ ========= ================================ ==================
+Rhombohedral n/a       ``3_R``
+Hexagonal    Obverse   ``3_H``
+Hexagonal    Reverse   ``3_H``                          Same as for obverse
+============ ========= ================================ ==================
+
+As you can see, your life will be much easier if you just use rhombohedral axes
+all the time. However, due to the prevalence of hexagonal axes in deposited
+structures, this is likely to mean that you have to convert from one
+representation to the other. Converting atomic locations (i.e. a structural
+model) is outside the scope of CrystFEL, but CrystFEL *can* convert just the
+unit cell parameters. For example, given an "H-centered" unit cell file::
+
+ CrystFEL unit cell file version 1.0
+
+ lattice_type = hexagonal
+ centering = H
+ unique_axis = c
+
+ a = 66.2 A
+ b = 66.2 A
+ c = 150.2 A
+
+ al = 90.0 deg
+ be = 90.0 deg
+ ga = 120.0 deg
+
+CrystFEL's ``cell_tool`` can calculate the rhombohedral representation::
+
+ $ cell_tool -p example.cell --uncenter
+ Input unit cell: cell-example.cell
+ ------------------> The input unit cell:
+ hexagonal H, unique axis c, right handed.
+ a b c alpha beta gamma
+ 66.20 66.20 150.20 A 90.00 90.00 120.00 deg
+ ------------------> The primitive unit cell:
+ rhombohedral R, right handed. <<-----------
+ a b c alpha beta gamma <<----------- Look here!
+ 62.99 62.99 62.99 A 63.40 63.40 63.40 deg <<-----------
+ ------------------> The centering transformation:
+ [ 1 0 1 ]
+ [ -1 1 1 ]
+ [ 0 -1 1 ]
+ ------------------> The un-centering transformation:
+ [ 2/3 -1/3 -1/3 ]
+ [ 1/3 1/3 -2/3 ]
+ [ 1/3 1/3 1/3 ]
+
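+The rhombohedral cell parameters reported by ``cell_tool`` above can be
+cross-checked by hand. The primitive rhombohedral axis spans one third of the
+hexagonal *c* axis plus a distance of a/√3 in the basal plane, which gives
+a_rh = √(3a² + c²)/3 and sin(alpha/2) = a/(2·a_rh). The Python function below
+is a sketch of that calculation only; its name is made up and it does not
+reproduce anything else that ``cell_tool`` does.
+

```python
import math


def hexagonal_to_rhombohedral(a_hex, c_hex):
    """Convert an H-centered hexagonal cell to rhombohedral parameters.

    Returns (a_rh, alpha) with alpha in degrees.  Hypothetical helper for
    illustration only.
    """
    # Length of the primitive rhombohedral axis
    a_rh = math.sqrt(3.0 * a_hex**2 + c_hex**2) / 3.0
    # Angle between any two rhombohedral axes
    alpha = 2.0 * math.degrees(math.asin(a_hex / (2.0 * a_rh)))
    return a_rh, alpha
```

+
+For the example cell, ``hexagonal_to_rhombohedral(66.2, 150.2)`` agrees with
+the values of 62.99 Å and 63.40° shown in the ``cell_tool`` output.
+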
+
+
+Step 7: "It still isn't working!"
+=================================
+
+The ambiguities described in step 5 are the most common cases, but there are
+more possibilities. Sometimes, the lattice parameters "accidentally" give rise
+to indexing ambiguities. As noted above, it's the *apparent* symmetries of the
+lattice that matter here. For example, unless the indexing is *very* accurate
+(within 1/20 of a degree), the following unit cell will need to be merged with
+point group *222* (or *mmm* to merge Friedel pairs), even though it is
+technically monoclinic:
+
+a=63 Å, b=82 Å, c=95 Å, alpha=gamma=90°, beta=90.04°
+
+In this case, there will be an indexing ambiguity, because the true symmetry
+is *2* (unique axis *b*), but the apparent symmetry is *222*.
+
+Things can get even more complicated than this, and some very "interesting"
+ambiguities have turned up over the years. CrystFEL's ``cell_tool`` utility
+can analyse your unit cell and spot possible ambiguities. See `the manual
+<https://desy.de/~twhite/crystfel/manual-cell_tool.html>`_ for details.
+
+Crystal structures seem to have a way of finding new ways to cause trouble.
+So, if things are still not working, or if you're just confused, we're happy to
+help. Just send an email! See the `contact <https://desy.de/~twhite/crystfel/contact.html>`_
+page on the CrystFEL website for details.
+
+**Good luck, and may all your indexing be unambiguous!**
+
+
+.. rubric:: Footnotes
+
+.. [#f1] There are a couple of small exceptions here, when the data is exported
+ to XScale or MTZ format. These formats *require* a space group to be
+ nominated, because of the aforementioned reliance on early space group
+ nomination. Here, CrystFEL chooses the lowest-symmetry space group that
+ reflects the point symmetry according to which the merging was performed.
+ The "downstream" structure solution software should be clever enough to
+ assign the correct space group, regardless of what's in the data file.
+
+.. [#f2] It's also possible to change the indexing assignments in the stream
+ without re-running indexing, but this could be considered "advanced" usage.
+ As mentioned above, it's also possible to continue using the non-standard
+ setting, at least as far as CrystFEL is concerned. However, in that case
+ you can expect to have difficulty with other software or when depositing the
+ structure.
+
+.. [#f3] If you're interested, this is made explicit in section 2.1.3.6.6 of
+ International Tables Volume A (2016 edition), which you can read
+ `here <https://it.iucr.org/Ac/ch2o1v0001/sec2o1o3/>`_ (subscription
+ required).