author Thomas White <taw@physics.org> 2022-02-02 14:19:22 +0100
committer Thomas White <taw@physics.org> 2022-03-01 15:26:42 +0100
commit 3619f7952b1ed46d9cc0b075d1996b4bf5388a0c (patch)
tree c75e775370f7a513441ddd6d8ea7cdb50f88aba2
parent 872935d9514f1b50210af5f124d203a7f404e1db (diff)
Add new tutorial
-rw-r--r-- doc/articles/tutorial-images/cell-explorer.png bin 0 -> 49904 bytes
-rw-r--r-- doc/articles/tutorial-images/detector-shift-2.png bin 0 -> 48161 bytes
-rw-r--r-- doc/articles/tutorial-images/detector-shift.png bin 0 -> 70526 bytes
-rw-r--r-- doc/articles/tutorial-images/empty-gui.png bin 0 -> 73860 bytes
-rw-r--r-- doc/articles/tutorial-images/reflection-alignment.png bin 0 -> 374426 bytes
-rw-r--r-- doc/articles/tutorial-images/saturation-2.png bin 0 -> 87784 bytes
-rw-r--r-- doc/articles/tutorial-images/saturation.png bin 0 -> 50828 bytes
-rw-r--r-- doc/articles/tutorial.rst 1067
-rw-r--r-- doc/examples/jf-swissfel-16M.geom 359
9 files changed, 1426 insertions, 0 deletions
diff --git a/doc/articles/tutorial-images/cell-explorer.png b/doc/articles/tutorial-images/cell-explorer.png
new file mode 100644
index 00000000..2480fcdd
--- /dev/null
+++ b/doc/articles/tutorial-images/cell-explorer.png
Binary files differ
diff --git a/doc/articles/tutorial-images/detector-shift-2.png b/doc/articles/tutorial-images/detector-shift-2.png
new file mode 100644
index 00000000..3c17148f
--- /dev/null
+++ b/doc/articles/tutorial-images/detector-shift-2.png
Binary files differ
diff --git a/doc/articles/tutorial-images/detector-shift.png b/doc/articles/tutorial-images/detector-shift.png
new file mode 100644
index 00000000..1bd50738
--- /dev/null
+++ b/doc/articles/tutorial-images/detector-shift.png
Binary files differ
diff --git a/doc/articles/tutorial-images/empty-gui.png b/doc/articles/tutorial-images/empty-gui.png
new file mode 100644
index 00000000..b579a017
--- /dev/null
+++ b/doc/articles/tutorial-images/empty-gui.png
Binary files differ
diff --git a/doc/articles/tutorial-images/reflection-alignment.png b/doc/articles/tutorial-images/reflection-alignment.png
new file mode 100644
index 00000000..5dcb1ab0
--- /dev/null
+++ b/doc/articles/tutorial-images/reflection-alignment.png
Binary files differ
diff --git a/doc/articles/tutorial-images/saturation-2.png b/doc/articles/tutorial-images/saturation-2.png
new file mode 100644
index 00000000..057c23a7
--- /dev/null
+++ b/doc/articles/tutorial-images/saturation-2.png
Binary files differ
diff --git a/doc/articles/tutorial-images/saturation.png b/doc/articles/tutorial-images/saturation.png
new file mode 100644
index 00000000..79955a71
--- /dev/null
+++ b/doc/articles/tutorial-images/saturation.png
Binary files differ
diff --git a/doc/articles/tutorial.rst b/doc/articles/tutorial.rst
new file mode 100644
index 00000000..60134856
--- /dev/null
+++ b/doc/articles/tutorial.rst
@@ -0,0 +1,1067 @@
+=================
+CrystFEL tutorial
+=================
+
+This tutorial introduces serial crystallography data processing using the
+CrystFEL graphical user interface (GUI). It covers the main steps of using
+CrystFEL based on some freely available data. The aim of this page is to equip
+you with the knowledge you need to get started with processing a serial
+crystallography data set on your own.
+
+The estimated time required to complete this tutorial is about three hours,
+provided that CrystFEL is already `installed <../../INSTALL.md>`_ and the data
+set has already been downloaded and unpacked (downloading and unpacking is
+described in step 1). The tutorial assumes that Mosflm is available for
+indexing - see the installation instructions for details. There will be several
+waiting periods while data gets processed, during which you will be able to do
+something else.
+
+It's also possible to run CrystFEL almost entirely via the command line, which
+gives you extra abilities compared to the GUI. The "advanced" sections in this
+tutorial will show you how this works. It's fine to skip these sections on your
+first pass through.
+
+In addition to this text document, there is also a `video screencast
+<https://vimeo.com/585412404>`_ which introduces the CrystFEL graphical user
+interface. The video is aimed at people who already have some experience with
+CrystFEL, but should be useful for everyone.
+
+Let's get started!
+
+
+
+1. Download some data
+=====================
+
+There are many serial crystallography datasets available from the `Coherent X-ray
+Imaging Data Bank <https://cxidb.org/>`_. For this tutorial, we'll use
+`dataset 103 <https://cxidb.org/id-103.html>`_. This dataset is from an
+experiment that was performed at the `Alvra experimental station
+<https://www.psi.ch/de/alvra>`_ at `SwissFEL <https://www.psi.ch/en/swissfel>`_.
+
+You can download as much data as you have disk space for, but one file is
+enough to demonstrate the principles. Here, we will use file
+`run7.JF06T32V01.h5
+<http://portal.nersc.gov/archive/home/projects/cxidb/www/103/Raw_Files/2018-08-10/a2a/run7.JF06T32V01.h5>`_.
+This file is 38 GB in size, so make sure you have enough disk space available.
+Note also that the download can take some time to complete, even if you have
+a fast internet connection.
+
+In addition to the data file, you'll need a *geometry file* which tells
+CrystFEL how to interpret the data file. The geometry file contains information
+about how the detector is laid out physically, as well as how the data is laid
+out in the file - whether there's one frame or many frames per file, where to
+find the photon energy/X-ray wavelength, and so on. The depositors of *dataset
+103* have kindly deposited a geometry file which matches their data, which you
+can find on the main page for the dataset. The file is called ``jf-16M.geom``.
+However, the specification for geometry files changes quite often as we add new
+features to CrystFEL and deprecate others. The geometry file on the website
+will still work fine, but we provide here an updated file which will work
+slightly more smoothly. If you've downloaded CrystFEL, you'll find it in
+`doc/examples/jf-swissfel-16M.geom <https://gitlab.desy.de/thomas.white/crystfel/-/raw/master/doc/examples/jf-swissfel-16M.geom?inline=false>`_.
+Otherwise, click the link to download a copy.
+
+Finally, you'll need a *bad pixel map* which contains the information about
+which pixels of the detector cannot be trusted. This file is also available
+from the CXIDB. Download it here -
+`run47.JF06T32V01.mask.h5 <https://cxidb.org/data/103/run47.JF06T32V01.mask.h5>`_
+- and save it alongside the geometry and data files.
+
+Advanced
+--------
+
+If you're interested in the differences between the deposited geometry file and
+the file provided here, they are as follows:
+
+* The region masking the direct beam has been translated from lab-space
+ coordinates into detector-relative coordinates, because this makes the data
+ `load faster <speed.rst>`_.
+
+* The filename for the bad pixel mask was changed to match the deposited file,
+ and changed from an absolute path to a relative one.
+
+* Units were added to the camera length and photon energy directives.
+
+* Commented-out directives and non-useful rigid groups were removed.
+
+
+
+2. Start the CrystFEL GUI and load the data
+===========================================
+
+Provided that CrystFEL has been installed properly, you can start the graphical
+user interface (GUI) by running the command ``crystfel`` at a command line.
+An empty workspace should open, which looks like the picture below. If not,
+see `INSTALL.md <../../INSTALL.md>`_ for details, or ask your facility or lab's
+computing team.
+
+.. image:: tutorial-images/empty-gui.png
+
+The bar at the left (note that it may have a vertical scrollbar) shows the main
+data processing steps in roughly chronological order from top to bottom. The
+white area at the bottom is the *log area*, where messages will be displayed.
+The large grey area is where images will be displayed.
+
+Start by loading the data. To do this, click **Load data** at the top. You
+have several options for selecting your files. In this case, we have a single
+file containing the entire data of interest (``run7.JF06T32V01.h5``), so choose
+the first option, *Select an individual file*. Click the file selector button
+immediately to the right and select the file in the file selection window.
+
+You have to select the geometry file at the same time as loading the data.
+Click the file selector button labelled *Geometry file* and choose the correct
+file (``jf-swissfel-16M.geom``).
+
+**Tip:** If you have a long pathname to the data, you don't have to navigate
+using the buttons. You can type or paste directly into the file selector
+window: simply paste as normal (e.g. Ctrl+V), or start typing with ``/``.
+This works for all file selector windows in CrystFEL. In fact, this works for
+all file selector windows in all programs using the GTK toolkit.
+
+If everything goes well, you will see the first frame of data in the GUI. If
+not, check for error messages in the log area or on the terminal from which you
+started the GUI.
+
+When you close the main GUI window, it will offer to save the session. You can
+save your session manually at any time from the **File** menu. Your session
+will be restored the next time you start the GUI in the same folder. All the
+details are stored in a file named ``crystfel.project`` in your working
+directory. You can make a copy of this file at any time, for example to keep
+a backup of your processing parameters. You can also delete it, which will
+cause the GUI to reset to an empty workspace.
+
+
+
+3. Examine the patterns
+=======================
+
+You can navigate through the images in a familiar way. Click and drag to move
+around the frame, and use the mouse scroll wheel to zoom. If you zoom in very
+far, you will see the individual pixel values. Using the arrow buttons near
+the top of the GUI window, you can move forwards, backwards or randomly through
+the images making up the dataset.
+
+**Tip:** Shift-clicking on the "shuffle" button will move backwards through the
+random order, which is often useful when you "overshoot" while looking for an
+interesting pattern!
+
+Flick through a few images and check that everything makes sense:
+
+* Does the background scattering have circular symmetry? If there are
+ discontinuities, the geometry file might not be an accurate description of
+ the experiment. This will cause difficulties in the later steps.
+
+* Are there any obviously bad regions of the detector, for example regions with
+ all zero pixel values, or extremely high values? For the SwissFEL example
+ data, this is the case for an entire detector panel at the bottom-middle of
+ the frame, as well as for many horizontal and vertical lines criss-crossing
+ each of the detector panels. These pixels have been masked out in the
+ geometry file (see
+ `man crystfel_geometry <https://www.desy.de/~twhite/crystfel/manual-crystfel_geometry.html>`_
+ for details), and therefore show up in a dark brown colour. If you zoom in
+ closely, you will see that the pixel values are in parentheses.
+
+* Can you see Bragg spots in at least some of the frames? If there are no
+ spots whatsoever, crystallographic data processing will of course fail.
+
+
+
+4. Configure peak detection
+===========================
+
+Crystallographic data processing relies on Bragg peaks, so the first and most
+important data processing step is to identify spots (peaks) in the diffraction
+patterns. Click the next icon in the toolbar, **Peak detection**, to start the
+process by opening the peak detection parameters window.
+
+There is a choice of peak detection methods available in CrystFEL, which you
+can choose between using the menu button at the top. You can easily experiment
+with different methods and parameters to see what gives the best result for
+your data. The **Zaefferer gradient search (zaef)** is the fastest and
+simplest to configure (fewest parameters), but doesn't work well if the
+background varies a lot across the image. The **Radial background estimation
+(peakfinder8)** is the most commonly used method, but it has quite a large
+number of parameters.
+
+**Use the peak lists in the data files (hdf5/cxi)** is only used when
+the data files contain peak list information. This isn't the case for the
+SwissFEL example data, but is generally quite common because the process of
+"hit finding" (separating blank frames from those containing real crystal
+diffraction) itself begins by finding peaks. The peak search results from the
+hit-finding step can be used for CrystFEL's processing as well, so it would be
+a waste of time to do a new peak search! Note that processing steps like peak
+detection, which involve looking at every single pixel of the data, can become
+quite slow when the images are very large, which is the case here (16
+megapixels).
+
+The peak search results will be marked on the image with yellow squares, and
+the results will update "live" as you change the parameters (the new value
+takes effect once you press Enter or move the cursor to a different parameter).
+Adjust each parameter in turn and see how it affects the results. The accuracy
+of the peak search is one of the most critical factors in the success of the
+indexing process, so don't rush this step. Here are some tips for finding the
+right parameters:
+
+Threshold
+ Set this by zooming in closely to the image and looking at the general
+ background level compared to the level of real peaks, in a region where the
+ background is quite strong. Set the threshold about halfway between the
+ two.
+
+Minimum signal/noise ratio
+ Usually this should be a single-digit number. Start with 5, and reduce in
+ steps of 1 if peaks are not being found. Increase in steps of 1 if too many
+ peaks are found. If you find that you need a wildly different value, the
+ detector parameters might not be set correctly in the geometry file
+ (specifically ``adu_per_eV`` or ``adu_per_photon``), throwing off the
+ statistical analysis.
+
+Minimum number of pixels
+ Usually 2 or 3 is the right value. Decrease to 1 if the peaks are very
+ sharp indeed and really do only consist of one pixel. Increase beyond 3 if
+ the peaks are broad.
+
+Maximum number of pixels
+ The peak detection should not be very sensitive to this parameter, unless
+ something is wrong with the data. Set it to around 20.
+
+Local background radius
+ Usually 3 is correct, but increase in steps of 1 if too many peaks are
+ being found and the background is very inconsistent from pixel to pixel.
+
+Minimum/maximum resolution
+ Use these to restrict the peak search to an annular region in the image.
+ If there is a lot of noise near the centre of the image, use the minimum
+ resolution to ignore that area. If there is a lot of noise further out in
+ the frames, use the maximum resolution to cut that out. But beware: this
+ will affect CrystFEL's estimates of the maximum resolution of each
+ diffraction pattern.
+
+Half pixel shift
+ This toggle is necessary because different programs use different
+ conventions about coordinates. Some (including CrystFEL) consider the
+ coordinates *x,y* to refer to the point at distance *x,y* from the corner of
+ the detector panel, measured in units of the width of a pixel.
+ Other programs (including Cheetah) consider the coordinates *x,y* to refer
+ to the centre of the *x*-th, *y*-th pixel in the two directions. This option
+ compensates for the resulting discrepancy of half a pixel width in each
+ dimension. If you're taking peak search data from the image files, try both
+ options. With one of them, the marked peak locations will be visibly far
+ away from the real positions of the Bragg peaks.
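+
+The two coordinate conventions described under *Half pixel shift* can be
+related with a small sketch. It assumes zero-based pixel indices, so that the
+centre of pixel 0 lies half a pixel width from the corner of the panel. The
+function names are made up for illustration and are not part of CrystFEL.
+
```python
def centre_index_to_corner(x, y):
    """Convert pixel-centre coordinates to corner-based coordinates.

    Assumes zero-based pixel indices, so the centre of pixel 0 lies
    half a pixel width away from the corner of the panel.
    """
    return x + 0.5, y + 0.5


def corner_to_centre_index(x, y):
    """Inverse of centre_index_to_corner()."""
    return x - 0.5, y - 0.5


# A peak reported at the centre of pixel (10, 20), expressed as a
# distance from the panel corner in units of the pixel width
peak = centre_index_to_corner(10, 20)
print(peak)
```
+
+Applying the shift in the wrong direction, or omitting it when it is needed,
+displaces every peak by the same small amount in both dimensions.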
+
+Be sure to check the peak search results on more than one frame. Jump between
+frames at random to get a sampling of the entire dataset, because certain
+factors can vary with time throughout the dataset (strength of background,
+X-ray intensity and scattering strength of the crystals). Make sure that peaks
+are being found across the entire image, not just in one area, but remember
+that it's unlikely that a "real" crystal sample diffracts X-rays right to the
+corner of the detector.
+
+A final piece of advice. It's important not to miss too many real peaks, but
+it's also possible to "over-pick" the image, finding too many weak peaks. It's
+better to find fewer peaks, but to be more confident that those peaks are
+really "real", and close to the exact Bragg condition. Therefore, don't spend
+too long trying to refine the parameters to pick out every single
+near-invisible peak. It depends on the data collection parameters, but the
+peaks should form a geometrical pattern --- a fact which you can use to your
+advantage to know which peaks are real.
+
+**Spoiler:** the original paper corresponding to this dataset (read
+`here <https://journals.iucr.org/m/issues/2020/06/00/zf5013/index.html>`_)
+used *peakfinder8* with *threshold=50* and *minimum signal/noise ratio=5*.
+They don't say what values they used for the other parameters, suggesting that
+they used the default values: *minimum number of pixels=2*, *maximum number of
+pixels=200* (unimportant), *local background radius=3*, and the *minimum/maximum
+resolution* set to 0 and 2000 respectively.
+
+Once you're happy with the results, press **Confirm**. You can also press
+**Discard changes** at any point if you mess up.
+
+
+5. Index one frame
+==================
+
+We are now going to try to index the frames, which means determining the
+crystal orientation and lattice parameters for each diffraction pattern.
+Before trying to process the entire dataset, it's advisable to check that things
+are working well for just a handful of frames. The peak detection has already
+been optimised in the previous step, but indexing frames depends on many other
+factors as well.
+
+Click **Index this frame** to open the indexing dialogue box. For a first
+attempt, leave all the parameters under *Integration* and *Advanced Indexing*
+untouched, and concentrate on *Indexing*. The Unit cell file should say
+*"(None)"* (if it doesn't, click the 'Delete' button right next to it). Leave
+*Automatically choose the indexing methods* ticked, but un-tick *Attempt to
+find multiple lattices per frame* and *Retry indexing if unsuccessful*. This
+will make things slightly faster. The *Unit cell tolerances* are unimportant
+for now, as is the option *Check indexing solutions against reference cell*.
+Then click **Run**.
+
+The GUI will become unresponsive for a short while (be patient!). When it
+returns, one of two things will happen. The first possibility is that nothing
+looks any different, which indicates that the pattern could not be indexed. In
+this case, try again with other frames. If nothing at all can be indexed, then
+something is wrong. This should obviously not be the case for this tutorial!
+
+If the pattern can be indexed, you'll see the lattice parameters displayed in
+the log area of the GUI, and the diffraction pattern itself will be covered
+with green circles. These show where Bragg spots are expected to appear, based
+on the lattice parameters and crystal orientation calculated by the indexing
+algorithm. CrystFEL already does some checks to weed out bad indexing
+solutions, but you'll still have to examine the results to see if they make
+sense. This is something of an art form, but the main point is to see if the
+general patterns of reflections match up. For example, the following picture
+shows good alignment:
+
+.. image:: tutorial-images/reflection-alignment.png
+
+Notice that the green circles ("predicted" reflections), yellow boxes (peak
+search results) and the real peaks (as seen by eye) all form lines closely
+spaced in an approximately vertical direction, with much wider spacing in
+the horizontal direction. Although not all of the markers agree along each of
+the lines, they all agree on the spacing and direction. There are no "alien"
+reflections in the gaps between the real reflections, in either direction.
+Note also that there are some weaker peaks, not found by the peak search,
+which nevertheless agree with the indexing results. One or two reflections
+seem to be slightly misplaced, which is not a severe problem. In this case, it
+might be because the bandwidth (range of X-ray wavelengths) of the X-ray source
+was quite large, which is not fully taken into account by the calculation.
+You can also see that the reflections in the right-hand side of the picture
+(everything except the two rows furthest to the left) form a kind of ring,
+another pattern which is reproduced by the predictions, peak search results
+and real peaks.
+
+A bad sign is when the peaks and the indexing results both form lines, but
+those lines are in different directions. Another bad sign is if the patterns
+seem to match up very well on one side of the pattern, but very badly on the
+opposite side.
+
+If the diffraction pattern does not have clear patterns like the example above,
+it gets much harder to judge whether the indexing is correct or not. Therefore
+it's best to look for "pretty" patterns (or, at least, "pretty" regions of
+patterns) to judge the correctness of indexing.
+
+If you want to find the exact frame used for this example, it's the fourth
+frame from the beginning of the dataset, assuming you downloaded the single
+file suggested in step 1.
+
+**Tip:** The indexing options will be stored, so to index another pattern you
+can simply click *Index this frame* and hit the enter key.
+
+
+
+6. Index all the frames
+=======================
+
+Now that things are working for a few test frames, it's time to expand the
+processing to *every* frame in the dataset.
+
+Click **Index all frames** in the toolbar. The process is very similar to
+indexing one frame. In fact, the indexing parameters will be brought forward
+from the *Index this frame* dialogue box into this one. There are a couple of
+extra fields to fill in, though. First of all, you need to give a name for the
+job, which will be used for referring to the results. Type something
+meaningful for yourself in the box at the top, such as *index-all-nocell-1*.
+Check that the other indexing parameters are how you want them, then go to
+**Cluster/batch system**. Here you can control how the job is run. The data
+processing proper will take quite a long time and use a lot of CPU power, so
+if you have access to a cluster system then you should make use of it here.
+
+There are two options found in the **Batch system** menu. **SLURM** will make
+the GUI submit its jobs through the `SLURM <https://slurm.schedmd.com/>`_
+system. You'll have to give details such as the partition name and any
+relevant constraints - ask your system manager about these. If SLURM is not
+shown in this menu, it's because the SLURM libraries weren't available when
+CrystFEL was compiled - see `INSTALL.md <../../INSTALL.md>`_ or ask your
+facility or lab's computing team.
+
+If SLURM isn't available, you'll have to run "locally", which means to run on
+the same computer as the GUI itself. This is the right option
+if you're processing data on your own private computer, but it will probably
+cause trouble if you're using a shared computer. Therefore it's wise to make
+sure you know how your system is set up! For local processing, you only need
+to give the number of threads, which as a rule should be the same as the number
+of CPUs available.
+
+Back in the indexing parameters tab, it's a good idea to set a non-zero value
+for *Skip frames with fewer than ... peaks*. This will make things faster by
+skipping over frames that don't seem to have a plausible number of peaks to
+constitute a real diffraction pattern. Try setting a value of 15 here.
+
+For a reason that will become clear later, we'll run the indexing job using
+only Mosflm as the indexing engine. So, un-tick *Automatically choose the
+indexing methods*, expand the section *Select indexing methods and prior
+information* and make sure that only *MOSFLM* is selected. The checkboxes
+under *Prior unit cell* and *Prior lattice type* are irrelevant for now,
+because there is no prior information to use.
+
+Finally, in the **Notes** tab you will find a free-form text entry area. This
+is for you to use for your own notes. As the text says, anything you write
+here will be stored with the results of the job you're about to start.
+
+Once you're happy with everything, click **Run**. You should see a progress
+bar appear in the main GUI window. You can cancel the job at any time,
+obviously, by clicking the **Cancel** button. Otherwise, now is a great time
+to step away from your computer for a few minutes and make a cup of tea!
+When the job has finished, the progress bar will show 100% and the *Cancel*
+button will become a close button with a cross icon, which when pressed will
+permanently remove the progress bar.
+
+
+
+Advanced: Job directory contents
+--------------------------------
+
+The CrystFEL GUI places all the files related to your job into a subdirectory
+of your working directory. The subdirectory has whatever name you gave for the
+job (in this case ``index-all-nocell-1``), and you are free to inspect the
+contents. The exact contents depend on which 'backend' you used (*local* or
+*SLURM*) as well as the type of job (indexing, merging or ambiguity resolution,
+the latter two of which will be discussed later). Here they are for an
+indexing job with the *local* system::
+
+ [16:31] tutorial $ ls
+ crystfel.project index-all-nocell-1 jf-16M.geom
+ [16:31] tutorial $ ls index-all-nocell-1/
+ crystfel.stream files.lst notes.txt parameters.json run_indexamajig.sh stderr.log stdout.log
+ [16:31] tutorial $
+
+The *SLURM* backend splits the work into several blocks, each of which will
+have its own set of files. In this case, the filenames will have numbers
+appended: ``crystfel-0.stream``, ``crystfel-1.stream``, ``crystfel-2.stream``
+and so on (the number of files depends on the size of the dataset).
+
+The files are all plain text files, and therefore can be inspected with a text
+editor:
+
+crystfel.stream
+ This contains the output of the job, with metadata and analysis results for
+ each frame in turn. If you want to inspect the contents of this file, it's
+ best to use something like ``less`` because the file may be too big for a
+ text editor to handle. We will examine this file more in the next section.
+
+files.lst
+ This is the list of input frames for the job to work on.
+
+notes.txt
+ The contents of the free-form text entry in the job creation dialogue box.
+
+parameters.json
+ A record of all the values for the processing parameters, taking into
+ account the default values where appropriate.
+
+run_indexamajig.sh
+ This contains the command line for starting the indexing program, which
+ is called **indexamajig**. Instead of using the GUI, you can run the jobs
+ entirely from the command line, using this command line as a starting
+ template. The options are all documented in the manual - run
+ ``man indexamajig`` on a command line to access it.
+
+stderr.log and stdout.log
+ These contain the logs from the job. The **stderr.log** is the most
+ informative. It's good to check here to look for error and warning messages.
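+
+As a rough illustration of what such a command line might look like, the
+sketch below assembles one from the settings chosen in steps 4 and 6. The
+option spellings are given as we understand them from the ``indexamajig``
+manual, so check them against ``man indexamajig`` for your version. The
+function name and the thread count of 8 are arbitrary choices for this
+example, and the script written by the GUI will contain more options than
+are shown here.
+
```python
import os
import shlex


def build_indexamajig_command(file_list, geometry, stream, threads=None):
    """Assemble an indexamajig command line as a list of arguments."""
    if threads is None:
        # Rule of thumb from step 6: one thread per available CPU
        threads = os.cpu_count() or 1
    return [
        "indexamajig",
        "-i", file_list,
        "-g", geometry,
        "-o", stream,
        "-j", str(threads),
        # Peak search settings quoted in the "spoiler" of step 4
        "--peaks=peakfinder8",
        "--threshold=50",
        "--min-snr=5",
        "--min-pix-count=2",
        "--max-pix-count=200",
        "--local-bg-radius=3",
        "--min-res=0",
        "--max-res=2000",
        # Indexing settings from step 6
        "--indexing=mosflm",
        "--min-peaks=15",
    ]


command = build_indexamajig_command(
    "files.lst", "jf-swissfel-16M.geom", "crystfel.stream", threads=8)
print(shlex.join(command))
```
+
+Building the command as a list keeps each argument separate, which avoids
+quoting problems if a filename contains spaces.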
+
+
+
+7. Look at the indexing results
+===============================
+
+Just like before with one frame, you should check that the batch indexing job
+is giving good results. Near the top of the main GUI window, you'll find a
+drop-down menu button. The normal position here up to now has been
+**Calculations within GUI**. Now, click the button and select the name that
+you gave to your indexing job in the previous step, which should appear in the
+pop-up menu. Now, flick between the frames like you did in step 4 to check the
+peak detection results, and examine the accuracy of indexing like you did in
+step 5.
+
+Actually, you don't have to wait for the indexing job to complete before doing
+this. It's perfectly fine to look at the results while it's still running.
+However, keep in mind that the indexing job will start at the beginning of the
+dataset and work through to the end. So, if you jump to a frame at the end of
+the dataset, you'll see a blank frame (and the message
+``Failed to load chunk from stream. Just displaying the image.`` in the log
+area). Frames nearer the start of the dataset, which have already been
+processed by the batch job, will show up correctly.
+
+If no results at all appear, check the log area for other errors. If the job
+is still running, simply wait a bit longer. If the job has finished (indicated
+by a cross icon next to the progress bar, instead of a "cancel" button) but you
+still don't see any results, try selecting **Tools->Rescan streams**, which
+tells the GUI to update its results index.
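+
+Besides flicking through frames, a quick numerical check is to count how many
+frames in the stream were indexed. The sketch below assumes the usual stream
+layout, in which each frame is wrapped in ``----- Begin chunk -----`` and
+``----- End chunk -----`` markers and carries an ``indexed_by`` line that reads
+``none`` for unindexed frames. The sample text is a heavily abbreviated,
+made-up excerpt, and the function name is hypothetical.
+
```python
def count_indexed(lines):
    """Return (total_frames, indexed_frames) for the lines of a stream."""
    total = 0
    indexed = 0
    for line in lines:
        line = line.strip()
        if line == "----- Begin chunk -----":
            total += 1
        elif line.startswith("indexed_by = ") and not line.endswith("none"):
            indexed += 1
    return total, indexed


# Heavily abbreviated, made-up stream excerpt for demonstration only
sample = """\
----- Begin chunk -----
Image filename: run7.JF06T32V01.h5
indexed_by = mosflm-nolatt-nocell
----- End chunk -----
----- Begin chunk -----
Image filename: run7.JF06T32V01.h5
indexed_by = none
----- End chunk -----
"""
print(count_indexed(sample.splitlines()))
```
+
+For a real job, pass an open file instead of the sample, for example
+``count_indexed(open("index-all-nocell-1/crystfel.stream"))``. The file is
+read line by line, so its size should not be a problem.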
+
+
+
+8. Determine the lattice parameters
+===================================
+
+Once the indexing job has completed, select the job from the menu (it should
+still be selected from the last step) and click **Determine unit cell**.
+After a moment of calculation, the **Unit Cell Explorer** will open. This
+tool displays histograms of all six lattice parameters, with the histogram
+peaks coloured according to the lattice centering. Just like the previous
+step, you don't technically need to wait until the job is finished before
+looking at these graphs. However, the graphs will become smoother and clearer
+when they include the largest possible number of results. Click and drag the
+graphs to move them around, use the scroll wheel to zoom in and out, and press
+the plus or minus keys to change the binning.
+
+**Tip:** If you have difficulty distinguishing the colours representing the
+different centering types, you can click the coloured squares in the top right
+hand corner of the Cell Explorer window. This will cycle the corresponding
+colour between light grey, black and the original colour.
+
+In favourable cases, you'll see a single peak for each of the six parameters.
+In such cases, it's obvious that the most popular values correspond to the true
+cell parameters. In other cases, such as this one, you will see multiple peaks
+for each parameter, possibly in different colours (corresponding to different
+types of lattice centering). This happens because there are different ways
+that the crystal lattice can be described. For example:
+
+* For a cubic I (body-centered) lattice, you will often also see a rhombohedral
+ cell (all axis lengths the same, all angles the same) with an angle of
+ 109.5°.
+
+* For a cubic F (face-centered) lattice, you will often also see a rhombohedral
+ cell (all axis lengths the same, all angles the same) with an angle of 60°.
+
+* For a monoclinic C (base-centered) lattice, you might see up to three
+ different representations of the unit cell, corresponding to the three cell
+ choices.
+
+* For any centered cell (any of A, B, C, F, I or even "H"), you will usually
+ see at least one primitive cell which represents the same lattice.
+
+There are, of course, many other possibilities. For more details of these,
+consult any basic crystallography textbook.
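+
+The rhombohedral angles quoted in the list above can be verified with a few
+lines of arithmetic. This sketch uses the standard textbook choice of
+primitive vectors for the body-centered and face-centered cubic lattices, in
+units of the cubic cell edge, and computes the angle between two of them.
+
```python
import math


def angle_between(u, v):
    """Angle in degrees between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))


# Standard primitive vectors, in units of the cubic cell edge
body_centred = [(-0.5, 0.5, 0.5), (0.5, -0.5, 0.5), (0.5, 0.5, -0.5)]
face_centred = [(0.0, 0.5, 0.5), (0.5, 0.0, 0.5), (0.5, 0.5, 0.0)]

print(round(angle_between(body_centred[0], body_centred[1]), 1))  # 109.5
print(round(angle_between(face_centred[0], face_centred[1]), 1))  # 60.0
```
+
+In both cases the three primitive vectors have equal lengths and equal angles
+between them, which is why the primitive cell appears rhombohedral.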
+
+In addition to this, there may be a real mixture of different cell parameters,
+and things can get quite complicated. See for example `this paper
+<https://www.nature.com/articles/s41467-018-05953-4>`_, where the
+sample contained a mixture of unrelated crystals with different structures, or
+`this paper <https://www.nature.com/articles/nature20599>`_, where the
+structure actually changed during the experiment.
+
+In this case, you can see a C-centered cell (in purple) alongside a few
+different primitive (black) cells. To start unraveling the situation, hold down
+shift and click/drag within the histograms, to select a range of values.
+With a range selected in one histogram, the other histograms will be calculated
+from the indexing results within that range. Using this feature, you can
+"explore" whether peaks for one parameter correspond to peaks for other
+parameters. Start by selecting the large purple peak in the *b* axis length,
+then select the largest peaks which still show up for the other parameters.
+You should quickly be able to narrow things down to a single set of lattice
+parameters. In this case, a C-centered lattice, apparently orthorhombic (the
+angles are all 90°, within the widths of the peaks), with axis lengths of
+40.30, 180.5 and 142.8 Angstroms.
+
+**Tip:** To de-select a region chosen using shift+drag, simply hold shift and
+click once within the histogram area.
+
+The region selection feature has another purpose, which is to choose which
+areas of the graphs to fit curves to. After getting acquainted with the
+histogram navigation, zoom in on each of the six peaks, centre them, set the
+binning such that they appear as smooth humps, and finally select each one.
+Then press Ctrl+F or select **Tools->Fit cell**. The result should look like
+this:
+
+.. image:: tutorial-images/cell-explorer.png
+
+**Tip:** Symmetrical peaks with narrow distributions (standard deviation much
+less than 1 Angstrom), as in this screenshot, are a good sign. If the
+peaks lean to one side or are split into two, it *could* be a real effect, but
+usually this indicates that the detector geometry needs further refinement.
+
+How the cell is represented depends on the indexing algorithm. In step 6, we
+chose to use Mosflm alone because Mosflm understands lattice centering, and is
+able to go directly to the conventional representation of the cell with all
+angles 90°. Other indexing algorithms can't do this, and will give you the
+primitive representation (or rather, *one of the primitive representations*) of
+the cell. Using only one indexing algorithm for this stage, rather than a
+combination, also helps to simplify matters.
+
+Keep in mind that the true symmetry of the structure is not known at this
+stage. The fact that all three angles are close to 90° does **not**
+necessarily mean that the lattice is *really* orthorhombic, i.e. that it meets
+the minimum symmetry requirement of a twofold rotation symmetry along each axis
+(point group 222, space group C222 or C222\ :sub:`1`\ ). It could also be that the
+angles are 90° by chance, and the true symmetry is lower. The final
+determination can only be made once the structure is solved. Unlike other
+programs, CrystFEL will neither suggest a space group nor require you to
+nominate one.
+
+Once you're happy, go to **File->Create unit cell file** and save the results
+somewhere memorable. Note the option to **Enforce lattice type** at the bottom
+of the file selector. With the warning of the previous paragraph in mind, here
+you can enforce the *metric* symmetry by applying the appropriate constraints
+on the lattice parameters. For example, selecting *orthorhombic*, which you
+should do in this case, will round all the angles to exactly 90°.
+
+
+
+Advanced: Stream file contents
+------------------------------
+
+In the last *Advanced* section, we looked at the contents of the job directory.
+In this one, we'll take a closer look at the stream itself. The stream is a
+plain text file, so you can examine it using standard text handling tools, or
+easily write scripts to process it.
+
+Open the stream file in the job directory using ``less``. You can also use
+your normal text editing tool, but beware that the file is very large, and
+might cause memory problems. If you ran the job locally (see step 6), the
+command will be ``less index-all-nocell-1/crystfel.stream`` (obviously, with
+your own choice of job name substituted for ``index-all-nocell-1``). If you
+ran the job using SLURM, there might be multiple streams. In this case, take
+the first one: ``less index-all-nocell-1/crystfel-0.stream`` (the numbering
+starts from zero).
+
+The CrystFEL version number and ``indexamajig`` command line are stored at the
+top of the stream. This can help you to reproduce an old result, if it ever
+becomes necessary. Then comes a record of the entire geometry file. This
+might be followed by even more *audit* information, depending on the indexing
+options selected. When indexing using a prior unit cell (which will be done in
+the next step), there will be a record of the target unit cell. If the
+indexing methods are chosen automatically, the selected indexing methods will
+also be recorded.
+
+The indexing results come after all of this "header" information. You will see
+that it takes the form of a series of *chunks*, delimited by lines containing
+only ``----- Begin chunk -----`` and ``----- End chunk -----``. There is one
+chunk per frame of data, and you will see various items of metadata including
+the filename, "Event ID" (which identifies the frame when there is more than
+one frame of data per file), radiation properties and so on. This will be
+followed by the peak search results, with the location and intensity of each
+peak.
+
+If the frame was not "indexable", the chunk will end here. If the indexing
+algorithm was successful, there will be one or more crystal records, between
+``--- Begin crystal`` and ``--- End crystal`` markers. If multiple overlapping
+crystal diffraction patterns were found in a single frame, there will be more
+than one of these. Within the crystal record, you will see the lattice
+parameters and other analysis results including a resolution estimate. Below
+that, there will be the integration results - a list of Miller indices with
+intensity, error estimate and location of each *predicted* Bragg peak.
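+
+Because the markers described above are plain text, the indexing rate can be
+estimated with a short script. The following Python sketch relies only on the
+marker lines quoted in this section. The function name and the sample lines
+are hypothetical, and the script is not part of CrystFEL::
+
+  def count_chunks(lines):
+      """Return (chunks, chunks with a crystal, crystals in total)."""
+      n_chunks = n_indexed = n_crystals = 0
+      crystals_in_chunk = 0
+      for line in lines:
+          line = line.rstrip("\n")
+          if line == "----- Begin chunk -----":
+              n_chunks += 1
+              crystals_in_chunk = 0
+          elif line.startswith("--- Begin crystal"):
+              n_crystals += 1
+              crystals_in_chunk += 1
+              if crystals_in_chunk == 1:
+                  n_indexed += 1   # count each indexed frame once
+      return n_chunks, n_indexed, n_crystals
+
+  # Tiny made-up example: one frame with two crystals, one unindexed frame
+  sample = [
+      "----- Begin chunk -----",
+      "--- Begin crystal", "--- End crystal",
+      "--- Begin crystal", "--- End crystal",
+      "----- End chunk -----",
+      "----- Begin chunk -----",
+      "----- End chunk -----",
+  ]
+  chunks, indexed, crystals = count_chunks(sample)
+  print("%i frames, %i indexed, %i crystals" % (chunks, indexed, crystals))
+
+To use it on real data, pass an open file object for the stream instead of
+the ``sample`` list.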
+
+The cell parameters for each crystal are on a line like this::
+
+ Cell parameters 4.02822 18.05395 14.28352 nm, 90.06266 90.05042 90.06303 deg
+
+This line tells you the axis lengths and inter-axial angles for the unit cell
+for the individual crystal. You will also see lines like these::
+
+ lattice_type = orthorhombic
+ centering = C
+ unique_axis = *
+
+Using standard Unix text tools such as ``grep``, you can extract this
+information in text form::
+
+ $ grep "Cell parameters" index-all-nocell-1/crystfel.stream | head -n 10
+ Cell parameters 4.02822 18.05395 14.28352 nm, 90.06266 90.05042 90.06303 deg
+ Cell parameters 18.02935 4.02939 14.30931 nm, 90.01359 90.18854 90.03249 deg
+ Cell parameters 4.02721 18.10237 14.28593 nm, 90.00089 90.01227 89.93198 deg
+ Cell parameters 17.96378 4.04633 14.26642 nm, 90.38523 90.54669 89.99014 deg
+ Cell parameters 4.03392 18.09072 14.30990 nm, 89.92036 90.19984 90.11342 deg
+ Cell parameters 4.04645 9.23522 14.34554 nm, 90.28009 90.53020 101.88863 deg
+ Cell parameters 4.03235 18.03866 14.30363 nm, 90.07003 90.10470 89.96150 deg
+ Cell parameters 4.06850 9.55563 14.47150 nm, 92.04542 90.76992 100.75923 deg
+ Cell parameters 4.03573 18.04051 14.29689 nm, 89.94268 90.09296 90.08017 deg
+ Cell parameters 4.03536 18.04836 14.28127 nm, 89.88508 89.97842 90.07861 deg
+ $ grep "centering" index-all-nocell-1/crystfel.stream | head -n 10
+ centering = C
+ centering = C
+ centering = C
+ centering = C
+ centering = C
+ centering = P
+ centering = C
+ centering = P
+ centering = C
+ centering = C
+ $
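+
+If you prefer a script to ``grep``, the same lines can be parsed in Python.
+The sketch below assumes the "Cell parameters" line layout shown above,
+converts the axis lengths from nm to Angstroms and prints the median of each
+parameter. The function name is hypothetical, and only three of the lines
+listed above are used as sample input::
+
+  import re
+  import statistics
+
+  CELL_RE = re.compile(
+      r"^Cell parameters\s+(\S+)\s+(\S+)\s+(\S+)\s+nm,"
+      r"\s+(\S+)\s+(\S+)\s+(\S+)\s+deg")
+
+  def parse_cell(line):
+      """Return (a, b, c, al, be, ga) in Angstroms/degrees, or None."""
+      m = CELL_RE.match(line.strip())
+      if m is None:
+          return None
+      a, b, c, al, be, ga = (float(x) for x in m.groups())
+      return (a * 10.0, b * 10.0, c * 10.0, al, be, ga)  # 1 nm = 10 A
+
+  lines = [
+      "Cell parameters 4.02822 18.05395 14.28352 nm, "
+      "90.06266 90.05042 90.06303 deg",
+      "Cell parameters 4.02721 18.10237 14.28593 nm, "
+      "90.00089 90.01227 89.93198 deg",
+      "Cell parameters 4.03392 18.09072 14.30990 nm, "
+      "89.92036 90.19984 90.11342 deg",
+  ]
+  cells = [c for c in map(parse_cell, lines) if c is not None]
+  medians = [statistics.median(column) for column in zip(*cells)]
+  print(" ".join("%.2f" % x for x in medians))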
+
+
+
+9. Index with prior unit cell
+=============================
+
+To have a dataset suitable for merging, we need to know that all of the
+patterns are indexed using the same lattice parameters. To achieve this, give
+the unit cell file from step 8 in the indexing parameters dialogue box. The
+indexing results will be compared to these parameters, and accepted only if
+they match, or can be made to match by a simple transformation.
+
+There is another reason to give lattice parameters in advance, which is to
+increase the indexing success rate. Several indexing algorithms work better if
+they know which lattice parameters to search for, and in fact some algorithms
+*only* work when the lattice parameters are known in advance.
+
+Create a new indexing job like before, by clicking **Index all patterns**,
+except this time click the **Unit cell file** button and select the file you
+saved in step 8. Make sure that *Check indexing solutions against reference
+cell* is ticked. Give the job a descriptive name such as *index-all-cell-1*.
+
+The *Unit cell tolerances* are now relevant, because they control how closely
+the lattice has to match the reference parameters. In almost all cases, it's
+best to leave these on the default values of 5% for the axis lengths and 1.5°
+for the angles.
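+
+To get a feeling for what these tolerances mean, here is a simplified Python
+sketch of a direct comparison against the reference cell from step 8. It is
+an assumption-laden illustration, not CrystFEL's actual comparison routine:
+in particular, it does not try any of the transformations mentioned above,
+and the function name is hypothetical::
+
+  def within_tolerance(cell, ref, length_tol=0.05, angle_tol_deg=1.5):
+      """Compare (a, b, c, al, be, ga) tuples: Angstroms and degrees."""
+      for x, x_ref in zip(cell[:3], ref[:3]):
+          if abs(x - x_ref) > length_tol * x_ref:
+              return False   # axis length differs by more than 5%
+      for ang, ang_ref in zip(cell[3:], ref[3:]):
+          if abs(ang - ang_ref) > angle_tol_deg:
+              return False   # angle differs by more than 1.5 degrees
+      return True
+
+  reference = (40.30, 180.5, 142.8, 90.0, 90.0, 90.0)
+  # Two cells from the stream listing in step 8, converted to Angstroms
+  centered = (40.2822, 180.5395, 142.8352, 90.06266, 90.05042, 90.06303)
+  primitive = (40.4645, 92.3522, 143.4554, 90.28009, 90.53020, 101.88863)
+
+  print(within_tolerance(centered, reference))
+  print(within_tolerance(primitive, reference))
+
+The primitive cell fails this direct test, although CrystFEL may still accept
+such a cell if a simple transformation brings it into agreement with the
+reference.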
+
+In step 6, it was preferable to index all the patterns using the same indexing
+engine (Mosflm). This helped to make the lattice parameter histograms
+clean and easy to interpret. Now that the lattice parameters are known, it's
+better to try each frame with as many indexing engines as possible, to get the
+highest possible success rate. If the first indexing algorithm fails, the
+next in line will be tried, and so on. You can select the indexing algorithms
+yourself (as in step 6), or just select **Automatically choose the indexing
+methods**.
+
+In addition to this, CrystFEL can be told to retry indexing with the same
+algorithm, if it doesn't work at first, after deleting the weakest peaks from
+the peak list. This will be tried up to five times (for a total of six
+indexing attempts with each method). To enable this, select **Retry indexing if
+unsuccessful**.
+
+On top of all of this, CrystFEL can also be told to retry indexing if it
+*succeeds*, having deleted the peaks which are "explained" by the first
+indexing result. This gives a chance of finding a second overlapping
+diffraction pattern in the frame. For this, select **Attempt to find multiple
+lattices per frame**.
+
+All of this might add up to over 30 indexing attempts per frame. If this
+sounds like a lot of calculation to you, you're correct! There's a trade-off
+to be made between computing time and probability of indexing success. If you
+want things to go faster, select only one indexing algorithm (*Xgandalf* is a
+good choice) and disable *retry*. You should enable multi-lattice indexing if
+there really are lots of multiple lattice frames in your dataset. Otherwise,
+leaving it off will usually give better (and faster) results.
+
+Once you're happy, check the batch system parameters and press **Run**.
+Check the results like before --- remember that you don't have to wait for the
+entire run to finish before examining the diffraction patterns. The unit cell
+histograms should be very clear now, with a nice single peak for each
+parameter.
+
+Don't forget that you can also use **Index this frame** to test your
+parameters for just one frame, before launching the big processing job!
+
+
+
+Advanced: Check and optimise the detector geometry
+--------------------------------------------------
+
+This step is marked as *Advanced* because it needs some work outside the
+CrystFEL GUI, but it's a very good idea to do this check.
+
+Even if the geometry file is supposedly correct for the experiment, it's best
+to check that, for example, the beam position hasn't drifted. Fortunately,
+CrystFEL has already done most of the work for you. After indexing each
+pattern, CrystFEL runs a short optimisation procedure, adjusting the unit cell
+parameters, orientation and beam centre position to get the best possible
+agreement between the observed and predicted peak locations at the same time as
+making sure that the observed peaks correspond to Bragg positions as closely as
+possible. The required beam shift (equivalently considered as a shift of the
+detector) is recorded in the stream for each pattern (see the *Advanced* part
+of step 8)::
+
+ $ grep "predict_refine/det_shift" index-all-cell-1/crystfel.stream | head -n 5
+ predict_refine/det_shift x = 0.020 y = 0.010 mm
+ predict_refine/det_shift x = -0.028 y = -0.017 mm
+ predict_refine/det_shift x = 0.028 y = 0.048 mm
+ predict_refine/det_shift x = 0.022 y = -0.014 mm
+ predict_refine/det_shift x = -0.016 y = -0.002 mm
+ $
+
+A script provided with CrystFEL called ``detector-shift`` will plot these values.
+Simply run it on the stream::
+
+ $ cp ~/crystfel/scripts/detector-shift .
+ $ chmod +x detector-shift
+ $ ./detector-shift index-all-cell-1/crystfel.stream
+ Mean shifts: dx = 0.0062 mm, dy = -0.0097 mm
+
+A window should open, which shows the detector shifts as a scatter plot:
+
+.. image:: tutorial-images/detector-shift.png
+
+The cyan point marks the origin (0,0), and the pink point marks the mean of all
+the offsets. The pixel size of the `Jungfrau detector
+<https://www.psi.ch/en/lxn/jungfrau>`_, which was used for this experiment, is
+75 µm, so almost all of the offsets are less than 1 pixel, and the average
+offset is very much less than 1 pixel. Therefore, no further refinement is
+required.
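+
+The comparison with the pixel size can also be made numerically. The Python
+sketch below is not the ``detector-shift`` script itself, just a minimal
+illustration that assumes the ``predict_refine/det_shift`` line layout shown
+above and the 75 µm pixel size. It uses the five lines from the ``grep``
+output as sample input, and the function name is hypothetical::
+
+  import re
+
+  # Tolerant of extra spaces around the values
+  SHIFT_RE = re.compile(
+      r"predict_refine/det_shift\s+x\s*=\s*(\S+)\s+y\s*=\s*(\S+)\s+mm")
+  PIXEL_SIZE_MM = 0.075  # 75 um Jungfrau pixel, as stated above
+
+  def mean_shift(lines):
+      """Return the mean (dx, dy) in mm over all det_shift lines."""
+      shifts = []
+      for line in lines:
+          m = SHIFT_RE.search(line)
+          if m:
+              shifts.append((float(m.group(1)), float(m.group(2))))
+      if not shifts:
+          raise ValueError("no det_shift lines found")
+      n = len(shifts)
+      return (sum(s[0] for s in shifts) / n,
+              sum(s[1] for s in shifts) / n)
+
+  sample = [
+      "predict_refine/det_shift x = 0.020 y = 0.010 mm",
+      "predict_refine/det_shift x = -0.028 y = -0.017 mm",
+      "predict_refine/det_shift x = 0.028 y = 0.048 mm",
+      "predict_refine/det_shift x = 0.022 y = -0.014 mm",
+      "predict_refine/det_shift x = -0.016 y = -0.002 mm",
+  ]
+  dx, dy = mean_shift(sample)
+  print("dx = %.4f mm (%.2f px), dy = %.4f mm (%.2f px)"
+        % (dx, dx / PIXEL_SIZE_MM, dy, dy / PIXEL_SIZE_MM))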
+
+Just for reference, here is how the graph might look if the offset were larger:
+
+.. image:: tutorial-images/detector-shift-2.png
+
+Notice that the cluster of points is significantly displaced from the origin.
+This offset has already been taken into account by CrystFEL when calculating
+the position of Bragg peaks, but the results will be better overall if the
+geometry is correct from the start of the process. In this case, it would be a
+good idea to update the geometry file. The ``detector-shift`` script can fix the
+geometry file for you::
+
+ $ ~/crystfel/scripts/detector-shift index-all-cell-1/crystfel.stream jf-swissfel-16M.geom
+ Mean shifts: dx = -0.14 mm, dy = -0.25 mm
+ Applying corrections to jf-swissfel-16M.geom, output filename jf-swissfel-16M-predrefine.geom
+ default res 9097.525473
+
+The updated geometry file is called ``jf-swissfel-16M-predrefine.geom``, as the
+script tells you. If you process this dataset again, use this new geometry
+file.
+
+
+
+Advanced: Check for detector saturation
+---------------------------------------
+
+Like the previous section, this step is marked as *Advanced* because it needs
+some work outside the CrystFEL GUI, but it's a very good idea to do this check.
+
+An unfortunate feature of some detectors used at FEL facilities is that the
+dynamic range is quite small. There will probably be many saturated reflections
+("overloads"), and you need to exclude these when you merge the data. Another
+small program distributed with CrystFEL, called ``peakogram-stream``, will help
+you judge the maximum intensity to allow for a reflection. Run it on your
+stream, like this::
+
+ $ cp ~/crystfel/scripts/peakogram-stream .
+ $ chmod +x peakogram-stream
+ $ ./peakogram-stream -i index-all-cell-1/crystfel.stream
+
+A graph like this should appear:
+
+.. image:: tutorial-images/saturation.png
+
+The vertical axis represents the highest pixel value in each reflection and the
+horizontal axis represents resolution. The colour scale represents the density
+of points. In this case, there does not seem to be a hard cutoff, which
+indicates that the dynamic range of the detector was large enough to record all
+of the reflections. Just for reference, here is how the graph might look with
+severe saturation:
+
+.. image:: tutorial-images/saturation-2.png
+
+See the dense "cloud" of points with "Reflection max intensity" over about
+7000? Those are the saturated reflections, which have higher intensities than
+the detector can measure and therefore "saturate" at a high value. The reason
+it's a cloud and not a hard cutoff is that the images have had "pedestal"
+values subtracted from each pixel, and the pedestal values vary from pixel to
+pixel, panel to panel and even day to day. To avoid including these saturated
+values, you would have to reject reflections peaking over 7000 detector units.
+This can be done as part of the merging step, described in the next section.
+
+
+
+10. Merge the intensities
+=========================
+
+In the merging step, the individual sets of reflection intensities from each
+pattern will be combined to make a single set of average intensities. The
+most basic method is literally to take the average value of the intensity for
+each reflection (including its symmetry equivalents), but quite a lot of
+modelling is possible during this process. The simplest enhancement is to
+scale the intensities up if they come from overall weaker patterns, and down
+if they come from stronger patterns. After that, the rate
+of intensity falloff with resolution can be measured for each pattern, and
+steeper falloffs (Debye-Waller factors) compensated for. In addition to that,
+quite complicated models of the diffraction physics can be used, in which we
+take into account that not all reflections are at the exact Bragg condition.
+At the same time, different strategies are available for detecting and
+rejecting patterns which appear to be outliers.
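+
+To make the "most basic method" concrete, here is a toy Python sketch of
+plain averaging over symmetry equivalents. It is not how *process_hkl* is
+implemented. Point group *mmm* is assumed purely for illustration, so that
+all sign combinations of (h, k, l) are equivalent, and the intensities are
+made-up numbers::
+
+  from collections import defaultdict
+
+  def asymmetric_key(h, k, l):
+      """One representative per set of mmm equivalents (assumed)."""
+      return (abs(h), abs(k), abs(l))
+
+  def merge(patterns):
+      """Average the intensities of equivalent reflections."""
+      sums = defaultdict(float)
+      counts = defaultdict(int)
+      for reflections in patterns:
+          for (h, k, l), intensity in reflections:
+              key = asymmetric_key(h, k, l)
+              sums[key] += intensity
+              counts[key] += 1
+      return {key: sums[key] / counts[key] for key in sums}
+
+  # Three hypothetical patterns, each a list of ((h, k, l), intensity)
+  patterns = [
+      [((1, 2, 3), 100.0), ((2, 0, 0), 50.0)],
+      [((-1, 2, -3), 120.0)],
+      [((1, -2, 3), 80.0), ((-2, 0, 0), 70.0)],
+  ]
+  merged = merge(patterns)
+  for key in sorted(merged):
+      print(key, merged[key])
+
+Scaling, Debye-Waller correction and outlier rejection would all act on the
+individual intensities before this averaging step.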
+
+All of these methods are available via the CrystFEL GUI. Start by clicking
+**Merge** and selecting *Simple merging (process_hkl)* next to *Model*.
+This will use the simplest possible merging strategy, which will be very fast.
+
+Give the job a name, like before, something like *merge-1*. You need to
+select which set of indexing results to use as the input for merging, which
+is done using the drop-down menu in the top right corner. Select your latest
+and best result here (e.g. *index-all-cell-1*).
+
+You will have to tell CrystFEL which symmetry group to merge the reflection
+data according to. This step requires some crystallographic knowledge, but the
+CrystFEL GUI will help you. We will proceed on the assumption that our initial
+hypothesis of an orthorhombic cell was correct. Click the symmetry selector
+button (labelled *Click to choose*) and choose *orthorhombic* from the *Lattice
+type* menu. Select *Centrosymmetric*, which will tell CrystFEL to merge
+Friedel pairs of reflections to get higher overall data quality. To preserve
+an anomalous signal, you would instead select *Sohnke*. You cannot choose both
+(obviously). This will leave only one option under *Possible point groups*,
+which is *mmm*. Click the button to select it.
+
+You will see that several other options are available in the *Merge* parameters
+dialogue box, which will not be needed for this first demonstration. Un-tick
+everything. The *Detector saturation cutoff* option deserves a special
+mention, even though it is not needed in this case - how to check and determine
+this value was discussed in the second advanced section of step 9.
+
+Like with indexing, you have the option of running the job locally or via a
+cluster system, and a place to write any notes of your own. Once you're happy,
+press **Run**. Since *process_hkl* is extremely fast, the job will be finished
+very quickly.
+
+If you like, try a second round of merging. This time, use *No partialities
+(unity)* for the *Model*, select *Scale intensities*, *Debye-Waller scaling*
+and *Reject bad patterns according to ΔCC½*. This set of choices is the
+standard "basic" merging strategy for serial crystallography data, and will
+take a little bit longer to process.
+
+
+
+Advanced: Resolve indexing ambiguity
+------------------------------------
+
+Although, fortunately, the tutorial dataset is not subject to an indexing
+ambiguity, the tutorial would not be complete without mentioning the topic.
+
+An indexing ambiguity occurs when there are multiple *symmetrically
+non-equivalent* ways to assign Miller indices to the reflections for a certain
+choice of lattice parameters. Indexing algorithms rely on the positions of the
+Bragg spots, but not their intensities. For some structures, there are two
+(sometimes even more) ways to index each pattern, each of which results in
+the same spot positions but different intensities. If you were to merge data
+from such a structure without taking this into account, the merged intensity
+for each reflection would be a mixture of the two unrelated reflections. The
+first warning of this happening might be that you can't solve the structure and
+that the structure solution/refinement program produces a "twin warning".
+
+The ambiguity is difficult to resolve because there is a lot of noise
+associated with each individual measurement in serial crystallography. Only by
+merging large numbers of measurements do we arrive at precise estimates of the
+structure factor moduli. In a similar way, resolving an indexing ambiguity
+requires making many inter-pattern correlation measurements in a clustering
+algorithm. For some discussion, see section 4 of `this paper
+<https://journals.iucr.org/j/issues/2016/02/00/zd5001/index.html>`_, and
+follow the references.
+
+Resolving an indexing ambiguity in CrystFEL works quite similarly to the other
+processing stages. Press **Indexing ambiguity**, give the job a name, and
+select the input (any indexing result). You will need to specify the *true*
+symmetry of the structure, as well as either the *apparent* symmetry (taking
+into account the ambiguity) or the transformation which describes the ambiguity
+itself. How to know the correct values here depends on the individual
+scenario, and can involve quite a lot of symmetry theory. Running the job will
+produce another "indexing result" which can be used as input for the merging
+step.
+
+
+
+11. Calculate figures of merit
+==============================
+
+We need to put some numbers on the quality of the merged dataset. To do this,
+press **Figures of merit**. Click the arrow next to *Results to show* and put
+a tick next to your merged dataset. Do the same for *Figures of merit to show*
+and choose some figures of merit. To start with, use *I/σ(I)*, R\ :sub:`split`
+and CC\ :sub:`½` - noting that you can select multiple figures of merit at
+once, and indeed multiple datasets.
+
+Next, you need to specify a unit cell file. Press the file selector button and
+choose your latest unit cell file for the dataset - it's important that it
+matches the results.
+
+You can choose the resolution range to calculate the figures of merit over.
+For a first look, press *Reset to entire data* to show everything, and set the
+*Number of resolution bins* to 20. Leave the minimum I/sigI at *-inf* and the
+*Minimum number of measurements per reflection* at 1, then press Calculate.
+
+The results will show up in the log area, for example::
+
+ Overall I/sigI: 3.543090 (13263 reflections)
+ 1/d / nm^-1 I/sigI num refl
+ 0.889637 5.060389 1029
+ 1.681349 6.396673 966
+ 2.008833 7.230766 900
+ 2.251608 7.066912 883
+ 2.450181 6.684656 857
+ 2.620610 5.167587 797
+ 2.771209 4.746473 824
+ 2.906928 3.310311 787
+ 3.030987 2.678650 805
+ 3.145615 2.141694 764
+ 3.252427 1.328863 746
+ 3.352635 0.129103 710
+ 3.447177 1.141227 735
+ 3.536791 -1.042014 712
+ 3.622075 1.282934 636
+ 3.703517 1.081464 466
+ 3.781523 0.121833 306
+ 3.856435 0.344556 203
+ 3.928542 0.911983 112
+ 3.998093 -0.898763 25
+
+From here, you could cut/paste the results into a spreadsheet, or save them
+somewhere.
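+
+The first column of the table is 1/d in nm\ :sup:`-1`. If you are more used
+to thinking in terms of resolution in Angstroms, the conversion is simply
+d = 10 / (1/d). A minimal Python sketch, using three values taken from the
+table above (the function name is hypothetical)::
+
+  def resolution_angstrom(one_over_d_per_nm):
+      """Convert 1/d in nm^-1 to d in Angstroms (1 nm = 10 Angstroms)."""
+      return 10.0 / one_over_d_per_nm
+
+  for q in (0.889637, 3.352635, 3.998093):
+      print("%.6f nm^-1 -> %.2f A" % (q, resolution_angstrom(q)))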
+
+Crystallographers love to spend hours arguing over the interpretation of
+figures of merit, such as what constitutes a "good" value, and how the
+resolution limit should be defined. This tutorial isn't going to enter that
+argument, but you can see that the "real" data in the above example doesn't
+extend much beyond a 1/d value of about 3.3 nm\ :sup:`-1` (or 3 Angstroms
+resolution). So it makes sense to trim the resolution range a bit and
+re-calculate the figures of merit. Here is R\ :sub:`split` up to 3 Angstroms
+resolution, in 15 bins::
+
+ Overall Rsplit: 0.352347 (4290 reflections)
+ 1/d / nm^-1 Rsplit num refl
+ 0.823014 0.413106 666
+ 1.530548 0.309372 465
+ 1.828202 0.264401 388
+ 2.048945 0.220279 399
+ 2.229526 0.275009 343
+ 2.384527 0.278238 326
+ 2.521500 0.324522 241
+ 2.644945 0.321376 201
+ 2.757788 0.593567 207
+ 2.862055 0.521963 211
+ 2.959214 0.672673 187
+ 3.050368 0.875012 176
+ 3.136368 1.190163 178
+ 3.217887 1.452259 160
+ 3.295467 1.343084 142
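+
+For orientation, R\ :sub:`split` compares two half-datasets, each merged from
+half of the patterns. The Python sketch below uses the commonly quoted
+definition, the sum of \|I1 - I2\| divided by half the sum of (I1 + I2), all
+divided by the square root of 2. It is only an illustration on made-up
+intensities: details such as which reflections CrystFEL includes are not
+reproduced, and the function name is hypothetical::
+
+  import math
+
+  def rsplit(half1, half2):
+      """Rsplit over the reflections present in both half-datasets."""
+      common = set(half1) & set(half2)
+      num = sum(abs(half1[k] - half2[k]) for k in common)
+      den = 0.5 * sum(half1[k] + half2[k] for k in common)
+      return num / (math.sqrt(2.0) * den)
+
+  # Hypothetical merged intensities, keyed by Miller indices
+  half1 = {(1, 0, 0): 100.0, (0, 1, 0): 50.0}
+  half2 = {(1, 0, 0): 110.0, (0, 1, 0): 40.0, (0, 0, 1): 5.0}
+  print("Rsplit = %.4f" % rsplit(half1, half2))
+
+Lower values mean better agreement between the two halves, which is why the
+values in the table rise towards high resolution where the signal is weak.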
+
+In the interests of finishing in a reasonable time, this dataset is rather
+small - only a few hundred frames. Therefore, we don't expect very high
+quality in the final dataset. A few thousand frames would be much better -
+about 10,000 is usually a good number, but it depends on the crystal symmetry
+as well as how strongly the crystals diffract.
+
+
+12. Export merged data
+======================
+
+The final step is to export the data. If you take a peek inside the job
+subdirectory (see the *advanced* section of step 6, and apply it to your
+merging job), you will find a file called ``crystfel.hkl``. This file
+contains the final merged intensities in a text-based CrystFEL-specific format.
+In this step, we will export the same data to MTZ format, which is more
+familiar to other software.
+
+Click **Export data** to open a familiar save dialogue box. At the bottom of
+the window, you will need to select which merging result to export, as well as
+selecting the format and unit cell file to use. You have to give the unit cell
+file because MTZ files include information about the unit cell. You can
+choose between two types of MTZ export. One of them (*Bijvoet pairs together*)
+should be used when you're doing anomalous phasing. For most other
+applications, the plain *MTZ* option is easier.
+
+Give a name for the file (e.g. ``export.mtz``) and press Save. The file should
+be directly usable by other tools within, for example, CCP4 and Phenix.
+
+
+
+13. Conclusion
+==============
+
+Congratulations! You've successfully processed a data set using CrystFEL. Now
+it's time to process your own data. Good luck!
+
+Do you have feedback on this tutorial, or on CrystFEL in general? CrystFEL is
+an open-source project, which means your contributions are invited. Please
+take a look at the `contributing guide <../../CONTRIBUTING.md>`_ to see how to
+get started, for example by reporting a problem, adding a new feature or
+improving the documentation.
diff --git a/doc/examples/jf-swissfel-16M.geom b/doc/examples/jf-swissfel-16M.geom
new file mode 100644
index 00000000..150198b1
--- /dev/null
+++ b/doc/examples/jf-swissfel-16M.geom
@@ -0,0 +1,359 @@
+adu_per_photon = 1
+res = 13333.3
+clen = 95.3 mm
+photon_energy = 4570 eV
+
+dim0 = %
+dim1 = ss
+dim2 = fs
+data = /data/data
+
+mask_file = run47.JF06T32V01.mask.h5
+mask = /mask
+mask_good = 1
+mask_bad = 0
+
+bad_bs_1/min_fs = 890
+bad_bs_1/max_fs = 1029
+bad_bs_1/min_ss = 7710
+bad_bs_1/max_ss = 7770
+bad_bs_1/panel = m15
+
+bad_bs_2/min_fs = 957
+bad_bs_2/max_fs = 1029
+bad_bs_2/min_ss = 6050
+bad_bs_2/max_ss = 6167
+bad_bs_2/panel = m11
+
+bad_bs_3/min_fs = 0
+bad_bs_3/max_fs = 135
+bad_bs_3/min_ss = 6631
+bad_bs_3/max_ss = 6681
+bad_bs_3/panel = m12
+
+bad_bs_4/min_fs = 0
+bad_bs_4/max_fs = 60
+bad_bs_4/min_ss = 8224
+bad_bs_4/max_ss = 8351
+bad_bs_4/panel = m16
+
+m0/min_fs = 0.0
+m0/min_ss = 0.0
+m0/max_fs = 1029.0
+m0/max_ss = 513.0
+m0/fs = 0.0003420x -0.9999310y
+m0/ss = -0.9989760x -0.0000500y
+m0/corner_x = 2147.07110898
+m0/corner_y = 1131.30172146
+
+m1/min_fs = 0.0
+m1/min_ss = 514.0
+m1/max_fs = 1029.0
+m1/max_ss = 1027.0
+m1/fs = 0.0002220x -0.9997400y
+m1/ss = -0.9992160x -0.0002170y
+m1/corner_x = 2214.25957894
+m1/corner_y = 92.0139285294
+
+m2/min_fs = 0.0
+m2/min_ss = 1028.0
+m2/max_fs = 1029.0
+m2/max_ss = 1541.0
+m2/fs = -0.0020481x -1.0005348y
+m2/ss = -0.9983412x +0.0010576y
+m2/corner_x = 1597.4005022
+m2/corner_y = 2101.70211537
+
+m3/min_fs = 0.0
+m3/min_ss = 1542.0
+m3/max_fs = 1029.0
+m3/max_ss = 2055.0
+m3/fs = -0.0005887x -0.9997813y
+m3/ss = -0.9982958x +0.0003974y
+m3/corner_x = 1595.17294252
+m3/corner_y = 1063.05183348
+
+m4/min_fs = 0.0
+m4/min_ss = 2056.0
+m4/max_fs = 1029.0
+m4/max_ss = 2569.0
+m4/fs = -0.0001283x -0.9994185y
+m4/ss = -0.9998541x +0.0006393y
+m4/corner_x = 1664.06460835
+m4/corner_y = 25.061321815
+
+m5/min_fs = 0.0
+m5/min_ss = 2570.0
+m5/max_fs = 1029.0
+m5/max_ss = 3083.0
+m5/fs = 0.0002421x -1.0004238y
+m5/ss = -1.0008347x +0.0011567y
+m5/corner_x = 1663.73373709
+m5/corner_y = -1013.98246581
+
+m6/min_fs = 0.0
+m6/min_ss = 3084.0
+m6/max_fs = 1029.0
+m6/max_ss = 3597.0
+m6/fs = -0.0000468x -0.9997925y
+m6/ss = -0.9987864x -0.0015127y
+m6/corner_x = 1044.60276587
+m6/corner_y = 2103.69726324
+
+m7/min_fs = 0.0
+m7/min_ss = 3598.0
+m7/max_fs = 1029.0
+m7/max_ss = 4111.0
+m7/fs = -0.0002248x -1.0000118y
+m7/ss = -0.9998321x +0.0003420y
+m7/corner_x = 1046.63423691
+m7/corner_y = 1064.65610742
+
+m8/min_fs = 0.0
+m8/min_ss = 4112.0
+m8/max_fs = 1029.0
+m8/max_ss = 4625.0
+m8/fs = -0.0007860x -0.9999388y
+m8/ss = -0.9994561x +0.0006193y
+m8/corner_x = 1114.33039924
+m8/corner_y = 25.1453069861
+
+m9/min_fs = 0.0
+m9/min_ss = 4626.0
+m9/max_fs = 1029.0
+m9/max_ss = 5139.0
+m9/fs = 0.0005826x -0.9999539y
+m9/ss = -0.9999378x +0.0005674y
+m9/corner_x = 1112.40812707
+m9/corner_y = -1014.28507319
+
+m10/min_fs = 0.0
+m10/min_ss = 5140.0
+m10/max_fs = 1029.0
+m10/max_ss = 5653.0
+m10/fs = 0.0011524x -0.9983465y
+m10/ss = -1.0001167x -0.0014642y
+m10/corner_x = 495.296332614
+m10/corner_y = 2102.19953678
+
+m11/min_fs = 0.0
+m11/min_ss = 5654.0
+m11/max_fs = 1029.0
+m11/max_ss = 6167.0
+m11/fs = 0.0000999x -1.0002213y
+m11/ss = -1.0005192x +0.0000633y
+m11/corner_x = 496.656703965
+m11/corner_y = 1065.27724791
+
+m12/min_fs = 0.0
+m12/min_ss = 6168.0
+m12/max_fs = 1029.0
+m12/max_ss = 6681.0
+m12/fs = -0.0000855x -0.9999193y
+m12/ss = -1.0004724x -0.0002647y
+m12/corner_x = 564.171218651
+m12/corner_y = 25.4230052458
+
+m13/min_fs = 0.0
+m13/min_ss = 6682.0
+m13/max_fs = 1029.0
+m13/max_ss = 7195.0
+m13/fs = -0.0006560x -0.9998388y
+m13/ss = -1.0003827x +0.0003877y
+m13/corner_x = 563.286933163
+m13/corner_y = -1013.80073624
+
+m14/min_fs = 0.0
+m14/min_ss = 7196.0
+m14/max_fs = 1029.0
+m14/max_ss = 7709.0
+m14/fs = -0.0000406x -0.9999492y
+m14/ss = -0.9997145x +0.0002315y
+m14/corner_x = -53.7592830185
+m14/corner_y = 2034.98959172
+
+m15/min_fs = 0.0
+m15/min_ss = 7710.0
+m15/max_fs = 1029.0
+m15/max_ss = 8223.0
+m15/fs = 0.0000830x -1.0003003y
+m15/ss = -1.0001746x -0.0002212y
+m15/corner_x = -52.8650708457
+m15/corner_y = 997.730814412
+
+m16/min_fs = 0.0
+m16/min_ss = 8224.0
+m16/max_fs = 1029.0
+m16/max_ss = 8737.0
+m16/fs = 0.0002819x -0.9998868y
+m16/ss = -1.0001755x -0.0003436y
+m16/corner_x = 14.4730424249
+m16/corner_y = -41.4096494962
+
+m17/min_fs = 0.0
+m17/min_ss = 8738.0
+m17/max_fs = 1029.0
+m17/max_ss = 9251.0
+m17/fs = 0.0000710x -0.9999990y
+m17/ss = -0.9999990x -0.0000710y
+m17/corner_x = 14.7182
+m17/corner_y = -1079.99
+
+m18/min_fs = 0.0
+m18/min_ss = 9252.0
+m18/max_fs = 1029.0
+m18/max_ss = 9765.0
+m18/fs = 0.0003985x -0.9994658y
+m18/ss = -0.9993795x +0.0010962y
+m18/corner_x = -603.526474602
+m18/corner_y = 2034.67961646
+
+m19/min_fs = 0.0
+m19/min_ss = 9766.0
+m19/max_fs = 1029.0
+m19/max_ss = 10279.0
+m19/fs = 0.0001215x -1.0000508y
+m19/ss = -0.9986848x -0.0002170y
+m19/corner_x = -603.013050601
+m19/corner_y = 997.527083431
+
+m20/min_fs = 0.0
+m20/min_ss = 10280.0
+m20/max_fs = 1029.0
+m20/max_ss = 10793.0
+m20/fs = -0.0002129x -0.9997114y
+m20/ss = -0.9991926x +0.0001047y
+m20/corner_x = -534.870840943
+m20/corner_y = -41.5520182954
+
+m21/min_fs = 0.0
+m21/min_ss = 10794.0
+m21/max_fs = 1029.0
+m21/max_ss = 11307.0
+m21/fs = -0.0012098x -1.0000400y
+m21/ss = -0.9990117x -0.0004867y
+m21/corner_x = -535.088798024
+m21/corner_y = -1080.39407788
+
+m22/min_fs = 0.0
+m22/min_ss = 11308.0
+m22/max_fs = 1029.0
+m22/max_ss = 11821.0
+m22/fs = -0.0013763x -1.0016851y
+m22/ss = -0.9984016x +0.0015958y
+m22/corner_x = -1152.66813908
+m22/corner_y = 2037.30125205
+
+m23/min_fs = 0.0
+m23/min_ss = 11822.0
+m23/max_fs = 1029.0
+m23/max_ss = 12335.0
+m23/fs = -0.0006443x -0.9996427y
+m23/ss = -0.9992117x +0.0010783y
+m23/corner_x = -1152.1391591
+m23/corner_y = 996.843546324
+
+m24/min_fs = 0.0
+m24/min_ss = 12336.0
+m24/max_fs = 1029.0
+m24/max_ss = 12849.0
+m24/fs = -0.0007565x -0.9992784y
+m24/ss = -0.9994284x +0.0005559y
+m24/corner_x = -1084.97948141
+m24/corner_y = -41.5784301786
+
+m25/min_fs = 0.0
+m25/min_ss = 12850.0
+m25/max_fs = 1029.0
+m25/max_ss = 13363.0
+m25/fs = -0.0016985x -0.9999457y
+m25/ss = -0.9999488x +0.0004879y
+m25/corner_x = -1084.78264327
+m25/corner_y = -1079.89291543
+
+m26/min_fs = 0.0
+m26/min_ss = 13364.0
+m26/max_fs = 1029.0
+m26/max_ss = 13877.0
+m26/fs = 0.0040020x -0.9968330y
+m26/ss = -1.0013620x -0.0004710y
+m26/corner_x = -1707.347444
+m26/corner_y = 2033.21445
+
+m27/min_fs = 0.0
+m27/min_ss = 13878.0
+m27/max_fs = 1029.0
+m27/max_ss = 14391.0
+m27/fs = -0.0000440x -0.9995340y
+m27/ss = -0.9987470x +0.0001860y
+m27/corner_x = -1704.0880367
+m27/corner_y = 997.367420522
+
+m28/min_fs = 0.0
+m28/min_ss = 14392.0
+m28/max_fs = 1029.0
+m28/max_ss = 14905.0
+m28/fs = -0.0008300x -0.9996020y
+m28/ss = -0.9992680x +0.0001480y
+m28/corner_x = -1632.5362714
+m28/corner_y = -40.9815422
+
+m29/min_fs = 0.0
+m29/min_ss = 14906.0
+m29/max_fs = 1029.0
+m29/max_ss = 15419.0
+m29/fs = 0.0009530x -1.0010560y
+m29/ss = -0.9985140x -0.0014000y
+m29/corner_x = -1638.73032075
+m29/corner_y = -1081.459554
+
+m30/min_fs = 0.0
+m30/min_ss = 15420.0
+m30/max_fs = 1029.0
+m30/max_ss = 15933.0
+m30/fs = 0.0000710x -0.9999990y
+m30/ss = -0.9999990x -0.0000710y
+m30/corner_x = -2252.42
+m30/corner_y = 997.518
+
+m31/min_fs = 0.0
+m31/min_ss = 15934.0
+m31/max_fs = 1029.0
+m31/max_ss = 16447.0
+m31/fs = 0.0000710x -0.9999990y
+m31/ss = -0.9999990x -0.0000710y
+m31/corner_x = -2183.510001
+m31/corner_y = -41.160929
+
+m0/coffset = -0.000085
+m1/coffset = -0.000085
+m2/coffset = -0.000085
+m3/coffset = -0.000085
+m4/coffset = -0.000085
+m5/coffset = -0.000085
+m6/coffset = -0.000085
+m7/coffset = -0.000085
+m8/coffset = -0.000085
+m9/coffset = -0.000085
+m10/coffset = -0.000085
+m11/coffset = -0.000085
+m12/coffset = -0.000085
+m13/coffset = -0.000085
+m14/coffset = -0.000085
+m15/coffset = -0.000085
+m16/coffset = -0.000085
+m17/coffset = -0.000085
+m18/coffset = -0.000085
+m19/coffset = -0.000085
+m20/coffset = -0.000085
+m21/coffset = -0.000085
+m22/coffset = -0.000085
+m23/coffset = -0.000085
+m24/coffset = -0.000085
+m25/coffset = -0.000085
+m26/coffset = -0.000085
+m27/coffset = -0.000085
+m28/coffset = -0.000085
+m29/coffset = -0.000085
+m30/coffset = -0.000085
+m31/coffset = -0.000085