=====================================
How to increase data processing speed
=====================================

You want ``indexamajig`` to run faster?  You're probably already using ``-j``,
causing it to divide its work between parallel processes.  Maybe you're even
already using a compute cluster with a batch system, via the GUI or the
``turbo-index-slurm`` and ``turbo-index-lsf`` scripts.  But you want even more
speed?  Here are some tips for getting things to run as fast as possible:


Compile CrystFEL and dependencies with optimisations
====================================================

Note that CMake's default is to compile *without* optimisations.  You need to
add the option ``-DCMAKE_BUILD_TYPE=Release`` (or ``RelWithDebInfo``) to your
CMake invocation to tell it to enable optimisations.  In CrystFEL, it's
particularly important to do this for the HDF5 compression plugins (this makes
a factor of 3 difference in decompression speed!), XGandalf and PinkIndexer.


Tune or avoid compression
=========================

Data compression trades speed for disk space.  For the highest speed, disable
it altogether, provided that you have enough disk space for the uncompressed
data.

When compressing data in HDF5, pay careful attention to the
`chunk size <https://support.hdfgroup.org/HDF5/doc/Advanced/Chunking/>`_.
A badly selected chunk size can cause a very large slowdown.
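
As a rough illustration of why the chunk shape matters, the sketch below
estimates how many pixels have to be decompressed to read a single frame from a
3D dataset.  It relies on the fact that HDF5 must read and decompress a whole
chunk even if only part of it is needed.  The frame size and chunk shapes are
hypothetical, and the estimate ignores the HDF5 chunk cache.

.. code-block:: python

   import math

   def pixels_decompressed_per_frame(frame_shape, chunk_shape):
       """Estimate the pixels decompressed when reading one frame.

       frame_shape is (ss, fs); chunk_shape is (frames, ss, fs).
       Every chunk overlapping the frame is decompressed in full.
       """
       ss, fs = frame_shape
       c_frames, c_ss, c_fs = chunk_shape
       chunks_touched = math.ceil(ss / c_ss) * math.ceil(fs / c_fs)
       return chunks_touched * c_frames * c_ss * c_fs

   frame = (2167, 2070)  # hypothetical 4M detector frame

   # One chunk per frame: only the requested frame is decompressed
   per_frame_chunks = pixels_decompressed_per_frame(frame, (1, 2167, 2070))

   # Small tiles spanning 100 frames: each tile carries 99 unwanted frames
   tiled_chunks = pixels_decompressed_per_frame(frame, (100, 64, 64))

   overhead = tiled_chunks / per_frame_chunks  # roughly 100 for these shapes

With these assumed shapes, the second layout decompresses about a hundred times
more data for each frame than the first.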


Bin the pixel data
==================

If you're using a high-resolution detector such as an Eiger 16M, consider
whether you really need the full resolution or not.  Most experiments don't
need anything close to 16 megapixel resolution.  If not, bin the detector
frames down to 4M or even 1M.  This makes a huge difference because the peak
search algorithm must look at all pixels, so binning your data from 16M to 4M
can make it up to four times faster.  Note that the peak search is one of the
few processing stages which needs to be done on every single frame, hit or
non-hit!
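
A minimal sketch of 2x2 binning with NumPy, assuming that binning means summing
each block of pixels.  The function name and the tiny test frame are
placeholders.  Real data needs more care: bad-pixel flag values must be handled
before summing, and the geometry file has to describe the binned layout.

.. code-block:: python

   import numpy as np

   def bin_frame(frame, factor=2):
       """Sum factor x factor blocks of pixels in a 2D frame."""
       ss, fs = frame.shape
       if ss % factor or fs % factor:
           raise ValueError("frame dimensions must be multiples of the factor")
       # Split each axis into (block index, position within block), then sum
       # over the two within-block axes
       blocks = frame.reshape(ss // factor, factor, fs // factor, factor)
       return blocks.sum(axis=(1, 3))

   frame = np.arange(24).reshape(4, 6)  # placeholder for a detector frame
   binned = bin_frame(frame, 2)         # binned.shape is (2, 3)

Binning by a factor of 2 along each axis reduces the number of pixels, and
hence the work for the peak search, by a factor of 4.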


Avoid x/y bad regions
=====================

For a similar reason, avoid defining bad regions in x/y coordinates.  If you
can, define them in fs/ss coordinates instead, or use in-band bad pixel flags
(i.e. set the bad pixel values to NaN).  If you specify bad regions in x/y
coordinates, CrystFEL has to figure out which detector pixels fall into the
specified area in the lab coordinate system, for which it (currently) uses a
slow brute-force algorithm.
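
One possible way to write in-band flags, sketched with NumPy.  The function
name and the mask are hypothetical, and the mask is assumed to be a boolean
array in fs/ss (array) coordinates.  Note that NaN requires a floating-point
data type, so integer frames have to be converted.

.. code-block:: python

   import numpy as np

   def flag_bad_pixels(frame, bad_mask):
       """Return a float32 copy of the frame with masked pixels set to NaN."""
       flagged = frame.astype(np.float32)  # astype makes a copy
       flagged[bad_mask] = np.nan
       return flagged

   frame = np.full((4, 6), 100, dtype=np.uint16)  # placeholder frame
   bad_mask = np.zeros(frame.shape, dtype=bool)
   bad_mask[0:2, 3:5] = True                      # hypothetical bad region

   flagged = flag_bad_pixels(frame, bad_mask)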


Avoid bad pixel masks
=====================

In many cases, e.g. Pilatus and Eiger detectors, the bad pixel information is
included in the image data itself, so there's no need to put the information in
a separate file.  Bad pixels have a special flag value, usually 65535.  With
recent versions, you can tell CrystFEL to take note of these values using
``flag_morethan = 65535`` in the geometry file.


Skip non-hits
=============

Use ``--min-peaks``, so that only plausible hits get processed.  At the same
time, add ``--no-non-hits-in-stream`` so that time isn't wasted recording
information about non-hits.


Choose the fastest peak search algorithms
=========================================

If the background is low and/or smooth, you can use the faster ``zaef`` peak
search algorithm instead of ``peakfinder8`` without compromising on the
results.

The speed of ``peakfinder8`` can be improved with the ``indexamajig`` option
``--peakfinder8-fast``, which tells CrystFEL to pre-calculate some values.
This is only possible with a static detector geometry (see below).


Choose the fastest indexing algorithms
======================================

In our tests, ``asdf`` gives the best compromise between speed and success
rate, so it's the best choice if you need fast processing.  The ``indexamajig``
option ``--asdf-fast`` makes it about three times faster with only a small
reduction in success rate.

DirAx, TakeTwo, Mosflm and XGandalf are also good choices (roughly in that
order).  Don't use PinkIndexer unless you really need it (wide bandwidth or
electron diffraction data).  PinkIndexer is a very general and accurate
indexing algorithm, but these advantages must be "paid for" in speed.


Try less hard to index each frame
=================================

The default behaviour is to try very hard to index each frame: all indexing
methods will be tried up to six times, deleting the weakest peaks after each
unsuccessful attempt, and trying again with the leftover peaks after a
successful attempt.  If you enable a large number of indexing methods, this can
add up to over 30 attempts to index each frame!  The options ``--no-retry``
and ``--no-multi`` will disable this behaviour.  In addition, you should
reduce the number of indexing methods in operation: ``xgandalf`` alone is a
good choice.

Of course, doing the above will probably decrease the fraction of indexed
frames somewhat, but the trade-off might be positive for your data.


Integrate to lower resolution
=============================

Restrict the resolution of data for integration by setting
``indexamajig --push-res``.  This affects data quality, so you will need to
try different values to find the best one.  Start with ``--push-res=1.5``,
which will cause spots to be integrated up to 1.5 nm :sup:`-1` higher than the
conservatively-estimated resolution of each diffraction pattern.  This gives a
reasonable balance between integrating weak "invisible" high-resolution data,
and not including too much "junk" data.  If the metrics for the final merged
data suggest that there might be more information at higher resolution, use a
larger value.
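
To get a feeling for what a given value means, the following sketch converts a
``--push-res`` offset into a resolution cutoff in Angstroms.  It assumes that
resolution is expressed as 1/d in nm :sup:`-1`, consistent with the units
above.  The function name and the example numbers are illustrative only.

.. code-block:: python

   def integration_limit_angstrom(estimated_res_angstrom, push_res_per_nm):
       """Resolution cutoff in Angstroms after adding a push-res offset.

       Assumes resolution is measured as 1/d in nm^-1 (1 nm = 10 Angstroms).
       """
       q_estimated = 10.0 / estimated_res_angstrom  # 1/d in nm^-1
       q_cutoff = q_estimated + push_res_per_nm
       return 10.0 / q_cutoff

   # A pattern estimated to diffract to 2.0 Angstroms (5.0 nm^-1), with an
   # offset of 1.5 nm^-1, gives a cutoff of 6.5 nm^-1, i.e. about 1.54 Angstroms
   limit = integration_limit_angstrom(2.0, 1.5)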

Normally, we recommend limiting the resolution only at the merging stage
(``partialator --push-res``), because this gives you the most flexibility: you
can set any ``--push-res`` value without re-integrating the entire dataset.  If
you limit the resolution at the integration stage, the number of reflections to
be integrated will be much smaller, which can lead to a significant speed
improvement.  However, the ``--push-res`` value that you use for merging must
be smaller than the value used for integration.


Don't need integration results?  Don't integrate!
=================================================

If you're using CrystFEL as part of an online monitoring system, you might not
be interested in the integration results at all.  Since spot prediction and
integration can take a significant amount of time, you can save a lot by
disabling them.  Disable integration, but not prediction, with
``indexamajig --integration=none``.  The stream will contain predicted spot
positions, but their intensities will all be zero.  Disable prediction
altogether with ``indexamajig --cell-parameters-only``.

This is particularly important when doing "unrestricted" indexing with no
prior unit cell information.  Occasional spuriously large unit cells can slow
things down a lot by producing a lot of reflections.


Use a static detector geometry
==============================

CrystFEL geometry files allow some aspects of the geometry to come from the
data files, such as the panel z-positions ("clen"/camera length) and overall
detector shifts.  If you can instead give fixed numerical values for
everything, then some parts of CrystFEL can prepare calculations in advance.
In some cases, this can make a significant speed improvement.