% crystfel_geometry(5)

INTRODUCTION
============

A CrystFEL "geometry file", usually named _something.geom_, is used by programs
in the CrystFEL suite to get information about:

* The physical position of the detector, including all sub-detectors if
  applicable.

* The layout of data in the file, especially if a "container format" such as
  HDF5 is used.

* The values of parameters such as the incident radiation wavelength, or a
  description of where to get these values in the data files.


COORDINATE SYSTEM
=================

It's important to distinguish between the **data coordinate system** and the
**laboratory coordinate system**.

The **data coordinate system** consists of the **fast scan** and **slow scan**
directions.  **Fast scan** refers to the direction whose coordinate changes
most quickly as the bytes in the data file are moved through.  **Slow scan**
refers to the other dimension of the 2D data array.  Arrays in the data file
can have more than two dimensions - see the section **Multiple frames per file**
below.

CrystFEL's **laboratory coordinate system** is defined as follows:

* +z is the beam direction, and points along the beam (i.e. away from the source).

* +y points towards the zenith (ceiling).

* +x completes the right-handed coordinate system.

The CrystFEL GUI shows +x horizontally (left to right), and +y vertically
(bottom to top).  This means that the GUI shows images from the "into the beam"
perspective.


PANELS AND GEOMETRY FILE SYNTAX
===============================

CrystFEL's representation of a detector is broken down into one or more
**panels**, each of which has its own position, size and various other
parameters.

Lines in a CrystFEL geometry file have the following general form:

    parameter = value

Many parameters, however, are specified for a specific panel, as follows:

    panel_name/parameter = value

Most parameters can theoretically be specified on a per-panel level, but in
practice are the same for all panels.  For example, the pixel size is usually
the same for all panels.  In this case, specify the parameter once, without
a panel name, at the top of the geometry file.  The value will then be used
for all panels which are *first mentioned* later in the file.

The panel names can be anything of your choosing, except that the names must
not start with **bad**, **group** or **rigid_group**.  These are reserved for
other purposes (see below).

You can also add comments, for example:

    ; This is a comment
    mypanel/min_fs = 34    ; This is also a comment
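
The syntax rules above can be illustrated with a minimal Python sketch.  This is
not CrystFEL's own parser: the function name `parse_geometry` is hypothetical,
values are kept as plain strings, and every top-level parameter is treated as a
per-panel default, which is a simplification.  It only shows how comments are
stripped and how a top-level value applies to panels that are first mentioned
later in the file.

```python
def parse_geometry(text):
    """Toy parser: returns {panel_name: {parameter: value_string}}."""
    defaults = {}   # top-level parameters seen so far
    panels = {}     # panel name -> parameter dictionary
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()      # drop comments
        if not line or "=" not in line:
            continue
        key, value = (s.strip() for s in line.split("=", 1))
        if "/" in key:
            name, param = key.split("/", 1)
            if name.startswith("bad"):
                continue                          # bad regions are not panels
            if name not in panels:
                # A panel picks up the defaults in force when first mentioned
                panels[name] = dict(defaults)
            panels[name][param] = value
        elif key.startswith(("group", "rigid_group")):
            continue                              # groups ignored in this sketch
        else:
            defaults[key] = value
    return panels


example = """
res = 5000          ; used by panels first mentioned below
a/min_fs = 0
res = 10000         ; only affects panels first mentioned after this line
b/min_fs = 0
a/max_fs = 99
"""
panels = parse_geometry(example)
print(panels["a"]["res"], panels["b"]["res"])
```

In this sketch, panel `a` keeps the first `res` value even though `a/max_fs`
appears after the second `res` line, because `a` was first mentioned earlier.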


SPECIFYING THE WAVELENGTH AND BEAM PARAMETERS
=============================================

To specify the incident beam wavelength, you need to use one of the following
forms (only one):

**wavelength** = _nnn_ [**m**|**A**]
: Specifies a wavelength directly, in meters or Angstroms according to the
: suffix.

**photon_energy** = _nnn_ [**eV**|**keV**]
: Specifies the energy per photon of electromagnetic radiation (e.g. X-rays).
: If no units suffix is given, **eV** (electron volts) will be assumed.

**electron_voltage** = _nnn_ [**V**|**kV**]
: Specifies the accelerating voltage in an electron microscope.  This should be
: the accelerating voltage, not the relativistically-corrected energy of the
: accelerated electrons (the difference is small).

For all of these, a data location in the input files can be given instead of
a literal number.  For example, with data in HDF5 format:

    wavelength = /data/LCLS/wavelength m

If there are multiple frames per input file, the program will do what you
expect depending on the type of `/data/LCLS/wavelength`.  For example, a scalar
value in the metadata will be applied to all frames, or an array of values can
be used to provide a separate wavelength for each frame.

You can also specify the radiation bandwidth, as follows:

**bandwidth** = _bw_
: The bandwidth of the radiation, expressed as a fraction of the wavelength.
: The bandwidth will be interpreted as the standard deviation of a Gaussian
: spectrum, and used for calculating reflection positions.
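
The three ways of specifying the wavelength are related by standard physics.
The following Python sketch shows the conversions, using the usual relations
(wavelength = hc/E for photons, and the relativistic de Broglie wavelength for
electrons).  The function names are hypothetical, and the sketch makes no claim
about how CrystFEL performs these conversions internally.

```python
import math

H = 6.62607015e-34          # Planck constant, J s
C = 299792458.0             # speed of light, m/s
E_CHARGE = 1.602176634e-19  # elementary charge, C
M_E = 9.1093837015e-31      # electron rest mass, kg


def wavelength_from_photon_energy(energy_eV):
    """Photon wavelength in meters from the photon energy in eV."""
    return H * C / (energy_eV * E_CHARGE)


def wavelength_from_electron_voltage(volts):
    """Electron wavelength in meters from the accelerating voltage in V."""
    kinetic = E_CHARGE * volts   # kinetic energy in joules
    # Relativistic momentum: p^2 = 2 m E (1 + E / (2 m c^2))
    p = math.sqrt(2.0 * M_E * kinetic * (1.0 + kinetic / (2.0 * M_E * C**2)))
    return H / p


print(wavelength_from_photon_energy(9000.0) * 1e10, "A")
print(wavelength_from_electron_voltage(300e3) * 1e10, "A")
```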


PHYSICAL PANEL LOCATIONS
========================

For each panel, the physical location is controlled by the **fs**, **ss**,
**corner_x**, **corner_y** and **coffset** parameters, for example:

    q3a15/fs = -0.0010058292x +0.9999995232y
    q3a15/ss = -0.9999995232x -0.0010058292y
    q3a15/corner_x = 575.475
    q3a15/corner_y = -221.866
    q3a15/coffset = 0.01


**fs**, **ss**
: The vectors in the lab coordinate system of the fast and slow scan directions
: of the panel data, measured in pixels.  Inclusion of a component in the z
: direction means that the panel is not perpendicular to the X-ray beam.

**corner_x**, **corner_y**
: The position in x and y (lab coordinate system) of the corner of this panel.
: The corner is defined as the first point in the panel to appear in the image
: data. The units are pixel widths of the current panel.  This should be the
: location of the very corner of the panel (not the center of the first pixel).

**coffset**
: The offset of the panel along the z-direction from the position given by
: **clen**.

The overall detector position in the z-direction is given by **clen**, which
can only be specified once in the geometry file (not for each panel):

**clen** = _nnn_ [mm|m]
: The overall z-position ("camera length") for the detector, or a data file
: location, and units (m or mm).  If no units are given, **m** will be assumed
: if the value is given as a literal number, or **mm** if it's a data file
: header location.  This discrepancy is for historical reasons, and you should
: always specify the units.  As when specifying the wavelength (see above),
: CrystFEL should do what you expect with multi-frame data files.

Getting the **clen** value from the image headers gives the illusion of
avoiding the need to create a new geometry file every time you change the
detector position.  However, in practice this doesn't work very well because
the detector movement direction is usually not exactly parallel to the beam
axis.  That means that the beam center position varies with the camera length,
and you would have to prepare a new geometry file for each position anyway.
For the best results, simply make sure that the experimental geometry stays as
static as possible.
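
The way these parameters combine can be sketched in Python as follows.  This is
an illustration under stated assumptions, not CrystFEL's implementation: the
function `pixel_to_lab` and the dictionary layout are hypothetical, **coffset**
is assumed to be in meters, and the pixel size is taken from **res** (see the
section **Pixel size**).  The position (fs, ss) is measured in pixels from the
panel corner, so (0, 0) is the very corner of the panel.

```python
def pixel_to_lab(fs, ss, panel, clen_m):
    """Return (x, y, z) in meters for a point (fs, ss) on the panel."""
    fsx, fsy, fsz = panel["fs"]      # fast scan direction, in pixels
    ssx, ssy, ssz = panel["ss"]      # slow scan direction, in pixels
    pitch = 1.0 / panel["res"]       # pixel size in meters
    x = (panel["corner_x"] + fs * fsx + ss * ssx) * pitch
    y = (panel["corner_y"] + fs * fsy + ss * ssy) * pitch
    # Assumption: coffset is an offset in meters added to clen
    z = clen_m + panel["coffset"] + (fs * fsz + ss * ssz) * pitch
    return x, y, z


# Values from the q3a15 example; "res" and clen are made-up illustrative numbers
q3a15 = {
    "fs": (-0.0010058292, 0.9999995232, 0.0),
    "ss": (-0.9999995232, -0.0010058292, 0.0),
    "corner_x": 575.475,
    "corner_y": -221.866,
    "coffset": 0.01,
    "res": 10000.0,
}
print(pixel_to_lab(0.0, 0.0, q3a15, clen_m=0.1))
```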

### Per-frame beam center position

You can specify an overall detector shift.  The most common use of this is for
serial scanning electron diffraction experiments, where the beam center moves
from frame to frame.  You should avoid using this feature, especially with a
metadata location, because it limits CrystFEL's ability to pre-calculate
certain data structures.  This feature is likely to be removed in future
versions.

**detector_shift_**[x,y] = _nnn_ [m|mm]
: These specify that the entire detector should be shifted by this amount in the
: x and y directions.  The units should be specified as m or mm.  If units are
: not specified, the value will be taken as metres.  _nnn_ can be a file
: metadata location (e.g. an HDF5 path).


PANEL DATA LOCATIONS
====================

For each panel, you have to specify where to find the data in the file.  Don't
forget that many of these parameters will be the same for each panel, so you
can set them once at the top (see the section **Panels and geometry file
syntax**).

**data** = _/location/of/data_
: The location in the data file of the array that contains the panel's data.
: The interpretation of the value depends on the file type.  If you're using
: HDF5 files, it will be a path such as `/data/run_1/imagedata`.  The default
: value is `/data/data`.

**min_fs**, **max_fs**, **min_ss**, **max_ss** = _nnn_
: The range of pixels in the data block that correspond to this panel.  Often,
: multiple panels are grouped together into one "slab".  The pixel ranges are
: in the *data coordinate system*, and are specified *inclusively*.

### Multiple frames per file

The best performance is achieved when each file on disk contains a large number
of images, rather than just one.  In this case, you have to additionally
specify the data layout to CrystFEL.

Consider a file format where each frame has its own data array under a separate
name, for example an HDF5 file with the following layout:

    /data/run_1/image_1/data
    /data/run_1/image_2/data
    /data/run_1/image_3/data
    /data/run_1/image_4/data
    /data/run_1/image_5/data
    ...

In this case, you can use **%** as a placeholder in the data location.
For example:

    data = /data/run_1/%/data

For HDF5 files, the **%** must be a whole name at a certain hierarchy level,
i.e. the following is not allowed:

    data = /data/run_1/image_%/data  ;; This won't work
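
The placeholder rule can be sketched in Python as follows.  The function
`expand_placeholders` is hypothetical and only illustrates the substitution
described here, including the requirement that **%** is a whole name at its
hierarchy level.

```python
def expand_placeholders(location, names):
    """Replace each '%' path component with the next entry of `names`."""
    remaining = list(names)
    out = []
    for part in location.split("/"):
        if part == "%":
            if not remaining:
                raise ValueError("not enough names for the placeholders")
            out.append(remaining.pop(0))
        elif "%" in part:
            # e.g. "image_%" is rejected: '%' must stand alone
            raise ValueError(f"'%' must be a whole path component: {part!r}")
        else:
            out.append(part)
    return "/".join(out)


print(expand_placeholders("/data/run_1/%/data", ["image_3"]))
```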

Next, consider a file format where the frames and detector panels are grouped
into a four-dimensional array.  Counting from the fastest-changing array index,
the first two dimensions are the image data axes, the third dimension is the
panel number, and the fourth dimension is the frame number.  In this case, you
can specify the data location as follows:

    data = /data/run_1/image4Darray
    dim0 = %
    dim2 = ss
    dim3 = fs
    min_fs = 0
    max_fs = 255
    min_ss = 0
    max_ss = 255

    panel1/dim1 = 0
    panel1/corner_x = ....
    ...

    panel2/dim1 = 1
    panel2/corner_x = ....
    ...

    panel3/dim1 = 2
    panel3/corner_x = ....
    ...

**dim** values can be a literal number, a placeholder (**%**), or **fs** or
**ss**.  Note that, in this example, the common parameter values have been
placed at the top, avoiding some repetition.

CrystFEL assumes that the data block defined by the 'data' property has a
dimensionality corresponding to the axis with the highest value of n defined by
the 'dim' property.  That is, if the geometry file specifies dim0, dim1 and
dim2, then the data block is expected to be three-dimensional.  The size of the
data block along each of those axes comes from the image metadata (e.g. the
array sizes in the HDF5 file).

The lowest number of n corresponds to the most slowly-changing array index as
the data block is traversed.  The default values are dim0=ss and dim1=fs.  The
value of n corresponding to fs must not be lower than the value assigned to ss,
i.e. "fast scan is always fast scan".
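
As a sketch of how the **dim** values select one panel of one frame, the
following Python function builds an index tuple from them.  The function
`panel_selection` is hypothetical, and it assumes a single placeholder that
stands for the frame number.  The resulting tuple has the form accepted by
NumPy-style array indexing.

```python
def panel_selection(dims, frame, min_fs, max_fs, min_ss, max_ss):
    """dims[n] is the value of dimN: an integer, '%', 'ss' or 'fs'."""
    if dims.index("fs") < dims.index("ss"):
        raise ValueError("fs must not have a lower dimension number than ss")
    selection = []
    for d in dims:
        if d == "%":
            selection.append(frame)                        # frame number
        elif d == "ss":
            selection.append(slice(min_ss, max_ss + 1))    # inclusive range
        elif d == "fs":
            selection.append(slice(min_fs, max_fs + 1))    # inclusive range
        else:
            selection.append(int(d))                       # literal index
    return tuple(selection)


# panel2 from the example above (dim1 = 1), frame number 7
print(panel_selection(["%", 1, "ss", "fs"], 7, 0, 255, 0, 255))
```

Because the pixel ranges are inclusive, a range of 0 to 255 selects 256 pixels,
hence the `+ 1` when converting to Python slices.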


PEAK LISTS
==========

It's possible to include lists of peak positions in the data file.  If the
data is pre-processed using a "hit finding" procedure, usually a peak search
will already have been performed.  It makes sense to re-use these peak search
results, instead of performing a new peak search inside CrystFEL.

In this case, you need to specify the location of the peak list in the data
file, and the format of the peak list.

**peak_list** = _loc_
: Peak list location in the data files.

**peak_list_type** = _type_
: Specify the layout of the peak list.  Allowed values are **cxi**, **list3**
: and **auto**.

The possible list types are:

**list3**
: The peak list is a two-dimensional array whose size in the first dimension
: equals the number of peaks and whose size in the second dimension is exactly
: three.  The first two columns contain the fast scan and slow scan coordinates,
: the third contains the intensities.  This is the correct option for
: "single-frame" HDF5 files as written by older versions of Cheetah.

**cxi**
: The peak list is an HDF5 group containing four separate HDF5 datasets: nPeaks,
: peakXPosRaw, peakYPosRaw and peakTotalIntensity.  See the specification for the
: CXI file format at http://www.cxidb.org/ for more details.  This is the correct
: option for "multi-event" HDF5 files as output by recent versions of Cheetah.

**auto**
: CrystFEL will decide between the above options based on the file extension.

### Important note about coordinate conventions

Note that CrystFEL considers all peak locations to be distances from the corner
of the detector panel, in pixel units, consistent with its description of
detector geometry (see the section about **corner_x** above).  The software
which generates the HDF5 or CXI files, including Cheetah, may instead consider
the peak locations to be pixel indices in the data array.  In the former case,
a peak position (0,0) corresponds to the very corner of the detector panel.  In
the latter case, position (0,0) corresponds to the center of the first pixel,
and the very corner would be (-0.5,-0.5).

To compensate for this discrepancy, CrystFEL will, by default, add 0.5 to all
peak coordinates. See the **indexamajig** option **--no-half-pixel-shift** if
this isn't what you want.
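
The half-pixel compensation amounts to the following, shown here as a Python
sketch.  The function `peaks_to_corner_convention` is hypothetical, and the
peaks are assumed to be rows of (fast scan, slow scan, intensity), as in the
**list3** layout.

```python
def peaks_to_corner_convention(peaks, half_pixel_shift=True):
    """Convert pixel-index peak positions to distances from the panel corner."""
    shift = 0.5 if half_pixel_shift else 0.0
    return [(fs + shift, ss + shift, intensity)
            for fs, ss, intensity in peaks]


# A peak at the center of the first pixel, in the pixel-index convention
print(peaks_to_corner_convention([(0.0, 0.0, 1500.0)]))
```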


PIXEL SIZE
==========

You will need to specify the size of the pixels, of course.  Use one of the
following:

**pixel_pitch** = _pixelSize_
: The width of the pixels, in meters.

**res** = _pixelsPerMeter_
: The resolution, in pixels per metre, i.e. one divided by the pixel size in
: metres.

These values effectively give the scale factor between the length of the
**fs,ss** vectors and physical space.  If the **fs** and **ss** vectors have
different magnitudes, the pixels will not be square.  This is allowed, but
comes with a possibility of strange problems, because many algorithms assume
square pixels.
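
As a small worked example, the following Python sketch computes the physical
pixel dimensions from **res** and the lengths of the **fs** and **ss** vectors.
The function name `pixel_dimensions` is hypothetical.

```python
import math


def pixel_dimensions(fs_vec, ss_vec, res):
    """Return the pixel size in meters along fs and along ss."""
    pitch = 1.0 / res                                  # same as pixel_pitch
    fs_len = math.sqrt(sum(c * c for c in fs_vec))     # |fs| in pixels
    ss_len = math.sqrt(sum(c * c for c in ss_vec))     # |ss| in pixels
    return fs_len * pitch, ss_len * pitch


# Unit-length vectors give square pixels of size 1/res
print(pixel_dimensions((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), res=10000.0))
```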


DETECTOR GAIN
=============

CrystFEL needs to know the gain of the detector, in order to determine how
many photons correspond to a particular signal level and hence calculate error
estimates on the intensity values.  These gain values are **not** used to
correct the pixel values for different gains among the panels.

Use one of the following:

**adu_per_photon**
: The number of detector intensity units which will arise from one quantum of
: intensity (one X-ray photon, or one electron in an electron microscope).

**adu_per_eV**
: The number of detector intensity units which will arise from a 1 eV photon
: of electromagnetic radiation.  This will be scaled by the photon energy
: (see **photon_energy**) to calculate the intensity per photon at the
: wavelength used by the experiment.  This option should only be used for
: electromagnetic radiation.
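
The relationship between the two gain parameters can be sketched in Python as
follows.  The function names are hypothetical, and the numbers in the example
are made up for illustration.

```python
def adu_per_photon_from_adu_per_eV(adu_per_eV, photon_energy_eV):
    """Scale the per-eV gain by the photon energy to get the per-photon gain."""
    return adu_per_eV * photon_energy_eV


def photons_from_signal(signal_adu, adu_per_photon):
    """Number of photons corresponding to a signal level in detector units."""
    return signal_adu / adu_per_photon


gain = adu_per_photon_from_adu_per_eV(0.001, 9000.0)
print(gain, photons_from_signal(450.0, gain))
```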


DETECTOR SATURATION
===================

You can specify the saturation value in the geometry file, which will allow
**indexamajig** to avoid integrating saturated reflections.  However, usually
it's best to include all reflections at this stage, and exclude the saturated
reflections at the merging stage (see **process_hkl** and **partialator**
options **--max-adu**).

**max_adu**
: The saturation value for the panel.  A warning will be displayed if you use
: this option, because it's better to exclude saturated reflections at the
: merging stage.

Some combinations of detectors and processing methods result in the saturation
level varying pixel-to-pixel.  For this case, you can provide a per-pixel map
of saturation values.  Note that **both** the map values and the **max_adu**
values will be honoured.

**saturation_map**
: This specifies the location of the per-pixel saturation map in the data file.

**saturation_map_file**
: Specifies that the saturation map should come from the file named here,
: instead of the file being processed.  This can be an absolute filename or
: relative to the working directory.


BAD REGIONS
===========

"Bad region" refers to any set of pixels that should be completely ignored by
CrystFEL.  There are multiple ways to mark pixels as bad.

### Marking a whole panel

To flag all pixels in one panel as bad, simply set the **no_index** parameter:

**no_index**
: If set to **true** or any numerical value other than 0, indicates that the
: panel should be ignored.  The slightly misleading name is for historical
: reasons.

### Marking pixels at the panel edges

With many detectors, the pixels at the edge of the detector panels behave
differently and should be masked out.

**mask_edge_pixels** = _n_
: Mark a border of _n_ pixels around the edge of the panel as bad.

### Marking pixels according to value

Many data files contain information about bad pixels encoded in the pixel
values, for example a value of 65535 often indicates a bad pixel.

**flag_lessthan** = _n_
: Mark pixels as bad if their value is less than _n_.

**flag_morethan** = _n_
: Mark pixels as bad if their value is more than _n_.

**flag_equal** = _n_
: Mark pixels as bad if their value is exactly _n_.

Note carefully that the inequalities are strict, not inclusive: "less than",
not "less than or equal to".

Note also that **flag_equal** will be difficult to use for data in
floating-point format.  With floating-point data, you should use
**flag_lessthan** and **flag_morethan**.
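
The following Python sketch illustrates the strict comparisons and the
floating-point pitfall.  The function `is_flagged` is hypothetical and only
mirrors the rules stated above.

```python
def is_flagged(value, lessthan=None, morethan=None, equal=None):
    """Return True if the pixel value should be marked as bad."""
    if lessthan is not None and value < lessthan:    # strictly less than
        return True
    if morethan is not None and value > morethan:    # strictly more than
        return True
    if equal is not None and value == equal:         # exact match only
        return True
    return False


print(is_flagged(0, lessthan=0))            # the boundary value is not flagged
print(is_flagged(65535, equal=65535))       # exact integer match is flagged
print(is_flagged(0.1 + 0.2, equal=0.3))     # rounding defeats exact equality
```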

### Marking pixels in rectangles

You can specify a range of pixels to ignore in the *data coordinate system* or
the *laboratory coordinate system*.

To mask pixels in the *data coordinate system*, use the following syntax:

    badregionB/min_fs = 128
    badregionB/max_fs = 160
    badregionB/min_ss = 256
    badregionB/max_ss = 512
    badregionB/panel = q0a1

A bad region is distinguished from a panel because its name starts with **bad**.
Apart from that, the region can use any name of your choice.

The pixel ranges are specified *inclusively*.  The *panel* name has to be
specified, because the pixel range alone might not be unique (see section
**Multiple frames per file**).  Bad regions specified in this way therefore
cannot stretch across multiple panels.

To mask pixels in the *laboratory coordinate system*, use the following syntax:

    badregionA/min_x = -20.0
    badregionA/max_x = +20.0
    badregionA/min_y = -100.0
    badregionA/max_y = +100.0

In this case, the panel name is not required, and the bad region can span
multiple panels.  However, bad regions specified in laboratory coordinates take
longer to process (when loading images) than regions specified in fs/ss (image
data) coordinates.  You should therefore use fs/ss coordinates unless the
convenience of x/y coordinates outweighs the speed reduction.

### Providing a separate bad pixel mask

You can provide an array, separate to the image data array, containing
information about the bad pixels.  Up to 8 such masks can be provided for each
detector panel.  Specify the mask location using the following directives,
where you should substitute **N** for a number between 0 and 7 inclusive:

**maskN_data** = _location_
: The location (inside the image data file) of the mask array.  Placeholders
: ('%') in the location will be substituted with the same values as used for the
: placeholders in the image data, although there may be fewer of them for the
: masks than for the image data.

**maskN_file** = _filename_
: Filename to use for the mask data, if not the same as the image data.
: The filename may be specified as an absolute filename, or relative to the
: working directory.

**maskN_goodbits** = _bitmask_
: Bit mask for good pixels (see below).

**maskN_badbits** = _bitmask_
: Bit mask for bad pixels (see below).

A pixel will be considered *bad* unless all of the bits which are set in
**goodbits** are set.  A pixel will *also* be considered bad if *any* of the
bits which are set in **badbits** are set.  In pseudocode, where **&** is a
bitwise "and", the algorithm is:

    if (mask_value & mask_goodbits) != mask_goodbits:
        mark_pixel_as_bad

    if (mask_value & mask_badbits) != 0:
        mark_pixel_as_bad

Example:

    mask2_data = /data/bad_pixel_map
    mask2_file = /home/myself/mybadpixels.h5
    mask2_goodbits = 0x00
    mask2_badbits = 0xff
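
The pseudocode above translates directly into runnable Python.  The function
name `pixel_is_bad` is hypothetical.  With the values from the example
(**goodbits** 0x00, **badbits** 0xff), a pixel is bad whenever any of the
lowest eight bits of its mask value is set.

```python
def pixel_is_bad(mask_value, goodbits, badbits):
    """Apply the goodbits/badbits test to one mask value."""
    if (mask_value & goodbits) != goodbits:   # a required good bit is missing
        return True
    if (mask_value & badbits) != 0:           # at least one bad bit is set
        return True
    return False


for value in (0x00, 0x04, 0x100):
    print(hex(value), pixel_is_bad(value, goodbits=0x00, badbits=0xff))
```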

There are some older mask directives which are still understood by this version
of CrystFEL.  They are synonyms of the new directives as follows:

    mask       ----->   mask0_data
    mask_file  ----->   mask0_file
    mask_good  ----->   mask0_goodbits
    mask_bad   ----->   mask0_badbits


DETECTOR HIERARCHY
==================

Detector panels can be combined into **groups**.  Certain operations,
especially detector geometry refinement, use these groups to conveniently move
panels.  Groups are specified as follows:

    group_abc = panel1,panel2
    group_def = panel3,panel4

This creates a group called **abc**, containing panels **panel1** and
**panel2**, and a group **def** containing **panel3** and **panel4**.

Groups can themselves be combined into higher-level groups, for example:

    group_all = abc,def

This defines a group called **all** which contains both of the groups created
above.

The highest-level group should always be called **all**.

All members of a group need to be defined before defining the group.  This
means that the group definitions must come **after** the panel definitions, and
the groups should be defined from the bottom to top level of the hierarchy -
the **all** group coming last.

If you do not define any groups, CrystFEL will automatically create the **all**
group for you, containing all panels in a flat hierarchy.  This allows basic
geometry refinement (level zero, see **align_detector**) to work without any
extra configuration.

The **group** system replaces the **rigid_group** system used in older versions
of CrystFEL.  If the geometry file contains any **rigid_group** lines, they
will be ignored in this version.
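
The ordering rule for group definitions can be sketched in Python as follows.
The function `resolve_groups` is hypothetical.  For simplicity it flattens each
group to the list of panels it contains, rather than keeping the full
hierarchy, and it does not reflect how CrystFEL stores groups internally.

```python
def resolve_groups(group_lines, panel_names):
    """Return {group_name: [panel names]}, requiring members to be defined first."""
    known = {name: [name] for name in panel_names}   # panels are already defined
    groups = {}
    for line in group_lines:
        key, members = (s.strip() for s in line.split("=", 1))
        if not key.startswith("group_"):
            raise ValueError(f"not a group definition: {line!r}")
        name = key[len("group_"):]
        flat = []
        for member in (m.strip() for m in members.split(",")):
            if member not in known:
                raise ValueError(f"{member!r} used in {name!r} before being defined")
            flat.extend(known[member])
        known[name] = flat
        groups[name] = flat
    if not groups:
        # No groups given: a flat "all" group containing every panel
        groups["all"] = list(panel_names)
    return groups


lines = ["group_abc = panel1,panel2",
         "group_def = panel3,panel4",
         "group_all = abc,def"]
print(resolve_groups(lines, ["panel1", "panel2", "panel3", "panel4"])["all"])
```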


EXAMPLES
========

For examples, look in the **examples** folder, which can be found online at
https://gitlab.desy.de/thomas.white/crystfel/-/tree/master/doc/examples


AUTHOR
======

This page was written by Thomas White and Valerio Mariani.


REPORTING BUGS
==============

Report bugs to <taw@physics.org>, or visit <http://www.desy.de/~twhite/crystfel>.


COPYRIGHT AND DISCLAIMER
========================

Copyright © 2023 Deutsches Elektronen-Synchrotron DESY, a research centre of
the Helmholtz Association.

CrystFEL is free software: you can redistribute it and/or modify it under the
terms of the GNU General Public License as published by the Free Software
Foundation, either version 3 of the License, or (at your option) any later
version.

CrystFEL is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.  See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
CrystFEL.  If not, see <http://www.gnu.org/licenses/>.


SEE ALSO
========

**crystfel**(7), **indexamajig**(1), **adjust_detector**(1),
**align_detector**(1)