From 9838a9ff1da59f968dd88eaac8dd4c9b5db07159 Mon Sep 17 00:00:00 2001 From: Thomas White Date: Thu, 21 Sep 2023 13:35:34 +0200 Subject: Update manual pages --- doc/man/align_detector.1.md | 18 +- doc/man/crystfel_geometry.5.md | 471 +++++++++++++++++++++++++++-------------- 2 files changed, 319 insertions(+), 170 deletions(-) diff --git a/doc/man/align_detector.1.md b/doc/man/align_detector.1.md index c31e8758..aa5f5061 100644 --- a/doc/man/align_detector.1.md +++ b/doc/man/align_detector.1.md @@ -18,7 +18,8 @@ DESCRIPTION **align_detector** refines the detector geometry based on the calibration data written by **indexamajig**. The refinement takes into account all the inter-dependencies between crystal orientations, cell parameters and the panel -positions, but is nevertheless very fast. This is achieved using the +positions, as if a single large minimisation had been performed with all frames +at once. The algorithm is nevertheless very fast - this is achieved using the Millepede-II algorithm. For more information, see https://www.desy.de/~kleinwrt/MP2/doc/html/index.html @@ -29,11 +30,12 @@ section **Detector hierarchy**. Next, run **indexamajig** as usual, but with option **--mille**. This will produce several files named **mille-data-0.bin**, **mille-data-1.bin**, **mille-data-2.bin** and so on - as many files as there were indexamajig -subprocesses (set with **indexamajig -j**). +subprocesses (set with **indexamajig -j**). Use option **--mille-dir** to +put these files in a useful location. -Finally, run **align_detector**, giving it the input geometry file, the "Mille" files, a refinement -level and a filename for the updated geometry file. The input geometry file -must match the file used for the indexamajig run. +Finally, run **align_detector**, giving it the input geometry file, the "Mille" +files, a refinement level and a filename for the updated geometry file. The +input geometry file must match the file used for the indexamajig run. Refinement level **0** allows only the overall detector position to vary. Higher levels allow groups of panels to move according to the hierarchy. For @@ -58,15 +60,15 @@ at https://gitlab.desy.de/claus.kleinwort/millepede-ii OPTIONS ======= -**-g** _input.geom_ +**-g** _input.geom_, **--geometry**=_input.geom_ : Specify the input geometry filename. -**-o** _output.geom_ +**-o** _output.geom_, **--output**=_output.geom_ : Specify the output geometry filename. : Note that the geometry file will be re-written, meaning that any formatting : and comments will be lost. -**-l** _level_ +**-l** _level_, **--level**=_level_ : Specify the refinement level. **-l 0** refines the overall detector position : only. The maximum refinement level is determined by the hierarchy of the : detector. diff --git a/doc/man/crystfel_geometry.5.md b/doc/man/crystfel_geometry.5.md index dffc1d7f..ecaf60bf 100644 --- a/doc/man/crystfel_geometry.5.md +++ b/doc/man/crystfel_geometry.5.md @@ -148,6 +148,15 @@ can only be specified once in the geometry file (not for each panel): : always specify the units. Like when specifying the wavelength (see above), : CrystFEL should do what you expect with multi-frame data files. +Getting the **clen** value from the image headers gives the illusion of +avoiding the need to create a new geometry file every time you change the +detector position. However, in practice this doesn't work very well because +the detector movement direction is usually not exactly parallel to the beam +axis. That means that the beam center position varies with the camera length, +and you would have to prepare a new geometry file for each position anyway. +For the best results, simply make sure that the experimental geometry stays as +static as possible. + ### Per-frame beam center position You can specify an overall detector shift. The most common use of this is for @@ -167,174 +176,325 @@ versions. PANEL DATA LOCATIONS ==================== -data -: The location in the HDF5 file of the data block that contains the panel's data. -: The default value is /data/data. If the HDF5 file contains multiple events, -: and each event is stored in a different data block, the variable part of the -: path can be represented using the % character placeholder. -: -: Example: -: -: data = /data/%/rawdata -: -: The CrystFEL programs will look for the first event at -: /data/event1_name/rawdata, for the second at /data/event2_name/rawdata, etc., -: where event_name and event2_name are simply whatever the program could find in -: the HDF5 file which matched the pattern you gave. - -dimn -: Information about the layout of the data block identified by the 'data' -: property. n is an integer number identifying an axis in a multidimensional HDF5 -: data block. The property value defines the kind of information encoded by the -: axis. Possible values are: -: % - event placeholder,the axis encodes events -: ss - the axis encoding the slow scan index -: fs - the axis encodes the fast scan index -: number - the index in this dimension should be fixed at number. -: -: CrystFEL assumes that the data block defined by the 'data' property has a -: dimensionality corresponding to the axis with the highest value of n defined by -: the 'dim' property. That is, if the geometry file specifies dim0, dim1 and -: dim2, then the data block is expected to be three-dimensional. The size of the -: data block along each of those axes comes from the image metadata (e.g. the -: array sizes in the HDF5 file). -: -: The lowest number of n corresponds to the most slowly-changing array index as -: the data block is traversed. The default values are dim0=ss and dim1=fs. The -: value of n corresponding to fs must not be lower than the value assigned to ss, -: i.e. "fast scan is always fast scan". -: -: Example: -: -: dim0 = % -: dim1 = 4 -: dim2 = ss -: dim3 = fs -: -: The above snippet specifies that the data block is 4-dimensional. The first -: axis represents the event number, the index in the second axis is always 4, and -: the remaining two axes are the image coordinates. - -min_fs, min_ss, max_fs, max_ss -: The range of pixels in the data block specified by the 'data' property that -: corresponds to the panel, in fast scan/slow scan coordinates, specified -: inclusively. +For each panel, you have to specify where to find the data in the file. Don't +forget that many of these parameters will be the same for each panel, so you +can set them once at the top (see the section **Panels and geometry file +syntax**). + +**data** = _/location/of/data_ +: The location in the data file of the array that contains the panel's data. +: The interpretation of the value depends on the file type. If you're using +: HDF5 files, it will be a path such as `/data/run_1/imagedata`. The default +: value is `/data/data`. + +**min_fs**, **max_fs**, **min_ss**, **max_ss** = _nnn_ +: The range of pixels in the data block that correspond to this panel. Often, +: multiple panels are grouped together into one "slab". The pixel ranges are +: in the *data coordinate system*, and are specified *inclusively*. + +### Multiple frames per file + +The best performance is achieved when each file on disk contains a large number +of images, rather than just one. In this case, you have to additionally +specify the data layout to CrystFEL. + +Consider a file format where each frame has its own data array under a separate +name, for example an HDF5 file with the following layout: + + /data/run_1/image_1/data + /data/run_1/image_2/data + /data/run_1/image_3/data + /data/run_1/image_4/data + /data/run_1/image_5/data + ... + +In this case, you can use **%** as a placeholder in the data location. +For example: + + data = /data/run_1/%/data + +For HDF5 files, the **%** must be a whole name at a certain hierarchy level, +i.e. the following is not allowed: + + data = /data/run_1/image_%/data ;; This won't work + +Next, consider a file format where the frames and detector panels are grouped +into a four-dimensional array. The first two dimensions are the image data +axes, the third dimension is the panel number, and the fourth dimension is the +frame number. In this case, you can specify the data location as follows: + + data = /data/run_1/image4Darray + dim0 = % + dim2 = ss + dim3 = fs + min_fs = 0 + max_fs = 255 + min_ss = 0 + max_ss = 255 + + panel1/dim1 = 0 + panel1/corner_x = .... + ... + + panel2/dim1 = 1 + panel2/corner_x = .... + ... + + panel3/dim1 = 2 + panel3/corner_x = .... + ... + +**dim** values can be a literal number, a placeholder (**%**), or **fs** or +**ss**. Note that, in this example, the common parameter values have been +placed at the top, avoiding some repetition. + +CrystFEL assumes that the data block defined by the 'data' property has a +dimensionality corresponding to the axis with the highest value of n defined by +the 'dim' property. That is, if the geometry file specifies dim0, dim1 and +dim2, then the data block is expected to be three-dimensional. The size of the +data block along each of those axes comes from the image metadata (e.g. the +array sizes in the HDF5 file). + +The lowest number of n corresponds to the most slowly-changing array index as +the data block is traversed. The default values are dim0=ss and dim1=fs. The +value of n corresponding to fs must not be lower than the value assigned to ss, +i.e. "fast scan is always fast scan". PEAK LISTS ========== -peak_list = loc -: This gives the location of the peak list in the data files, for peak detection -: methods hdf5 and cxi (see man indexamajig). - -peak_list_type = layout -: Specify the layout of the peak list. Allowed values are cxi, list3 and auto. -: -: list3 expects the peak list to be a two dimensional array whose size in the -: first dimension equals the number of peaks and whose size in the second -: dimension is exactly three. The first two columns contain the fast scan and -: slow scan coordinates, the third contains the intensities. This is the correct -: option for "single-frame" HDF5 files as written by older versions of Cheetah. -: -: cxi expects the peak list to be a group containing four separate HDF5 datasets: -: nPeaks, peakXPosRaw, peakYPosRaw and peakTotalIntensity. See the specification -: for the CXI file format at http://www.cxidb.org/ for more details. This is the -: correct option for "multi-event" HDF5 files as output by recent versions of -: Cheetah. -: -: auto tells CrystFEL to decide between the above options based on the file extension. -: -: Note that CrystFEL considers all peak locations to be distances from the corner -: of the detector panel, in pixel units, consistent with its description of -: detector geometry (see 'man crystfel_geometry'). The software which generates -: the HDF5 or CXI files, including Cheetah, may instead consider the peak -: locations to be pixel indices in the data array. To compensate for this -: discrepancy, CrystFEL will, by default, add 0.5 to all peak coordinates. Use -: --no-half-pixel-shift if this isn't what you want. - - -DETECTOR RESPONSE PROPERTIES -============================ - -adu_per_eV, adu_per_photon -: The number of detector intensity units (ADU) which will arise from either one -: electron-Volt of photon energy, or one photon. This is used to estimate -: Poisson errors. Note that setting different values for this parameter for -: different panels does not result in the intensities being scaled accordingly -: when integrating data. You should only specify one out of adu_per_eV and -: adu_per_photon. - -res -: The resolution (in pixels per metre) for this panel. This is one over the -: pixel size in metres. - -max_adu -: The saturation value for the panel. You can use this to exclude saturated -: peaks from the peak search or to avoid integrating saturated reflections. -: However, usually it's best to include saturated peaks, and exclude saturated -: reflections with the --max-adu option of process_hkl and partialator. -: Therefore you should avoid setting this parameter - a warning will be displayed -: if you do. - -saturation_map -: This specifies the location of the per-pixel saturation map in the HDF5 file. -: This works just like mask in that it can come from the current file or a -: separate one (see saturation_map_file). Reflections will be rejected if they -: contain any pixel above the per-pixel values, in addition to the other checks -: (see max_adu). - -saturation_map_file -: Specifies that the saturation map should come from the HDF5 file named here, -: instead of the HDF5 file being processed. It can be an absolute filename or +It's possible to include lists of peak positions into the data file. If the +data is pre-processed using a "hit finding" procedure, usually a peak search +will already have been performed. It makes sense to re-use these peak search +results, instead of performing a new peak search inside CrystFEL. + +In this case, you need to specify the location of the peak list in the data +file, and the format of the peak list. + +**peak_list** = _loc_ +: Peak list location in the data files. + +**peak_list_type** = _type_ +: Specify the layout of the peak list. Allowed values are **cxi**, **list3** +: and **auto**. + +The possible list types are: + +**list3** +: The peak list is a two dimensional array whose size in the first dimension +: equals the number of peaks and whose size in the second dimension is exactly +: three. The first two columns contain the fast scan and slow scan coordinates, +: the third contains the intensities. This is the correct option for +: "single-frame" HDF5 files as written by older versions of Cheetah. + +**cxi** +: The peak list is an HDF5 group containing four separate HDF5 datasets: nPeaks, +: peakXPosRaw, peakYPosRaw and peakTotalIntensity. See the specification for the +: CXI file format at http://www.cxidb.org/ for more details. This is the correct +: option for "multi-event" HDF5 files as output by recent versions of Cheetah. + +**auto** +: CrystFEL will decide between the above options based on the file extension. + +### Important note about coordinate conventions + +Note that CrystFEL considers all peak locations to be distances from the corner +of the detector panel, in pixel units, consistent with its description of +detector geometry (see the section about **corner_x** above). The software +which generates the HDF5 or CXI files, including Cheetah, may instead consider +the peak locations to be pixel indices in the data array. In the former case, +a peak position (0,0) corresponds to the very corner of the detector panel. In +the latter case, position (0,0) corresponds to the center of the first pixel, +and the very corner would be (-0.5,-0.5). + +To compensate for this discrepancy, CrystFEL will, by default, add 0.5 to all +peak coordinates. See the **indexamajig** option **--no-half-pixel-shift** if +this isn't what you want. + + +PIXEL SIZE +========== + +You will need to specify the size of the pixels, of course. Use one of the +following: + +**pixel_pitch** = _pixelSize_ +: The width of the pixels, in meters. + +**res** = _pixelsPerMeter_ +: The resolution, in pixels per metre, i.e. one divided by the pixel size in +: metres. + +These values effectively give the scale factor between the length of the +**fs,ss** vectors and physical space. If the **fs** and **ss** vectors have +different magnitudes, the pixels will not be square. This is allowed, but +comes with a possibility of strange problems, because many algorithms assume +square pixels. + + +DETECTOR GAIN +============= + +CrystFEL needs to know the gain of the detector, in order to determine how +many photons correspond to a particular signal level and hence calculate error +estimates on the intensity values. These gain values are **not** used to +correct the pixel values for different gains among the panels. + +Use one of the following: + +**adu_per_photon** +: The number of detector intensity units which will arise from one quantum of +: intensity (one X-ray photon, or one electron in an electron microscope). + +**adu_per_eV** +: The number of detector intensity units which will arise from a 1 eV photon +: of electromagnetic radiation. This will be scaled by the photon energy +: (see **photon_energy**) to calculate the intensity per photon at the +: wavelength used by the experiment. This option should only be used for +: electromagnetic radiation. + + +DETECTOR SATURATION +=================== + +You can specify the saturation value in the geometry file, which will allow +**indexamajig** to avoid integrating saturated reflections. However, usually +it's best to include all reflections at this stage, and exclude the saturated +reflections at the merging stage (see **process_hkl** and **partialator** +options **--max-adu**). + +**max_adu** +: The saturation value for the panel. A warning will be displayed if you use +: this option, because it's better to exclude saturated reflections at the +: merging stage. + +Some combinations of detectors and processing methods result in the saturation +level varying pixel-to-pixel. For this case, you can provide a per-pixel map +of saturation values. Note that **both** the map values and the **max_adu** +values will both be honoured. + +**saturation_map** +: This specifies the location of the per-pixel saturation map in the data file. + +**saturation_map_file** +: Specifies that the saturation map should come from the file named here, +: instead of the file being processed. This can be an absolute filename or : relative to the working directory. BAD REGIONS =========== -Bad regions will be completely ignored by CrystFEL. You can specify the pixels -to exclude in pixel units, either in the lab coordinate system (see above) or -in fast scan/slow scan coordinates (mixtures are not allowed). In the latter -case, the range of pixels is specified inclusively. Bad regions are -distinguished from normal panels by the fact that they begin with the three -letters "bad". +"Bad region" refers to any set of pixels that should be completely ignored by +CrystFEL. There are multiple ways to mark pixels as bad. + +### Marking a whole panel + +To flag all pixels in one panel as bad, simply set the **no_index** parameter: + +**no_index** +: If set to **true** or any numerical value other than 0, indicates that the +: panel should be ignored. The slightly misleading name is for historical +: reasons. + +### Marking pixels at the panel edges + +With many detectors, the pixels at the edge of the detector panels behave +differently and should be masked out. + +**mask_edge_pixels** = _n_ +: Mark a border of _n_ pixels around the edge of the panel as bad. + +### Marking pixels according to value + +Many data files contain information about bad pixels encoded in the pixel +values, for example a value of 65535 often indicates a bad pixel. + +**flag_lessthan** = _n_ +: Mark pixels as bad if their value is less than _n_. + +**flag_morethan** = _n_ +: Mark pixels as bad if their value is more than _n_. + +**flag_equal** = _n_ +: Mark pixels as bad if their value exactly _n_. -If you specify a bad region in fs/ss (image data) coordinates, you must also -specify which panel name you are referring to. +Note carefully that the inequalities are strict, not inclusive: "less than", +not "less than or equal to". -Note that bad regions specified in x/y (lab frame) coordinates take longer to -process (when loading images) than regions specified in fs/ss (image data) -coordinates. You should use fs/ss coordinates unless the convenience of x/y -coordinates outweighs the speed reduction. +Note also that **flag_equal** will be difficult to use for data in +floating-point format. With floating-point data, you should use +**flag_lessthan** and **flag_morethan**. -no_index -: Set this to 1 or "true" to ignore this panel completely. +### Marking pixels in rectangles -flag_lessthan, flag_morethan, flag_equal -: Mark pixels as "bad" if their values are respectively less than, more than or -: equal to the given value. Note carefully that the inequalities are strict, not -: inclusive: "less than", not "less than or equal to". +You can specify a range of pixels to ignore in the *data coordinate system* or +the *laboratory coordinate system*. -mask_edge_pixels -: Mark the specified number of pixels, at the edge of the panel, as "bad". +To mask pixels in the *data coordinate system*, use the following syntax: -maskN_data, maskN_file, maskN_goodbits, maskN_badbits -: These specify the parameters for bad pixel mask number N. You can have up to 8 -: bad pixel masks, numbered from 0 to 7 inclusive. Placeholders ('%') in the -: location (maskN_data) will be substituted with the same values as used for the + badregionB/min_fs = 128 + badregionB/max_fs = 160 + badregionB/min_ss = 256 + badregionB/max_ss = 512 + badregionB/panel = q0a1 + +A bad region is distinguished from a panel because it starts with **bad**. +Apart from that, the region can use any name of your choice. + +The pixel ranges are specified *inclusively*. The *panel* name has to be +specified, because the pixel range alone might not be unique (see section +**Multiple frames per file**). Bad regions specified in this way therefore +cannot stretch across multiple panels. + +To mask pixels in the *laboratory coordinate system*, use the following syntax: + + badregionA/min_x = -20.0 + badregionA/max_x = +20.0 + badregionA/min_y = -100.0 + badregionA/max_y = +100.0 + +In this case, the panel name is not required, and the bad region can span +multiple panels. However, bad regions specified in laboratory coordinates take +longer to process (when loading images) than regions specified in fs/ss (image +data) coordinates. You should therefore use fs/ss coordinates unless the +convenience of x/y coordinates outweighs the speed reduction. + +### Providing a separate bad pixel mask + +You can provide an array, separate to the image data array, containing +information about the bad pixels. Up to 8 such masks can be provided for each +detector panel. Specify the mask location using the following directives, +where you should substitute **N** for a number between 0 and 7 inclusive: + +**maskN_data** = _location_ +: The location (inside the image data file) of the mask array. Placeholders +: ('%') in the location will be substituted with the same values as used for the : placeholders in the image data, although there may be fewer of them for the : masks than for the image data. -: -: You can optionally give a filename for each mask with maskN_file. The filename -: may be specified as an absolute filename, or relative to the working directory. -: If you don't specify a filename, the mask will be read from the same file as -: the image data. -: -: A pixel will be considered bad unless all of the bits which are set in goodbits -: are set. A pixel will also be considered bad if any of the bits which are set -: in badbits are set. Note that pixels can additionally be marked as bad via -: other mechanisms as well (e.g. no_index or bad). + +**maskN_file** = _filename_ +: Filename to use for the mask data, if not the same as the image data. +: The filename : may be specified as an absolute filename, or relative to the +: working directory. + +**maskN_goodbits** = _bitmask_ +: Bit mask for good pixels (see below). + +**maskN_badbits** = _bitmask_ +: Bit mask for bad pixels (see below). + +A pixel will be considered *bad* unless all of the bits which are set in +**goodbits** are set. A pixel will *also* be considered bad if *any* of the +bits which are set in **badbits** are set. In pseudocode, where **&** is a +bitwise "and", the algorithm is: + + if (mask_value & mask_goodbits) != mask_goodbits: + mark_pixel_as_bad + + if (mask_value & mask_badbits) != 0: + mark_pixel_as_bad Example: @@ -351,19 +511,6 @@ of CrystFEL. They are synonyms of the new directives as follows: mask_good -----> mask0_goodbits mask_bad -----> mask0_badbits -Examples: - - badregionA/min_x = -20.0 - badregionA/max_x = +20.0 - badregionA/min_y = -100.0 - badregionA/max_y = +100.0 - - badregionB/min_fs = 128 - badregionB/max_fs = 160 - badregionB/min_ss = 256 - badregionB/max_ss = 512 - badregionB/panel = q0a1 - DETECTOR HIERARCHY ================== -- cgit v1.2.3