aboutsummaryrefslogtreecommitdiff
path: root/doc/indexamajig.txt
diff options
context:
space:
mode:
authorThomas White <taw@physics.org>2011-06-16 17:53:28 +0200
committerThomas White <taw@physics.org>2012-02-22 15:27:28 +0100
commit34b21127ea75e6a714a6c04a09f226180b2eb541 (patch)
treedb5d75b4365cbbda4728f0d512d24abcdd3ece88 /doc/indexamajig.txt
parentaa4d05d94275baa8c87acc6343a23d16f1877b24 (diff)
Move documentation to manpages
Diffstat (limited to 'doc/indexamajig.txt')
-rw-r--r--doc/indexamajig.txt219
1 files changed, 0 insertions, 219 deletions
diff --git a/doc/indexamajig.txt b/doc/indexamajig.txt
deleted file mode 100644
index c7097851..00000000
--- a/doc/indexamajig.txt
+++ /dev/null
@@ -1,219 +0,0 @@
-indexamajig - bulk indexing and data reduction program
-------------------------------------------------------
-
-The "indexamajig" program takes as input a list of diffraction image files,
-currently in HDF5 format. For each image, it attempts to find peaks and then
-index the pattern. If successful, it will measure the intensities of the peaks
-at Bragg locations and produce a list in the form "h k l I", with some extra
-information about the locations of the peaks.
-
-For minimal basic use, you need to provide the list of diffraction patterns,
-the method which will be used to index, a file describing the geometry of the
-detector, a PDB file which contains the unit cell which will be used for the
-indexing, and that you'd like the program to output a list of intensities for
-each successfully indexed pattern. Here is what the minimal use might look like
-on the command line:
-
-indexamajig -i mypatterns.lst -j 10 \
- -g mygeometry.geom \
- --indexing=mosflm,dirax --peaks=hdf5 \
- --cell-reduction=reduce \
- -b myxfel..beam \
- -o test.stream -p mycell.pdb \
- --record=integrated
-
-More typical use includes all the above, but might also include a noise or
-common mode filter (--filter-noise or --filter-cm respectively) if detector
-noise causes problems for the peak detection. The HDF5 files might be in some
-folder a long way from the current directory, so you might want to specify a
-full pathname to be added in front of each filename. You'll probably want to
-run more than one indexing job at a time (-j <n>).
-
-You can include a table of saturation values for in the HDF5 file, if you have
-a method for estimating the intensities of saturated peaks. It goes in
-/processing/hitfinder/peakinfo_saturated, and should be an n*3 two dimensional
-array, where the first two columns contain fast scan and slow scan coordinates
-(in that order) and the third contains the value which should belong in a peak
-at the given location. The value will be divided by 5 and spread in a small
-cross centred on that location.
-
-See doc/geometry for information about how to create a geometry description
-file.
-
-You can control what information is included in the output stream using
-' --record=<flags>'. Possible flags are:
-
- pixels Include a list of sums of pixel values within the
- integration domain, correcting for individual pixel
- solid angles.
-
- integrated Include a list of reflection intensities, produced by
- integrating around predicted peak locations.
-
- peaks Include peak locations and intensities from the peak
- search.
-
- peaksifindexed As 'peaks', but only if the pattern could be indexed.
-
- peaksifnotindexed As 'peaks', but only if the pattern could NOT be indexed.
-
-So, if you just want the integrated intensities of indexed peaks, use
-"--record=integrated". If you just want to check that the peak detection is
-working, used "--record=peaks". If you want the integrated peaks for the
-indexable patterns, but also want to check the peak detection for the patterns
-which could not be indexed, you might use
-"--record=integrated,peaksifnotindexed" and then use "check-peak-detection" from
-the "scripts" folder to visualise the results of the peak detection.
-
-
-Peak Detection
---------------
-
-You can control the peak detection on the command line. Firstly, you can choose
-the peak detection method using "--peaks=<method>". Currently, two possible
-values for "method" are available. "hdf5" will take the peak locations from the
-HDF5 file. It expects a two dimensional array at /processing/hitfinder/peakinfo
-where size in the first dimension is the number of peaks and the size in the
-second dimension is three. The first two columns contain the x and y
-coordinate (see the "Note about data orientation" in geometry.txt for details),
-the third contains the intensity. However, the intensity will be ignored since
-the pattern will always be re-integrated using the unit cell provided by the
-indexer on the basis of the peaks.
-
-The "zaef" method uses a simple gradient search after Zaefferer (2000). You can
-control the overall threshold and minimum gradient for finding a peak using the
-"--threshold" and "--min-gradient" options. Both of these have units of "ADU"
-(i.e. units of intensity according to the contents of the HDF5 file).
-
-A minimum peak separation can also be provided in the geometry description file
-(see geometry.txt for details). This number serves two purposes. Firstly,
-it is the maximum distance allowed between the peak summit and the foot point
-(where the gradient exceeds the minimum gradient). Secondly, it is the minimum
-distance allowed between one peak and another, before the later peak will be
-rejected "by proximity".
-
-You can suppress peak detection altogether for a panel in the geometry file by
-specifying the "no_index" value for the panel as non-zero.
-
-
-Indexing Methods
-----------------
-
-You can choose between a variety of indexing methods. You can choose more than
-one method, in which case each method will be tried in turn until the later cell
-reduction step says that the cell is a "hit". Choose from:
-
- dirax : invoke DirAx
- mosflm : invoke MOSFLM (DPS)
-
-Depending on what you have installed. For "dirax" and "mosflm", you need to
-have the dirax or ipmosflm binaries in your PATH.
-
-Example: --indexing=dirax,mosflm
-
-
-Cell Reduction
---------------
-
-You can choose from various options for cell reduction with the
-"--cell-reduction=" option. The choices are "none", "reduce" and "compare".
-This choice is important because all autoindexing methods produce an "ab
-initio" estimate of the unit cell (nine parameters), rather than just finding
-the orientation of the target cell (three parameters). It's clear that this is
-not optimal, and will hopefully be fixed in future versions.
-
-With "none", the raw cell from the autoindexer will be used. The cell probably
-won't match the target cell, but it'll still get used. Use this option to test
-whether the patterns are basically "indexable" or not, or if you don't know the
-cell parameters. In the latter case, you'll need to plot some kind of histogram
-of the resulting parameters from the output stream to see which are the most
-popular. If you're lucky, this will reveal the true unit cell.
-
-With "reduce", linear combinations of the raw cell will be checked against the
-target cell. If at least one candidate is found for each axis of the target
-cell, the angles will be checked to correspondence. If a match is found, this
-cell will be used for further processing. This option should generate the most
-matches, but might produce spurious results in many cases. The predicted peaks
-are always checked to verify that at least 10% of the predicted peaks are close
-to peaks located by the peak search. If not, the next candidate unit cell is
-tried until there are no more options.
-
-The "compare" method is like "reduce", but linear combinations are not taken.
-That means that the cell must either match or match after a simple permutation
-of the axes. This is useful when the target cell is subject to reticular
-twinning, such as if one cell axis length is close to twice another. With
-"reduce", there is a possibility that the axes might be confused in this
-situation. This happens for lysozyme (1VDS), so watch out.
-
-The tolerance for matching with "reduce" and "compare" is hardcoded as 5% in
-the reciprocal axis lengths and 1.5 degrees in the (reciprocal) angles. Cells
-from these reduction routines are further constrained to be right-handed. The
-unmatched raw cell might be left-handed: CrystFEL doesn't check this for you.
-Always using a right-handed cell means that the Bijvoet pairs can be told
-apart.
-
-If the unit cell is centered (i.e. if the space group begins with I, R, C, A or
-F), you should be careful when using "compare" for the cell reduction, since
-(for example) DirAx will always find a primitive unit cell, and this cell must
-be converted to the non-primitive conventional cell from the PDB.
-
-
-Tuning CPU affinities for NUMA hardware
----------------------------------------
-
-If you are running indexamajig on a NUMA (non-uniform memory architecture)
-machine, a performance gain can sometimes be made by preventing the kernel from
-allowing a process or thread to run on a CPU which is distant from the one on
-which it started. Distance, in this context, might mean that the CPU is able to
-access all the memory visible to the original CPU, but perhaps only relatively
-slowly via a cable link. In many cases a group of CPUs will have direct access
-to a certain region of memory, and so the process may be scheduled on any CPU in
-that group without any penalty. However, scheduling the process to any CPU
-outside the group may be slow. When running under Linux, indexamajig is able to
-avoid such sub-optimal process scheduling by setting CPU affinities for its
-threads. The CPU affinities are also inherited by subprocesses (e.g. MOSFLM or
-DirAx).
-
-To do this usefully, you need to give indexamajig some information about your
-hardware's architecture. Specify the size of the CPU groups using
-"--cpugroup=<n>". You also need to specify the overall number of CPUs, so that
-the program knows when to 'wrap around'. Using "--cpuoffset=<n>", where "n" is
-a group number (not a CPU number), allows you to manually skip a few CPUs, which
-may be useful if you do not want to use all the available CPUs but want to avoid
-running all your jobs on the same ones.
-
-Note that specifying the above options is NOT the same thing as giving the
-number of analyses to run in parallel (the 'number of threads'), which is done
-with "-j <n>". The CPU tuning options provide information to indexamajig about
-how to set the CPU affinities for its threads, but it does not specify how many
-threads to use.
-
-Example: 72-core Altix UV 100 machine at the author's institution
-
-This machine consists of six blades, each containing two 6-core CPUs and some
-local memory. Any CPU on any blade can access the memory on any other blade,
-but the access will be slow compared to accessing memory on the same blade.
-When running two instances of indexamajig, a sensible choice of parameters might
-be:
-
-1: --cpus=72 --cpugroup=12 --cpuoffset=0 -j 36
-2: --cpus=72 --cpugroup=12 --cpuoffset=36 -j 36
-
-This would dedicate half of the CPUs to one instance, and the other half to the
-other.
-
-
-A Note about Unit Cell Settings
--------------------------------
-
-CrystFEL's core symmetry module only knows about one setting for each unit cell.
-You must use the same setting. That means that the unique axis (for cells which
-have one) must be "c".
-
-
-"Gotchas"
----------
-
-Don't run more than one indexamajig jobs simultaneously in the same working
-directory - they'll overwrite each other's DirAx or MOSFLM files, causing subtle
-problems which can't easily be detected.