Diffstat (limited to 'doc/indexamajig.txt')
-rw-r--r-- doc/indexamajig.txt 70
1 files changed, 51 insertions, 19 deletions
diff --git a/doc/indexamajig.txt b/doc/indexamajig.txt
index a7cac95f..c7097851 100644
--- a/doc/indexamajig.txt
+++ b/doc/indexamajig.txt
@@ -59,7 +59,11 @@ You can control what information is included in the output stream using
So, if you just want the integrated intensities of indexed peaks, use
"--record=integrated". If you just want to check that the peak detection is
-working, used "--record=peaks".
+working, use "--record=peaks". If you want the integrated peaks for the
+indexable patterns, but also want to check the peak detection for the patterns
+which could not be indexed, you might use
+"--record=integrated,peaksifnotindexed" and then use "check-peak-detection" from
+the "scripts" folder to visualise the results of the peak detection.
Peak Detection
@@ -154,28 +158,57 @@ F), you should be careful when using "compare" for the cell reduction, since
be converted to the non-primitive conventional cell from the PDB.
-A Note about Unit Cell Settings
--------------------------------
+Tuning CPU affinities for NUMA hardware
+---------------------------------------
-CrystFEL's core symmetry module only knows about one setting for each unit cell.
-You must use the same setting. That means that the unique axis (for cells which
-have one) must be "c".
+If you are running indexamajig on a NUMA (non-uniform memory access)
+machine, a performance gain can sometimes be made by preventing the kernel from
+allowing a process or thread to run on a CPU which is distant from the one on
+which it started. Distance, in this context, might mean that the CPU is able to
+access all the memory visible to the original CPU, but perhaps only relatively
+slowly via a cable link. In many cases a group of CPUs will have direct access
+to a certain region of memory, and so the process may be scheduled on any CPU in
+that group without any penalty. However, scheduling the process to any CPU
+outside the group may be slow. When running under Linux, indexamajig is able to
+avoid such sub-optimal process scheduling by setting CPU affinities for its
+threads. The CPU affinities are also inherited by subprocesses (e.g. MOSFLM or
+DirAx).
+
+To do this usefully, you need to give indexamajig some information about your
+hardware's architecture. Specify the size of the CPU groups using
+"--cpugroup=<n>" and the overall number of CPUs using "--cpus=<n>", so that the
+program knows when to 'wrap around'. Using "--cpuoffset=<n>", where "n" is a
+group number (not a CPU number), allows you to manually skip a few groups, which
+may be useful if you do not want to use all the available CPUs but want to avoid
+running all your jobs on the same ones.
+
+Note that specifying the above options is NOT the same thing as giving the
+number of analyses to run in parallel (the 'number of threads'), which is done
+with "-j <n>". The CPU tuning options provide information to indexamajig about
+how to set the CPU affinities for its threads, but they do not specify how many
+threads to use.
+Example: 72-core Altix UV 100 machine at the author's institution
-Unconventional Use
-------------------
+This machine consists of six blades, each containing two 6-core CPUs and some
+local memory. Any CPU on any blade can access the memory on any other blade,
+but the access will be slow compared to accessing memory on the same blade.
+When running two instances of indexamajig, a sensible choice of parameters might
+be:
-There are some less often used options, for example "--dump-peaks" to dump the
-peak locations found by the peak search (in turn presented to the indexer).
-This might be useful if you want to check the performance of the peak finder.
-If you run a large dataset with bot --dump-peaks and --near-bragg enabled,
-you'll generate a large amount of data. To separate the peaks from the
-indexed peaks, use scripts/stream-split as follows:
+1: --cpus=72 --cpugroup=12 --cpuoffset=0 -j 36
+2: --cpus=72 --cpugroup=12 --cpuoffset=3 -j 36
-scripts/stream-split myoutputfile.txt indexed.txt peaks.txt
+This would dedicate half of the CPUs to one instance, and the other half to the
+other.
-.. to generate both indexed.txt and peaks.txt. One of the last two arguments
-can be "/dev/null" if you're only interested in the other.
+
+A Note about Unit Cell Settings
+-------------------------------
+
+CrystFEL's core symmetry module only knows about one setting for each unit cell.
+You must use the same setting. That means that the unique axis (for cells which
+have one) must be "c".
"Gotchas"
@@ -183,5 +216,4 @@ can be "/dev/null" if you're only interested in the other.
Don't run more than one indexamajig jobs simultaneously in the same working
directory - they'll overwrite each other's DirAx or MOSFLM files, causing subtle
-problems
-which can't easily be detected.
+problems which can't easily be detected.
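
Note on the NUMA section added in this change: the group/offset/wrap-around scheme it describes can be illustrated with a short sketch. This is not indexamajig's actual implementation. The function name `cpus_for_thread` is hypothetical, and the sketch assumes that consecutive worker threads fill one CPU group before moving to the next, that the offset is counted in groups (as the text states), and that group numbers wrap around at `cpus // cpugroup`.

```python
def cpus_for_thread(thread_id, cpus, cpugroup, cpuoffset=0):
    """Return the CPU numbers a worker thread would be allowed to run on.

    Hypothetical helper: the parameters mirror --cpus, --cpugroup and
    --cpuoffset, but the assignment rule itself is an assumption.
    """
    if cpugroup <= 0 or cpus < cpugroup:
        raise ValueError("need 0 < cpugroup <= cpus")
    n_groups = cpus // cpugroup
    # Assumption: threads 0..cpugroup-1 share the first group, and so on.
    # The offset is a number of groups, and group numbers wrap around.
    group = (thread_id // cpugroup + cpuoffset) % n_groups
    first = group * cpugroup
    return list(range(first, first + cpugroup))


# Two instances with 36 threads each on a 72-CPU machine with groups of 12.
# An offset of 3 groups corresponds to skipping the first 36 CPUs.
inst1 = {c for t in range(36) for c in cpus_for_thread(t, 72, 12, 0)}
inst2 = {c for t in range(36) for c in cpus_for_thread(t, 72, 12, 3)}
```

Under these assumptions the first instance is confined to CPUs 0 to 35 and the second to CPUs 36 to 71, so the two never share a CPU group. On Linux, a set like this could be applied from Python with `os.sched_setaffinity`.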