From 34b21127ea75e6a714a6c04a09f226180b2eb541 Mon Sep 17 00:00:00 2001
From: Thomas White
Date: Thu, 16 Jun 2011 17:53:28 +0200
Subject: Move documentation to manpages

---
 doc/0-INDEX                 |  22 -----
 doc/geometry.txt            | 112 ----------------------
 doc/indexamajig.txt         | 219 -------------------------------------------
 doc/man/crystfel_geometry.1 | 123 +++++++++++++++++++++++++
 doc/man/indexamajig.1       | 220 ++++++++++++++++++++++++++++++++++++++++++++
 doc/man/pattern_sim.1       |  36 ++++++++
 doc/man/process_hkl.1       |  51 ++++++++++
 doc/pattern_sim.txt         |  27 ------
 doc/process_hkl.txt         |  46 ---------
 doc/quickstart.txt          |   5 -
 doc/symmetry.txt            |  56 -----------
 11 files changed, 430 insertions(+), 487 deletions(-)
 delete mode 100644 doc/0-INDEX
 delete mode 100644 doc/geometry.txt
 delete mode 100644 doc/indexamajig.txt
 create mode 100644 doc/man/crystfel_geometry.1
 create mode 100644 doc/man/indexamajig.1
 create mode 100644 doc/man/pattern_sim.1
 create mode 100644 doc/man/process_hkl.1
 delete mode 100644 doc/pattern_sim.txt
 delete mode 100644 doc/process_hkl.txt
 delete mode 100644 doc/quickstart.txt
 delete mode 100644 doc/symmetry.txt

diff --git a/doc/0-INDEX b/doc/0-INDEX
deleted file mode 100644
index 7bf5aec7..00000000
--- a/doc/0-INDEX
+++ /dev/null
@@ -1,22 +0,0 @@
-Index to the CrystFEL documentation
------------------------------------
-
-quickstart.txt
-	Basic usage information and suggested workflow.
-
-indexamajig.txt
-pattern_sim.txt
-process_hkl.txt
-	Information about the individual programs and their use.
-
-geometry.txt
-	Information about detector geometry description files.
-
-twin-calculator.pdf
-	Symmetry tables (in two formats).
-
-symmetry.txt
-	How CrystFEL uses symmetry.
-
-examples/
-	Contains example geometry files.
diff --git a/doc/geometry.txt b/doc/geometry.txt deleted file mode 100644 index c215d27c..00000000 --- a/doc/geometry.txt +++ /dev/null @@ -1,112 +0,0 @@ -CrystFEL detector geometry files --------------------------------- - -The detector geometry is taken from a text file rather than hardcoded into the -program. Programs which care about the geometry (particularly indexamajig, -pattern_sim and powder_plot) take an argument "--geometry=" -(or "-g "), where contains the geometry. - -A flexible (and pedantic) representation of the detector has been developed to -avoid all possible sources of ambiguity. CrystFEL's representation of a -detector is broken down into one or more "panels", each of which has its own -camera length, geometry, resolution and so on. Each panel fits into the overall -image taken from the HDF5 file, defined by minimum and maximum coordinates in -the "fast scan" and "slow scan" directions. "Fast scan" refers to the direction -whose coordinate changes most quickly as the bytes in the HDF5 file are moved -through. The coordinates are specified inclusively, meaning that a minimum of 0 -and a maximum of 9 results in a width of ten pixels. Counting begins from zero. -All pixels in the image must be assigned to a panel - gaps are not permitted. - -In the current version, panels are assumed to be perpendicular to the incident -beam and to have their edges parallel. Within these limitations, any geometry -can be constructed. - -The job of the geometry file is to establish a relationship between the array -of pixel values in the HDF5 file, defined in terms only of the "fast scan" and -"slow scan" directions, and the laboratory coordinate system defined as follows: - -+z is the beam direction, and points along the beam (i.e. away from the source) -+y points towards the zenith (ceiling). -+x completes the right-handed coordinate system. 
- -Naively speaking, this means that CrystFEL at the images from the "into the -beam" perspective, but please avoid thinking of things in this way. It's much -better to consider the precise way in which the coordinates are mapped. - -The syntax for a simple geometry might include several entires of the following -form: - -; Lines which should be ignored start with a semicolon. - -; The name before the slash indicates which panel is referred to. You can use -; any name as long as it doesn't start with "bad" (see below). -; The range of pixels in the HDF5 file which correspond to a panel are given: -panel0/min_fs = 0 -panel0/min_ss = 0 -panel0/max_fs = 193 -panel0/max_ss = 184 - -; The readout direction (x, y or 0). If more than three peaks are found in -; the same readout region, they are all discarded. This helps to avoid -; problems due to streaks appearing along the readout direction. -; If the badrow direction is '-', then the culling described above will not -; be performed for this panel. -panel0/badrow_direction = - - -; The resolution (in pixels per metre) for this panel -panel0/res = 9090.91 - -; The characteristic peak separation in pixels. The peak detection will assume -; that genuine peaks are separated by at least this amount. -panel0/peak_sep = 6.0 - -; You need to specify the peak integration radius, which should be a little -; larger than the actual radii of the peaks in pixels -panel0/integr_radius = 2.0 - -; The camera length (in metres) for this panel -; You can also specify the HDF path to a scalar floating point value containing -; the camera length in millimetres. -panel0/clen = /LCLS/detectorPosition - -; For this panel, the fast and slow scan directions correspond to the given -; directions in the lab coordinate system described above, measured in pixels. -panel0/fs = +y -panel0/ss = -x - -; The corner of this panel, defined as the first point in the panel to appear in -; the HDF5 file, is now given a position in the lab coordinate system. 
-; Note that "first point in the panel" is a conceptual simplification. We refer -; to that corner, and to the very corner of the pixel - NOT, for example, to the -; centre of the first pixel to appear. -panel0/corner_x = 429.39 -panel0/corner_y = -17.30 - -; You can suppress indexing for this panel if required, by setting "no_index" to -; "true" or "1". -panel0/no_index = 0 - -; You can also specify bad regions. Peaks with centroid locations within such -; a region will not be integrated nor indexed. Bad regions are specified in -; pixel units, but in the lab coordinate system (i.e. "y" points at the ceiling, -; "z" is the beam direction and "x" completes the right-handed system). -badregionA/min_x = -20.0 -badregionA/max_x = +20.0 -badregionA/min_y = -100.0 -badregionA/max_y = +100.0 - -; If you have a bad pixel mask, you can include it in the HDF5 file as an -; unsigned 16-bit integer array of the same size as the data. You need to -; give its path within each HDF5 file, and two bitmasks. The pixel is -; considered good if all of the bits which are set in "mask_good" are set, AND -; if none of the bits which are set in "mask_bad" are set. -mask = /processing/hitfinder/masks -mask_good = 0x27 -mask_bad = 0x00 - -; Any of the per-panel values can be given without a panel prefix, for example: -peak_sep = 6.0 -; in which case the value will be used for all *subsequent* panels. - - -See the "examples" folder for some examples (look at the ones ending in .geom). diff --git a/doc/indexamajig.txt b/doc/indexamajig.txt deleted file mode 100644 index c7097851..00000000 --- a/doc/indexamajig.txt +++ /dev/null @@ -1,219 +0,0 @@ -indexamajig - bulk indexing and data reduction program ------------------------------------------------------- - -The "indexamajig" program takes as input a list of diffraction image files, -currently in HDF5 format. For each image, it attempts to find peaks and then -index the pattern. 
If successful, it will measure the intensities of the peaks -at Bragg locations and produce a list in the form "h k l I", with some extra -information about the locations of the peaks. - -For minimal basic use, you need to provide the list of diffraction patterns, -the method which will be used to index, a file describing the geometry of the -detector, a PDB file which contains the unit cell which will be used for the -indexing, and that you'd like the program to output a list of intensities for -each successfully indexed pattern. Here is what the minimal use might look like -on the command line: - -indexamajig -i mypatterns.lst -j 10 \ - -g mygeometry.geom \ - --indexing=mosflm,dirax --peaks=hdf5 \ - --cell-reduction=reduce \ - -b myxfel..beam \ - -o test.stream -p mycell.pdb \ - --record=integrated - -More typical use includes all the above, but might also include a noise or -common mode filter (--filter-noise or --filter-cm respectively) if detector -noise causes problems for the peak detection. The HDF5 files might be in some -folder a long way from the current directory, so you might want to specify a -full pathname to be added in front of each filename. You'll probably want to -run more than one indexing job at a time (-j ). - -You can include a table of saturation values for in the HDF5 file, if you have -a method for estimating the intensities of saturated peaks. It goes in -/processing/hitfinder/peakinfo_saturated, and should be an n*3 two dimensional -array, where the first two columns contain fast scan and slow scan coordinates -(in that order) and the third contains the value which should belong in a peak -at the given location. The value will be divided by 5 and spread in a small -cross centred on that location. - -See doc/geometry for information about how to create a geometry description -file. - -You can control what information is included in the output stream using -' --record='. 
Possible flags are: - - pixels Include a list of sums of pixel values within the - integration domain, correcting for individual pixel - solid angles. - - integrated Include a list of reflection intensities, produced by - integrating around predicted peak locations. - - peaks Include peak locations and intensities from the peak - search. - - peaksifindexed As 'peaks', but only if the pattern could be indexed. - - peaksifnotindexed As 'peaks', but only if the pattern could NOT be indexed. - -So, if you just want the integrated intensities of indexed peaks, use -"--record=integrated". If you just want to check that the peak detection is -working, used "--record=peaks". If you want the integrated peaks for the -indexable patterns, but also want to check the peak detection for the patterns -which could not be indexed, you might use -"--record=integrated,peaksifnotindexed" and then use "check-peak-detection" from -the "scripts" folder to visualise the results of the peak detection. - - -Peak Detection --------------- - -You can control the peak detection on the command line. Firstly, you can choose -the peak detection method using "--peaks=". Currently, two possible -values for "method" are available. "hdf5" will take the peak locations from the -HDF5 file. It expects a two dimensional array at /processing/hitfinder/peakinfo -where size in the first dimension is the number of peaks and the size in the -second dimension is three. The first two columns contain the x and y -coordinate (see the "Note about data orientation" in geometry.txt for details), -the third contains the intensity. However, the intensity will be ignored since -the pattern will always be re-integrated using the unit cell provided by the -indexer on the basis of the peaks. - -The "zaef" method uses a simple gradient search after Zaefferer (2000). You can -control the overall threshold and minimum gradient for finding a peak using the -"--threshold" and "--min-gradient" options. 
Both of these have units of "ADU" -(i.e. units of intensity according to the contents of the HDF5 file). - -A minimum peak separation can also be provided in the geometry description file -(see geometry.txt for details). This number serves two purposes. Firstly, -it is the maximum distance allowed between the peak summit and the foot point -(where the gradient exceeds the minimum gradient). Secondly, it is the minimum -distance allowed between one peak and another, before the later peak will be -rejected "by proximity". - -You can suppress peak detection altogether for a panel in the geometry file by -specifying the "no_index" value for the panel as non-zero. - - -Indexing Methods ----------------- - -You can choose between a variety of indexing methods. You can choose more than -one method, in which case each method will be tried in turn until the later cell -reduction step says that the cell is a "hit". Choose from: - - dirax : invoke DirAx - mosflm : invoke MOSFLM (DPS) - -Depending on what you have installed. For "dirax" and "mosflm", you need to -have the dirax or ipmosflm binaries in your PATH. - -Example: --indexing=dirax,mosflm - - -Cell Reduction --------------- - -You can choose from various options for cell reduction with the -"--cell-reduction=" option. The choices are "none", "reduce" and "compare". -This choice is important because all autoindexing methods produce an "ab -initio" estimate of the unit cell (nine parameters), rather than just finding -the orientation of the target cell (three parameters). It's clear that this is -not optimal, and will hopefully be fixed in future versions. - -With "none", the raw cell from the autoindexer will be used. The cell probably -won't match the target cell, but it'll still get used. Use this option to test -whether the patterns are basically "indexable" or not, or if you don't know the -cell parameters. 
In the latter case, you'll need to plot some kind of histogram -of the resulting parameters from the output stream to see which are the most -popular. If you're lucky, this will reveal the true unit cell. - -With "reduce", linear combinations of the raw cell will be checked against the -target cell. If at least one candidate is found for each axis of the target -cell, the angles will be checked to correspondence. If a match is found, this -cell will be used for further processing. This option should generate the most -matches, but might produce spurious results in many cases. The predicted peaks -are always checked to verify that at least 10% of the predicted peaks are close -to peaks located by the peak search. If not, the next candidate unit cell is -tried until there are no more options. - -The "compare" method is like "reduce", but linear combinations are not taken. -That means that the cell must either match or match after a simple permutation -of the axes. This is useful when the target cell is subject to reticular -twinning, such as if one cell axis length is close to twice another. With -"reduce", there is a possibility that the axes might be confused in this -situation. This happens for lysozyme (1VDS), so watch out. - -The tolerance for matching with "reduce" and "compare" is hardcoded as 5% in -the reciprocal axis lengths and 1.5 degrees in the (reciprocal) angles. Cells -from these reduction routines are further constrained to be right-handed. The -unmatched raw cell might be left-handed: CrystFEL doesn't check this for you. -Always using a right-handed cell means that the Bijvoet pairs can be told -apart. - -If the unit cell is centered (i.e. if the space group begins with I, R, C, A or -F), you should be careful when using "compare" for the cell reduction, since -(for example) DirAx will always find a primitive unit cell, and this cell must -be converted to the non-primitive conventional cell from the PDB. 
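The matching tolerances quoted above for "reduce" and "compare" (5% in the reciprocal axis lengths, 1.5 degrees in the reciprocal angles) can be illustrated with a short Python sketch. This is only an illustration and is not taken from the CrystFEL source: the function names, the tuple layout (three reciprocal axis lengths followed by three reciprocal angles in degrees) and the choice to measure the 5% relative to the target cell are assumptions made for this example. Handedness is not checked here.

```python
from itertools import permutations

# Tolerances as stated in the text.
LENGTH_TOLERANCE = 0.05      # fractional, on reciprocal axis lengths
ANGLE_TOLERANCE_DEG = 1.5    # absolute, on reciprocal angles


def cells_match(candidate, target):
    """Compare two cells given as (a*, b*, c*, alpha*, beta*, gamma*).

    Assumption: the length tolerance is taken relative to the target cell.
    """
    lengths_ok = all(abs(c - t) <= LENGTH_TOLERANCE * t
                     for c, t in zip(candidate[:3], target[:3]))
    angles_ok = all(abs(c - t) <= ANGLE_TOLERANCE_DEG
                    for c, t in zip(candidate[3:], target[3:]))
    return lengths_ok and angles_ok


def matches_after_permutation(candidate, target):
    """Hypothetical "compare"-style test: try every ordering of the axes.

    Each angle is tied to the axis opposite it, so lengths and angles are
    permuted together. This sketch ignores handedness.
    """
    lengths, angles = candidate[:3], candidate[3:]
    for order in permutations(range(3)):
        permuted = (tuple(lengths[i] for i in order)
                    + tuple(angles[i] for i in order))
        if cells_match(permuted, target):
            return True
    return False
```

For instance, a candidate whose long axis comes first would fail `cells_match` against a target whose long axis comes last, but pass `matches_after_permutation`.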
- - -Tuning CPU affinities for NUMA hardware ---------------------------------------- - -If you are running indexamajig on a NUMA (non-uniform memory architecture) -machine, a performance gain can sometimes be made by preventing the kernel from -allowing a process or thread to run on a CPU which is distant from the one on -which it started. Distance, in this context, might mean that the CPU is able to -access all the memory visible to the original CPU, but perhaps only relatively -slowly via a cable link. In many cases a group of CPUs will have direct access -to a certain region of memory, and so the process may be scheduled on any CPU in -that group without any penalty. However, scheduling the process to any CPU -outside the group may be slow. When running under Linux, indexamajig is able to -avoid such sub-optimal process scheduling by setting CPU affinities for its -threads. The CPU affinities are also inherited by subprocesses (e.g. MOSFLM or -DirAx). - -To do this usefully, you need to give indexamajig some information about your -hardware's architecture. Specify the size of the CPU groups using -"--cpugroup=". You also need to specify the overall number of CPUs, so that -the program knows when to 'wrap around'. Using "--cpuoffset=", where "n" is -a group number (not a CPU number), allows you to manually skip a few CPUs, which -may be useful if you do not want to use all the available CPUs but want to avoid -running all your jobs on the same ones. - -Note that specifying the above options is NOT the same thing as giving the -number of analyses to run in parallel (the 'number of threads'), which is done -with "-j ". The CPU tuning options provide information to indexamajig about -how to set the CPU affinities for its threads, but it does not specify how many -threads to use. - -Example: 72-core Altix UV 100 machine at the author's institution - -This machine consists of six blades, each containing two 6-core CPUs and some -local memory. 
Any CPU on any blade can access the memory on any other blade, -but the access will be slow compared to accessing memory on the same blade. -When running two instances of indexamajig, a sensible choice of parameters might -be: - -1: --cpus=72 --cpugroup=12 --cpuoffset=0 -j 36 -2: --cpus=72 --cpugroup=12 --cpuoffset=36 -j 36 - -This would dedicate half of the CPUs to one instance, and the other half to the -other. - - -A Note about Unit Cell Settings -------------------------------- - -CrystFEL's core symmetry module only knows about one setting for each unit cell. -You must use the same setting. That means that the unique axis (for cells which -have one) must be "c". - - -"Gotchas" ---------- - -Don't run more than one indexamajig jobs simultaneously in the same working -directory - they'll overwrite each other's DirAx or MOSFLM files, causing subtle -problems which can't easily be detected. diff --git a/doc/man/crystfel_geometry.1 b/doc/man/crystfel_geometry.1 new file mode 100644 index 00000000..0deb058e --- /dev/null +++ b/doc/man/crystfel_geometry.1 @@ -0,0 +1,123 @@ +.\" +.\" Geometry man page +.\" +.\" (c) 2009-2011 Thomas White +.\" +.\" Part of CrystFEL - crystallography with a FEL +.\" + +.TH CRYSTFEL\_GEOMETRY 1 +.SH NAME +CrystFEL detector geometry files + +.SH OVERVIEW + +The detector geometry is taken from a text file rather than hardcoded into the +program. Programs which care about the geometry (particularly indexamajig, +pattern_sim and powder_plot) take an argument "--geometry=" +(or "-g "), where contains the geometry. + +A flexible (and pedantic) representation of the detector has been developed to +avoid all possible sources of ambiguity. CrystFEL's representation of a +detector is broken down into one or more "panels", each of which has its own +camera length, geometry, resolution and so on. 
Each panel fits into the overall +image taken from the HDF5 file, defined by minimum and maximum coordinates in +the "fast scan" and "slow scan" directions. "Fast scan" refers to the direction +whose coordinate changes most quickly as the bytes in the HDF5 file are moved +through. The coordinates are specified inclusively, meaning that a minimum of 0 +and a maximum of 9 results in a width of ten pixels. Counting begins from zero. +All pixels in the image must be assigned to a panel - gaps are not permitted. + +In the current version, panels are assumed to be perpendicular to the incident +beam and to have their edges parallel. Within these limitations, any geometry +can be constructed. + +The job of the geometry file is to establish a relationship between the array +of pixel values in the HDF5 file, defined in terms only of the "fast scan" and +"slow scan" directions, and the laboratory coordinate system defined as follows: + ++z is the beam direction, and points along the beam (i.e. away from the source) ++y points towards the zenith (ceiling). ++x completes the right-handed coordinate system. + +Naively speaking, this means that CrystFEL looks at the images from the "into the +beam" perspective, but please avoid thinking of things in this way. It's much +better to consider the precise way in which the coordinates are mapped. + +The syntax for a simple geometry might include several entries of the following +form: + +; Lines which should be ignored start with a semicolon. + +; The name before the slash indicates which panel is referred to. You can use +; any name as long as it doesn't start with "bad" (see below). +; The range of pixels in the HDF5 file which corresponds to a panel is given: +panel0/min_fs = 0 +panel0/min_ss = 0 +panel0/max_fs = 193 +panel0/max_ss = 184 + +; The readout direction (x, y or 0). If more than three peaks are found in +; the same readout region, they are all discarded. 
This helps to avoid +; problems due to streaks appearing along the readout direction. +; If the badrow direction is '-', then the culling described above will not +; be performed for this panel. +panel0/badrow_direction = - + +; The resolution (in pixels per metre) for this panel +panel0/res = 9090.91 + +; The characteristic peak separation in pixels. The peak detection will assume +; that genuine peaks are separated by at least this amount. +panel0/peak_sep = 6.0 + +; You need to specify the peak integration radius, which should be a little +; larger than the actual radii of the peaks in pixels +panel0/integr_radius = 2.0 + +; The camera length (in metres) for this panel +; You can also specify the HDF path to a scalar floating point value containing +; the camera length in millimetres. +panel0/clen = /LCLS/detectorPosition + +; For this panel, the fast and slow scan directions correspond to the given +; directions in the lab coordinate system described above, measured in pixels. +panel0/fs = +y +panel0/ss = -x + +; The corner of this panel, defined as the first point in the panel to appear in +; the HDF5 file, is now given a position in the lab coordinate system. +; Note that "first point in the panel" is a conceptual simplification. We refer +; to that corner, and to the very corner of the pixel - NOT, for example, to the +; centre of the first pixel to appear. +panel0/corner_x = 429.39 +panel0/corner_y = -17.30 + +; You can suppress indexing for this panel if required, by setting "no_index" to +; "true" or "1". +panel0/no_index = 0 + +; You can also specify bad regions. Peaks with centroid locations within such +; a region will not be integrated nor indexed. Bad regions are specified in +; pixel units, but in the lab coordinate system (i.e. "y" points at the ceiling, +; "z" is the beam direction and "x" completes the right-handed system). 
+badregionA/min_x = -20.0 +badregionA/max_x = +20.0 +badregionA/min_y = -100.0 +badregionA/max_y = +100.0 + +; If you have a bad pixel mask, you can include it in the HDF5 file as an +; unsigned 16-bit integer array of the same size as the data. You need to +; give its path within each HDF5 file, and two bitmasks. The pixel is +; considered good if all of the bits which are set in "mask_good" are set, AND +; if none of the bits which are set in "mask_bad" are set. +mask = /processing/hitfinder/masks +mask_good = 0x27 +mask_bad = 0x00 + +; Any of the per-panel values can be given without a panel prefix, for example: +peak_sep = 6.0 +; in which case the value will be used for all *subsequent* panels. + + +See the "examples" folder for some examples (look at the ones ending in .geom). diff --git a/doc/man/indexamajig.1 b/doc/man/indexamajig.1 new file mode 100644 index 00000000..fcb1afc4 --- /dev/null +++ b/doc/man/indexamajig.1 @@ -0,0 +1,220 @@ +.\" +.\" indexamajig man page +.\" +.\" (c) 2009-2011 Thomas White +.\" +.\" Part of CrystFEL - crystallography with a FEL +.\" + +.TH INDEXAMAJIG 1 +.SH NAME +indexamajig \- bulk indexing and data reduction program +.SH SYNOPSIS +.PP +.B indexamajig +[options] + +.SH DESCRIPTION + +The "indexamajig" program takes as input a list of diffraction image files, +currently in HDF5 format. For each image, it attempts to find peaks and then +index the pattern. If successful, it will measure the intensities of the peaks +at Bragg locations and produce a list in the form "h k l I", with some extra +information about the locations of the peaks. + +For minimal basic use, you need to provide the list of diffraction patterns, +the method which will be used to index, a file describing the geometry of the +detector, a PDB file which contains the unit cell which will be used for the +indexing, and that you'd like the program to output a list of intensities for +each successfully indexed pattern. 
Here is what the minimal use might look like +on the command line: + +indexamajig -i mypatterns.lst -j 10 -g mygeometry.geom --indexing=mosflm,dirax --peaks=hdf5 --cell-reduction=reduce -b myxfel..beam -o test.stream -p mycell.pdb --record=integrated + +More typical use includes all the above, but might also include a noise or +common mode filter (--filter-noise or --filter-cm respectively) if detector +noise causes problems for the peak detection. The HDF5 files might be in some +folder a long way from the current directory, so you might want to specify a +full pathname to be added in front of each filename. You'll probably want to +run more than one indexing job at a time (-j ). + +You can include a table of saturation values in the HDF5 file, if you have +a method for estimating the intensities of saturated peaks. It goes in +/processing/hitfinder/peakinfo_saturated, and should be an n*3 two dimensional +array, where the first two columns contain fast scan and slow scan coordinates +(in that order) and the third contains the value which should belong in a peak +at the given location. The value will be spread in a small cross centred on +that location. + +See doc/geometry for information about how to create a geometry description +file. + +You can control what information is included in the output stream using +' --record='. Possible flags are: + + pixels Include a list of sums of pixel values within the + integration domain, correcting for individual pixel + solid angles. + + integrated Include a list of reflection intensities, produced by + integrating around predicted peak locations. + + peaks Include peak locations and intensities from the peak + search. + + peaksifindexed As 'peaks', but only if the pattern could be indexed. + + peaksifnotindexed As 'peaks', but only if the pattern could NOT be indexed. + +So, if you just want the integrated intensities of indexed peaks, use +"--record=integrated". 
If you just want to check that the peak detection is +working, use "--record=peaks". If you want the integrated peaks for the +indexable patterns, but also want to check the peak detection for the patterns +which could not be indexed, you might use +"--record=integrated,peaksifnotindexed" and then use "check-peak-detection" from +the "scripts" folder to visualise the results of the peak detection. + +.SH PEAK DETECTION + +You can control the peak detection on the command line. Firstly, you can choose +the peak detection method using "--peaks=". Currently, two possible +values for "method" are available. "hdf5" will take the peak locations from the +HDF5 file. It expects a two dimensional array at /processing/hitfinder/peakinfo +where the size in the first dimension is the number of peaks and the size in the +second dimension is three. The first two columns contain the x and y +coordinates (see the "Note about data orientation" in geometry.txt for details), +the third contains the intensity. However, the intensity will be ignored since +the pattern will always be re-integrated using the unit cell provided by the +indexer on the basis of the peaks. + +The "zaef" method uses a simple gradient search after Zaefferer (2000). You can +control the overall threshold and minimum gradient for finding a peak using the +"--threshold" and "--min-gradient" options. Both of these have units of "ADU" +(i.e. units of intensity according to the contents of the HDF5 file). + +A minimum peak separation can also be provided in the geometry description file +(see geometry.txt for details). This number serves two purposes. Firstly, +it is the maximum distance allowed between the peak summit and the foot point +(where the gradient exceeds the minimum gradient). Secondly, it is the minimum +distance allowed between one peak and another, before the later peak will be +rejected "by proximity". 
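The second role of the peak separation, rejecting a later peak "by proximity", can be sketched in Python as follows. This is a hedged illustration only: the function name, the (fs, ss) tuple layout, the use of a strict "less than" comparison and the decision to compare only against peaks that were already accepted are assumptions, not details confirmed by the text.

```python
import math


def reject_by_proximity(peaks, peak_sep):
    """Return the peaks which survive proximity rejection.

    peaks    -- list of (fs, ss) positions in pixels, in the order found
    peak_sep -- minimum allowed separation in pixels (e.g. 6.0)

    Assumption: a peak is dropped if it lies closer than peak_sep to any
    earlier peak which was itself kept.
    """
    accepted = []
    for fs, ss in peaks:
        too_close = any(math.hypot(fs - kept_fs, ss - kept_ss) < peak_sep
                        for kept_fs, kept_ss in accepted)
        if not too_close:
            accepted.append((fs, ss))
    return accepted
```

With a separation of 6.0 pixels, a peak found at (12, 11) after one at (10, 10) would be dropped under these assumptions, while one at (30, 30) would be kept.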
+ +You can suppress peak detection altogether for a panel in the geometry file by +specifying the "no_index" value for the panel as non-zero. + + +.SH INDEXING METHODS + +You can choose between a variety of indexing methods. You can choose more than +one method, in which case each method will be tried in turn until the later cell +reduction step says that the cell is a "hit". Choose from: + + dirax : invoke DirAx + mosflm : invoke MOSFLM (DPS) + +Which of these are available depends on what you have installed. For "dirax" and "mosflm", you need to +have the dirax or ipmosflm binaries in your PATH. + +Example: --indexing=dirax,mosflm + +.SH CELL REDUCTION + +You can choose from various options for cell reduction with the +"--cell-reduction=" option. The choices are "none", "reduce" and "compare". +This choice is important because all autoindexing methods produce an "ab +initio" estimate of the unit cell (nine parameters), rather than just finding +the orientation of the target cell (three parameters). It's clear that this is +not optimal, and will hopefully be fixed in future versions. + +With "none", the raw cell from the autoindexer will be used. The cell probably +won't match the target cell, but it'll still get used. Use this option to test +whether the patterns are basically "indexable" or not, or if you don't know the +cell parameters. In the latter case, you'll need to plot some kind of histogram +of the resulting parameters from the output stream to see which are the most +popular. If you're lucky, this will reveal the true unit cell. + +With "reduce", linear combinations of the raw cell will be checked against the +target cell. If at least one candidate is found for each axis of the target +cell, the angles will be checked for correspondence. If a match is found, this +cell will be used for further processing. This option should generate the most +matches, but might produce spurious results in many cases. 
The predicted peaks +are always checked to verify that at least 10% of the predicted peaks are close +to peaks located by the peak search. If not, the next candidate unit cell is +tried until there are no more options. + +The "compare" method is like "reduce", but linear combinations are not taken. +That means that the cell must either match or match after a simple permutation +of the axes. This is useful when the target cell is subject to reticular +twinning, such as if one cell axis length is close to twice another. With +"reduce", there is a possibility that the axes might be confused in this +situation. This happens for lysozyme (1VDS), so watch out. + +The tolerance for matching with "reduce" and "compare" is hardcoded as 5% in +the reciprocal axis lengths and 1.5 degrees in the (reciprocal) angles. Cells +from these reduction routines are further constrained to be right-handed. The +unmatched raw cell might be left-handed: CrystFEL doesn't check this for you. +Always using a right-handed cell means that the Bijvoet pairs can be told +apart. + +If the unit cell is centered (i.e. if the space group begins with I, R, C, A or +F), you should be careful when using "compare" for the cell reduction, since +(for example) DirAx will always find a primitive unit cell, and this cell must +be converted to the non-primitive conventional cell from the PDB. + + +.SH TUNING CPU AFFINITIES FOR NUMA HARDWARE + +If you are running indexamajig on a NUMA (non-uniform memory architecture) +machine, a performance gain can sometimes be made by preventing the kernel from +allowing a process or thread to run on a CPU which is distant from the one on +which it started. Distance, in this context, might mean that the CPU is able to +access all the memory visible to the original CPU, but perhaps only relatively +slowly via a cable link. 
In many cases a group of CPUs will have direct access +to a certain region of memory, and so the process may be scheduled on any CPU in +that group without any penalty. However, scheduling the process to any CPU +outside the group may be slow. When running under Linux, indexamajig is able to +avoid such sub-optimal process scheduling by setting CPU affinities for its +threads. The CPU affinities are also inherited by subprocesses (e.g. MOSFLM or +DirAx). + +To do this usefully, you need to give indexamajig some information about your +hardware's architecture. Specify the size of the CPU groups using +"--cpugroup=<n>". You also need to specify the overall number of CPUs, so that +the program knows when to 'wrap around'. Using "--cpuoffset=<n>", where "n" is +a group number (not a CPU number), allows you to manually skip a few CPUs, which +may be useful if you do not want to use all the available CPUs but want to avoid +running all your jobs on the same ones. + +Note that specifying the above options is NOT the same thing as giving the +number of analyses to run in parallel (the 'number of threads'), which is done +with "-j <n>". The CPU tuning options provide information to indexamajig about +how to set the CPU affinities for its threads, but it does not specify how many +threads to use. + +Example: 72-core Altix UV 100 machine at the author's institution + +This machine consists of six blades, each containing two 6-core CPUs and some +local memory. Any CPU on any blade can access the memory on any other blade, +but the access will be slow compared to accessing memory on the same blade. +When running two instances of indexamajig, a sensible choice of parameters might +be: + +1: --cpus=72 --cpugroup=12 --cpuoffset=0 -j 36 +2: --cpus=72 --cpugroup=12 --cpuoffset=36 -j 36 + +This would dedicate half of the CPUs to one instance, and the other half to the +other.
+ + +.SH A NOTE ABOUT UNIT CELL SETTINGS + +CrystFEL's core symmetry module only knows about one setting for each unit cell. +You must use the same setting. That means that the unique axis (for cells which +have one) must be "c". + + +.SH KNOWN BUGS + +Don't run more than one indexamajig jobs simultaneously in the same working +directory - they'll overwrite each other's DirAx or MOSFLM files, causing subtle +problems which can't easily be detected. diff --git a/doc/man/pattern_sim.1 b/doc/man/pattern_sim.1 new file mode 100644 index 00000000..d53aeae5 --- /dev/null +++ b/doc/man/pattern_sim.1 @@ -0,0 +1,36 @@ +.\" +.\" pattern_sim man page +.\" +.\" (c) 2009-2011 Thomas White +.\" +.\" Part of CrystFEL - crystallography with a FEL +.\" + +.TH PATTERN\_SIM 1 +.SH NAME +pattern\_sim \- Simulation of nanocrystallographic diffraction patterns +.SH SYNOPSIS +.PP +.B pattern\_sim +[options] + +.SH DESCRIPTION + +pattern_sim does not know about symmetry, so your input reflection list +(give with "-i") must be expanded. You can do this with: + +$ get_hkl -i myfile.hkl -o output.hkl -y mypointgroup -e 1 + +get_hkl does not currently understand symmetry, which means you'll have to +expand any molecular model (the PDB) out to P1 to get the correct results. You +can achieve that, for example, by loading it into Mercury, turning on "Packing" +and re-saving. Alternatively, you can do this using CCP4 with a command like: + +$ echo symgen P63 | pdbset xyzin model.pdb xyzout model-P1.pdb + +While on this subject, you might also want to include hydrogens in the model +using something like: +$ echo HYDROGENS APPEND | hgen xyzin model.pdb xyzout model-with-H.pdb + +Please be sure to read the "Note about Unit Cell Settings" in the documentation +for indexamajig. 
diff --git a/doc/man/process_hkl.1 b/doc/man/process_hkl.1 new file mode 100644 index 00000000..6c626e31 --- /dev/null +++ b/doc/man/process_hkl.1 @@ -0,0 +1,51 @@ +.\" +.\" process_hkl man page +.\" +.\" (c) 2009-2011 Thomas White +.\" +.\" Part of CrystFEL - crystallography with a FEL +.\" + +.TH PROCESS\_HKL 1 +.SH NAME +process\_hkl \- Monte Carlo merging program +.SH SYNOPSIS +.PP +.B process\_hkl +-i mypatterns.stream -o mydata.hkl -y mypointgroup [options] + +.SH DESCRIPTION + +This program takes as input the data stream from "indexamajig". It merges the +many individual intensities together to form a single list of reflection +intensities which are useful for crystallography. + +Typical usage is of the form: + +$ process_hkl -i mypatterns.stream -o mydata.hkl -y mypointgroup + +.SH CHOICE OF POINT GROUP FOR MERGING + +One of the main features of serial crystallography is that the orientations of +individual crystals are random. That means that the orientation of each +crystal must be determined independently, with no information about its +relationship to the orientation of crystals in other patterns (as would be the +case for a rotation series of patterns). + +Some Laue classes are merohedral. This means that the orientation will have an +ambiguity, but this time more serious. The two (or more) possible +orientations could be called "twins", but the mechanism of their formation is +somewhat different to the conventional use of the term. In these cases, you +will need to merge according to the point group corresponding holohedral Laue +class. + +You can also tell process_hkl the "apparent" symmetry, which is the symmetry as +far as whatever produced the stream was concerned. In the case of most indexing +algorithms, this will be the corresponding holohedral point group (not the +Laue class nor the holohedral Laue class). 
If you use the "-a" option to give +this information, process_hkl will try to resolve the remaining orientational +ambiguities to get from the apparent symmetry to the true symmetry (given with +"-y"). Currently, it won't do a very good job of it. + +The document twin-calculator.pdf contains more detailed information about this +issue, as well as tables which contain all the required information. diff --git a/doc/pattern_sim.txt b/doc/pattern_sim.txt deleted file mode 100644 index cbbcb749..00000000 --- a/doc/pattern_sim.txt +++ /dev/null @@ -1,27 +0,0 @@ -pattern_sim does not know about symmetry, so your input reflection list -(give with "-i") must be expanded. You can do this with: - -$ get_hkl -i myfile.hkl -o output.hkl -y mypointgroup -e 1 - - - -The symmetry of the molecular model (the space group) ------------------------------------------------------ - -get_hkl does not currently understand symmetry, which means you'll have to -expand any molecular model (the PDB) out to P1 to get the correct results. You -can achieve that, for example, by loading it into Mercury, turning on "Packing" -and re-saving. Alternatively, you can do this using CCP4 with a command like: - -$ echo symgen P63 | pdbset xyzin model.pdb xyzout model-P1.pdb - -While on this subject, you might also want to include hydrogens in the model -using something like: -$ echo HYDROGENS APPEND | hgen xyzin model.pdb xyzout model-with-H.pdb - - -A Note about Unit Cell Settings -------------------------------- - -Please be sure to read the "Note about Unit Cell Settings" in the documentation -for indexamajig. diff --git a/doc/process_hkl.txt b/doc/process_hkl.txt deleted file mode 100644 index 3c5e40f9..00000000 --- a/doc/process_hkl.txt +++ /dev/null @@ -1,46 +0,0 @@ -process_hkl - data scaling and merging program ----------------------------------------------- - -This program takes as input the data stream from "indexamajig". 
It merges the -many individual intensities together to form a single list of reflection -intensities which are useful for crystallography. - -Typical usage is of the form: - -$ process_hkl -i mypatterns.stream -o mydata.hkl -y mypointgroup - - -How to choose the point group ------------------------------ - -One of the main features of serial crystallography is that the orientations of -individual crystals are random. That means that the orientation of each -crystal must be determined independently, with no information about its -relationship to the orientation of crystals in other patterns (as would be the -case for a rotation series of patterns). - -Some Laue classes are merohedral. This means that the orientation will have an -ambiguity, but this time more serious. The two (or more) possible -orientations could be called "twins", but the mechanism of their formation is -somewhat different to the conventional use of the term. In these cases, you -will need to merge according to the point group corresponding holohedral Laue -class. - -You can also tell process_hkl the "apparent" symmetry, which is the symmetry as -far as whatever produced the stream was concerned. In the case of most indexing -algorithms, this will be the corresponding holohedral point group (not the -Laue class nor the holohedral Laue class). If you use the "-a" option to give -this information, process_hkl will try to resolve the remaining orientational -ambiguities to get from the apparent symmetry to the true symmetry (given with -"-y"). Currently, it won't do a very good job of it. - -Commit number 5cdcaad6277c on the 13th of October 2010 altered indexamajig such -that it always finds a right-handed unit cell. That means that no ambiguity due -to inversion exists in streams produced by versions later than that. For -streams produced by copies of indexamajig older than that, you DO need to use -the corresponding holohedral Laue class (not point group) as the apparent -symmetry (with -a). 
However, since the ambiguity resolution used by process_hkl -doesn't really work, this detail is somewhat academic. - -The document twin-calculator.pdf contains more detailed information about this -issue, as well as tables which contain all the required information. diff --git a/doc/quickstart.txt b/doc/quickstart.txt deleted file mode 100644 index 5af7120e..00000000 --- a/doc/quickstart.txt +++ /dev/null @@ -1,5 +0,0 @@ -Quick start guide for CrystFEL ------------------------------- - -So, you have a folder full of thousands of diffraction patterns, and you want to -analyse them. You've come to the right place. diff --git a/doc/symmetry.txt b/doc/symmetry.txt deleted file mode 100644 index c190f8ee..00000000 --- a/doc/symmetry.txt +++ /dev/null @@ -1,56 +0,0 @@ -How CrystFEL handles symmetry ------------------------------ - -Most programs in CrystFEL understand point group symmetry. The exception is -"get_hkl", which you can read about below. You give the point group following -the "-y" option to the programs. - -Please read doc/process_hkl for important information on how symmetry is used -during the indexing and merging procedures. It's important to understand how -this works before, for example, trying to merge a dataset. - -Symmetry definitions are included in src/symmetry.c. Point group definitions -are required for merging and the display of merged results, but space groups are -not taken into account since merging does not care about systematic absences - -as far as CrystFEL is concerned, systematic absences are just measurements -which happen to have values of zero. Each space group belongs to exactly one -point group, which you can look up in the International Tables for X-Ray -Crystallography. Alternatively, "twin-calculator.pdf" in the same directory as -this file lists all the space groups according to point group, Laue class and -holohedry. 
- - -Adding a new point group ------------------------- - -Point groups are being added here as they are required, so it's likely that the -exact one you want hasn't been added yet. Here's how to add a new one by -editing src/symmetry.c. - -First, expand the check_cond() function to include a description of the -asymmetric reciprocal unit cell for the point group. Every reflection in the -whole of reciprocal space must map onto exactly one reflection in the asymmetric -unit cell so defined. The asymmetric cell is usually defined with positive h, k -and l, but it doesn't really matter. Working out the required condition means -visualising the cell and taking care to properly handle situations such as the -(000) reflection. Get this right, otherwise you'll go crazy when it breaks in -weird ways. - -Next, expand the num_general_equivs() function. Given a point group, this -function must return the number of equivalent reflections for a general -reflection, including the input reflection. High-symmetry reflections (usually -ones with zeroes in their indices) have fewer equivalents, but the num_equivs() -function will work this out for you. - -Finally, add the new point group to the get_general_equiv() function. This -function takes a set of Miller indices, a point group and an index "n", and -returns (by reference) the indices of the "n"th equivalent reflection. You just -have to worry about the general position, because get_equiv() will work out the -special positions for you. get_general_equiv() must return the original indices -when idx=0. - -If you want the new point group to be used for simulation on the GPU, you will -also need to modify src/diffraction-gpu.c and data/diffraction.cl. Choose a -simple capitalised name for the point group and add it to the list of OpenCL -preprocessor definitions in setup_gpu(). Then add a corresponding list of -equivalents following the established pattern in molecule_factor(). That's it. -- cgit v1.2.3
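
Note on the removed "Adding a new point group" text: the three pieces it describes (the count of general equivalents, the "n"th general equivalent, and the mapping of every reflection onto one representative in the asymmetric unit) can be illustrated with a minimal sketch. This is not the code in src/symmetry.c. The table GENERAL_EQUIVS, the helper names get_equivs() and get_asymm(), and the choice of "largest equivalent in lexicographic order" as the representative are all assumptions made for illustration, whereas the removed text describes an explicit condition in check_cond(). Only a few point groups whose operations are pure sign changes are covered.

```python
from itertools import product

# Hypothetical table: each operation is a sign pattern applied to (h, k, l).
# Entry 0 is always the identity, so index 0 returns the original indices.
GENERAL_EQUIVS = {
    "1": [(1, 1, 1)],
    "-1": [(1, 1, 1), (-1, -1, -1)],
    "222": [(1, 1, 1), (-1, -1, 1), (-1, 1, -1), (1, -1, -1)],
    "mmm": list(product((1, -1), repeat=3)),
}


def num_general_equivs(point_group):
    """Number of equivalents of a general reflection, including itself."""
    return len(GENERAL_EQUIVS[point_group])


def get_general_equiv(h, k, l, point_group, idx):
    """Return the idx-th general equivalent of (h, k, l)."""
    sh, sk, sl = GENERAL_EQUIVS[point_group][idx]
    return (sh * h, sk * k, sl * l)


def get_equivs(h, k, l, point_group):
    """Distinct equivalents; special positions collapse to fewer entries."""
    seen = []
    for idx in range(num_general_equivs(point_group)):
        equiv = get_general_equiv(h, k, l, point_group, idx)
        if equiv not in seen:
            seen.append(equiv)
    return seen


def get_asymm(h, k, l, point_group):
    """Map a reflection to one representative (illustrative rule: the
    lexicographically largest equivalent)."""
    return max(get_equivs(h, k, l, point_group))


if __name__ == "__main__":
    print(get_equivs(1, 0, 0, "mmm"))
    print(get_asymm(-1, -2, 3, "222"))
```

Picking a canonical equivalent avoids writing a per-group condition by hand, at the cost of enumerating all equivalents for every lookup.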