aboutsummaryrefslogtreecommitdiff
path: root/libcrystfel/doc/coding.md
blob: 6e84e4fe7e9b3ba28a00420a6f966302e6c90136 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
\page coding CrystFEL coding standards

### Licensing

CrystFEL is distributed under the terms of the GNU General Public License
version 3 or higher.  Contributions are very welcome provided they also use this
license.  If your code is not already licensed compatibly, or if the license if
not clear, we will ask you to re-license it.

Whenever you edit a source file, don't forget to update the copyright dates at
the top.  Add your name and email address if they're not there already.  Be sure
to add your name to the 'AUTHORS' file in the top level folder, as well.

### Scope

The remainder of these rules apply to C code in libcrystfel and the core
CrystFEL programs.  There are currently no specific guidelines for Python,
Perl, shell script, CMake files or other parts of the codebase.

### Indentation

*Indentation* is done with *tabs* and *alignment* is done with spaces.
For example:

    int function(int a, int b)
    {
    <-tab-->int p;  /* <--- Tab character used to indent code inside function */
            char *str;

    <-tab-->do_something(a, "A long string which takes up a lot of space",
    <-tab-->.............str, &p);   /* <--- spaces used to align with bracket */
    }

**Rationale:** Using tab characters makes it easy to align code correctly,
because you can't slip out of alignment by one character.  It also makes the
code look neat whatever width you configure your editor to display tabs as.

### Wrap width

Code should fit into 80 columns (counting tab characters as 8 columns) wherever
possible, with exceptions only in cases where not doing so would cause line
breaks at ugly positions, such as straight after an *opening* bracket.  The
absolute maximum allowable line length is 120 columns, with no exceptions
whatsoever.

For example, this is preferred because it fits into 80 columns (just!):

    very_long_variable_name = very_long_function_name(even_longer_long_variablename,
                                                      var2, var3);

However, if the variable names are even longer, then the following is preferred,
even though it goes over 80 columns:

    very_very_long_variable_name = very_long_function_name(even_longer_long_variablename,
                                                           var2, var3);

This is preferred over the following type of thing, which is just plain ugly:

    very_very_long_variable_name = very_long_function_name(
                                                   even_longer_long_variablename,
                                                           var2, var3);

However, see the point below regarding variable names.

**Rationale:** Aside from ensuring it's always possible to display two parts of
the code side-by-side with a reasonable font size, this is not really about how
the code looks.  Rather, it is to discourage excessive levels of nesting and
encourage smaller, more easily understood functions.  I don't think I've yet
seen any examples of code where the intelligibility could not be improved while
simultaneously reducing the number of levels of indentation.

### Variable, function and type names

Shorter variable and function names, with explanatory comments, are preferred
over long variable names:

    double wavelength_in_angstrom_units;  /* <--- not preferred */

Preferred:

    /* The wavelength in Angstrom units */
    double wl;

***Rationale:*** Shorter variable names make it easier to see the overall
logical or mathematical structure.

"Snake case" is preferred over "camel case":

    int model_option;   /* <--- Preferred */
    int modelOption;    /* <--- Discouraged */

Capitalisation is used for complex type names:

    UnitCell *cell;

### Nested loops

When performing a two or three dimensional iteration which could be considered
as one larger iteration, for example over image coordinates or Miller indices,
it is acceptable to indent as follows:

    for ( h=-10; h<+10; h++ ) {
    for ( k=-10; k<+10; k++ ) {
    for ( l=-10; l<+10; l++ ) {

            /* Do stuff */

    }
    }
    }

In this case, there must be no lines at all, not even blank ones, between each
of the "for" statements and also between each of the final closing braces.

***Rationale:*** Large multi-dimensional loops are common in scientific code,
with more than three levels not at all uncommon.  Any reasonable limit on code
width would be overshot by indenting each level separately.

### Comments

C-style comments are preferred over C++-style, even for one-line comments:

    /* Preferred type of comment */

    // Discouraged type of comment

***Rationale:***  CrystFEL is written in C, not C++.

Multi-line comments should have stars down one side:

    /* This is a multiple-line comment.
     * Here is the second line of the comment */

It's also acceptable to put the closing slash on its own line:

    /* Here is another multiple-line comment.
     * Here is the second line of the comment
     */

### Bracket positions

The opening brace of a function should be on its own line:

    void somefunction(int someparam)
    {
            /* Here is the body of the function */
    }

The opening brace of an if statement should be on the same line as the 'if',
unless the condition is very long, especially if it spans multiple lines:

            if ( a < b ) {
                    do_something(a);
            } else {
                    do_other_something(a);
            }

            if ( a_very_long_condition == structure->wibble.value
              && other_very_long_condition )
            {
                    do_something_completely_different(someparam);
            }

***Rationale:*** This keeps the appearance of 'if' statements compact instead of
spread out, while allowing some visual separation between long conditions and
the conditional code.

### Space around parantheses

Parentheses should have spaces after them in 'if' statements, but not in
function calls. Function arguments should have spaces after the comma.  There
should be no space between the function name and the opening bracket.  That
means:

    if ( something < 3 ) {
            do_something(a, b, c);
    }

or:

    if ( h>3 && k<3 && l==-1 ) {
            do_something(h, k, l);
    }

instead of:

    if (something<3) {
            do_something (a,b,c);
    }

***Rationale:*** I find this guideline helps to encourage good grouping of
logical statements.  Some flexibility is allowed here, though.

### Whitespace

No trailing whitespace at all, even in empty lines.

***Rationale:*** Invisible characters do nothing other than generate VCS
conflicts.

### Cleverness

Transparent, easily understood solutions are preferred over faster ones, except
where you can demonstrate that the code is performance-critical and the benefit
is significant.  Even in that case, copious comments should be provided.

Use of undefined behaviour, even if "it always works", is absolutely forbidden.

### Git/VCS usage

This style of commit message is preferred:

> Strip out libfrosticle references and add new function model

This style of commit message is discouraged:

> Stripped out libfrosticle references, and added new function model

**Rationale:** this encourages you to think in terms of small, self-contained
changes to the code, rather than successive versions with many different changes
combined together.  It also matches the conventions used by most other projects.