DØ C++ Language Coding Guidelines

Herb Greenlee, John Hobbs, Alan Jonckheere, Laura Paterno
Fermi National Accelerator Laboratory

Marc Paterno
University of Rochester

Scott Snyder
Brookhaven National Laboratory

Gordon Watts
Brown University

17 October 1997

1. Introduction

This coding style guide was developed to help write C++ code in the DØ Code Development Environment. We strongly recommend adherence to these guidelines. Please realize that this is an evolving document and the guidelines will change over time.


2. Files

2.1. File layout

Each C++ program is a collection of implementation files (*.cpp) and header files (*.hpp). The executable program is obtained by compiling and linking its constituent files. Each file has certain public information, such as the declarations of public classes and functions that are implemented in it. It also has private information, such as functions that are not called from outside, variables that cannot be inspected or modified from outside, and maybe local type declarations. Unfortunately, C++ is not very supportive in this regard: all global data and functions are public by default. We require that you declare all functions and global data as static unless they are exported to other files.

The public interface of a class is specified in a header file with the same name as the implementation .cpp file but with the extension .hpp. It contains extern declarations of all exported global variables, prototypes of all exported functions and public type and class definitions.

Implementation and header files are organized as follows:

The functions and operations sections only appear in the implementation (.cpp) files with the exception of inline functions.

2.2. Header comment

Each module starts with a header comment block in the format below.

//
// File: wm.cpp
// Purpose: widget manipulation
// Created: 01-JAN-1990 by Cay S. Horstmann
//
// Comments:
// General comments go here, for example:
// - command line options
// - file formats
// - rules and conventions
// - descriptions of the main data structures
//
// Revisions:
// 21-FEB-1996 C. Horstmann
// Added ...
//
...

Notice that whenever a name appears, the full last name is given. Always specify the last name of every author and contributor to the code.

2.3. Included header files

The next section lists all included header files.

// Include files
#include <stdlib.h>
#include <string>
#include "cw.hpp"
#include "wm.hpp"

Sort header files in the order shown:

Follow the convention of using <...> for files stored in the "standard place" of the compiler (e.g. \borlandc\include) and "..." for files stored in the local directory. Do not embed absolute path names, e.g.

#include <c:/borlandc/include/stdlib.h> // DON'T !!!

Each include file (xxx.hpp) should begin with a #ifndef INC_XXX, followed by a #define of INC_XXX, and end with the matching #endif. In addition, each #include inside the .hpp file must itself be surrounded by a #ifndef test on the guard symbol of the included file. This applies to system files as well. The extra test lets the preprocessor skip a file that has already been included without reopening it, which reduces compilation times.
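
The guard scheme described above can be sketched as follows. This is only an illustration: the function widget_label is hypothetical, and defining INC_STRING by hand after including the system header is our assumption, since a system file does not define such a symbol itself. A local header such as cw.hpp would define its own INC_CW and would be wrapped in the same way.

```cpp
//
// File: wm.hpp
// Purpose: widget manipulation (illustrative sketch only)
//
#ifndef INC_WM
#define INC_WM

// Redundant guard around a system header. <string> does not define
// INC_STRING itself, so this sketch defines it by hand (an assumption).
#ifndef INC_STRING
#include <string>
#define INC_STRING
#endif

inline std::string widget_label(const std::string &name)
//
// Purpose: Build a display label for a widget (hypothetical function)
//
// Inputs:
// name - the widget name
//
// Returns: the name prefixed with "widget:"
//
{
   return "widget:" + name;
}

#endif // INC_WM
```

Including this file a second time has no effect, because INC_WM is already defined when the preprocessor reaches the first line.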

2.5. The class section

This section contains the definitions of classes, enumerations, and types, in any order.

// CLASS definitions
enum Weekday { MON, TUE, WED, THU, FRI, SAT, SUN };

typedef int (*DateCompareFun)(Date, Date);

class Date
//
// Purpose: Store dates and perform date arithmetic
//
{
public:
//...
};

2.6. The globals section

Then come the definitions of global constants and variables. All global variables that are not declared in the header file must be declared as static, i.e. private to the module. We strongly caution the programmer against using global variables. In our experience, having more than three global variables in a module is excessive.

// GLOBALS

// the number of days in every month (except for leap years)
static const int days_per_month[] = { 31, 28, 31, 30, ... };

const int JULYEAR0 = -4713; // Julian day 0 = Jan. 1, -4713

2.7. The functions section

This section lists all functions of the module, that is, the functions that are not operations of a class. First, list prototypes (if any), then the functions. All functions without a prototype in the header file must be declared static, i.e. private to the module. You may sort the functions in any order, but we recommend that you sort them in reverse order of call. In particular, the file containing main has main last. Always prototype your functions. We list the functions before the operations of classes because the operations need to know about the static functions, whereas the functions already know the operations from the class definitions. In an object-oriented program, it would be surprising to see more than a handful of functions per file.

// Function Definitions
static long dat2jul(int d, int m, int y)
//
// Purpose: Convert calendar date into Julian day
//
// Inputs:
// d - the day
// m - the month
// y - the year
//
// Returns: The Julian day number that begins at
// noon of the given calendar date.
//
// Comments: This algorithm is from Press et al.,
// Numerical Recipes in C, 2nd ed.,
// Cambridge University Press 1992
//
{
// ...
}

// Overload >> operator for input from stream
istream& operator>>(istream& is, Date& date)
//
// Purpose: Read a date from a stream
//
// Inputs:
// is - the stream
// date(OUT) - the date that is read in
//
// Returns:
// is - the stream
//
{
// ...
}

// Main program
int main(int argc, char* argv[])
{
// ...
}

2.8. The operations section

This section lists all operations of classes. Sort them by the type of operation being performed. First should come all the constructors and the destructor. Next should come the accessors (getters) and mutators (setters). After this should come all the other methods for the class.

// Constructors and Destructors

Date::Date(int d, int m, int y)
//
// Purpose: Construct a date object
//
// Inputs:
// d - the day
// m - the month
// y - the year
//
: _day(d), _month(m), _year(y)
{}

// Accessors

int Date::day() const
//
// Purpose: Return the current day value
// Returns: _day
//
{
return(_day);
}

// Methods

long Date::days_between(Date b) const
//
// Purpose: to find the number of days between
// this date and b.
//
// Inputs:
// b - a date
// Returns: The number of days between this date
// and b
//
{
// ...
}

3. Classes

3.1. Class header comment

Each class declaration starts with a header comment.

class name[:public base class[,public base class]*]
//
// Purpose:
//
// States:
//
{
// ...
};

The Purpose comment is mandatory. Describe the class in some way. You may phrase the description for an object of the class: "A mailbox stores and plays mail messages".

The States comment is very useful in understanding those classes that have clearly defined states that influence their behavior. For example, a bounded stack has two special states "empty" and "full" that change the way that pop and push behave. Omitting the States comment implies that the class has no states that are observable through the public interface.
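
As an illustration of the States comment, here is a minimal sketch of the bounded stack mentioned above. The class name, its capacity and the choice to report failure through a bool return value are our assumptions, not part of any existing library.

```cpp
class BoundedStack
//
// Purpose: A bounded stack stores up to CAPACITY integers
//
// States:
//    empty - pop() removes nothing and returns false
//    full  - push() stores nothing and returns false
//
{
public:
   enum { CAPACITY = 4 };      // hypothetical capacity

   BoundedStack();             // constructs an empty stack

   bool push(int value);       // false if the stack was full
   bool pop();                 // false if the stack was empty

   int top() const;            // precondition: !is_empty()
   int size() const;
   bool is_empty() const;
   bool is_full() const;

private:
   int _item[CAPACITY];
   int _count;                 // number of items currently stored
};

BoundedStack::BoundedStack()
   : _item(), _count(0)
{}

bool BoundedStack::push(int value)
{
   bool stored = !is_full();
   if (stored) {
      _item[_count] = value;
      _count++;
   }
   return stored;
}

bool BoundedStack::pop()
{
   bool removed = !is_empty();
   if (removed) {
      _count--;
   }
   return removed;
}

int BoundedStack::top() const
{
   return _item[_count - 1];
}

int BoundedStack::size() const
{
   return _count;
}

bool BoundedStack::is_empty() const
{
   return _count == 0;
}

bool BoundedStack::is_full() const
{
   return _count == CAPACITY;
}
```

The two states named in the comment are exactly the ones a caller can observe through is_empty(), is_full() and the return values of push() and pop().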

3.2. Base classes

We only use public inheritance. Use aggregation instead of private inheritance, that is, replace

class C : private B {
};

with

class C {
private:
B _b;
};

This requires no change in the interface of C.
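
A slightly fuller sketch of the same replacement follows. Buffer and Log are hypothetical stand-ins for B and C. Log is implemented with a Buffer, but clients only ever see the public operations of Log.

```cpp
#include <string>

class Buffer
//
// Purpose: Hypothetical stand-in for the class B in the text
//
{
public:
   Buffer();
   void append(const std::string &text);
   int length() const;

private:
   std::string _text;
};

Buffer::Buffer()
   : _text()
{}

void Buffer::append(const std::string &text)
{
   _text += text;
}

int Buffer::length() const
{
   return static_cast<int>(_text.length());
}

class Log
//
// Purpose: Hypothetical stand-in for the class C in the text.
//          A log is implemented with a Buffer but is not a Buffer.
//
{
public:
   Log();
   void record(const std::string &message);   // stores message and a newline
   int size() const;                          // characters stored so far

private:
   Buffer _buffer;   // aggregation replaces ": private Buffer"
};

Log::Log()
   : _buffer()
{}

void Log::record(const std::string &message)
{
   _buffer.append(message);
   _buffer.append("\n");
}

int Log::size() const
{
   return _buffer.length();
}
```

The operations of Log forward to _buffer explicitly, which costs a few lines but makes every use of the implementation class visible.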

There is protected inheritance, but we don't allow it.

3.3. Class layout

First list the public section, then the private section. (If you have a protected section, put it in between the two. We disallow protected data and are neutral on protected operations.) Within each section, order features as follows:

This order is partially psychological, partially pragmatic. For the reader of a class, the public interface is the most important aspect and therefore must come first. The private section may change and should be of no great concern to most class users. In each section, we order items to answer the following questions:

We need to place the local types before these items because some of them may refer to them. And we place memory management functions (copy constructor, destructor, assignment operator) at the end because they are uninteresting for the class user. (N.B. Do not supply these functions unless the default ones supplied by the compiler are inadequate.)

Each public feature and each data field that is not totally obvious from its name should have a one-line comment. You can omit comments for patently obvious functions like print(). More detailed descriptions will be found in the operations section.

class String
//
// Purpose: character strings
//
{
public:
// maximum number of positions in string
enum {NPOS = INT_MAX};

String(); // constructs an empty string
String(const char s[]);
String(char ch);

char& operator[](int i);
void set(int i, char ch); // change a character
// ...

int length() const;
bool is_null() const;
char get(int i) const; // get a character
String substr(int from, int n = NPOS) const;
String operator+(const String&) const; // concatenate

friend String operator+(const char a[], const String &b);
static int compare(const String& a, const String& b);

String(const String&);
const String& operator=(const String&);
~String();

private:
String(size_t len, const char s[], size_t slen);
void detach();
void unique();
char* _str;
};

3.4. Friends

The C++ compiler does not care where friend declarations are placed inside a class definition. We follow this rule:

A friend function that is used by the public is placed at the end of the public section. A friend declaration that serves for the implementation of the class is placed at the end of the private section. For example,

class List {
public:
// ...
friend ostream& operator<<(ostream&, const List&);
private:
Link* _head;
friend class Iterator;
};

3.5. Operations

In the declaration, do not omit the argument names of member functions, e.g. don't use

class String {
public:
String remove(int, int); // NO
// ...
};

Which int does what?

Do not use inlines in class declarations.

class Polygon {
public:
void set(int n, Point x) { _vertex.set(n, x); } // NO
// ...
};

Instead, use

class Polygon {
public:
void set(int n, Point x);
};

inline void Polygon::set(int n, Point x)
//
// Purpose: set a vertex of a polygon
//
// Inputs:
// n - position of vertex
// x - vertex
//
{
_vertex.set(n, x);
}

Sure, it is torture to type all this for a trivial function. (They don't call it "strong typing" for nothing...) But there are a number of benefits. It is much easier to revoke the inline attribute. The inline source code doesn't clutter up the public interface. There is room for a decent comment for those functions that require it. For example, the operation above would benefit from an explanation on the behavior if n is larger than the number of vertices of the polygon.

The only exception to this rule is that we permit do-nothing inlines for default constructors and virtual destructors in the class definition.

class Customer {
public:
Customer() {}
// ...
virtual ~Customer() {}
};

3.6. Data fields

In a class, all data fields must be private. Prefix each field name with an underscore (_age). Use accessor and mutator functions to access and change the data.

class Employee {
public:
int age() const;
void set_age( int a ) ;
// ...
private:
int _age;
// ...
};

Use the naming scheme _field, field(), set_field() consistently. (Actually, it is not entirely clear why one would want to set the age of an employee. It would make more sense to store the birthday and compute the age. This kind of design change is of course the reason for insisting on private data.)
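
The design change mentioned above might look like the following sketch. It assumes, for simplicity, that only the birth year is stored and that the caller supplies the current year, so the computed age ignores the month and day. All names are hypothetical.

```cpp
class Employee
//
// Purpose: An employee record that stores a birth year
//          and computes the age on demand (hypothetical design)
//
{
public:
   Employee(int year);               // year - the birth year

   int birth_year() const;
   void set_birth_year(int year);
   int age(int current_year) const;  // computed, not stored

private:
   int _birth_year;
};

Employee::Employee(int year)
   : _birth_year(year)
{}

int Employee::birth_year() const
{
   return _birth_year;
}

void Employee::set_birth_year(int year)
{
   _birth_year = year;
}

int Employee::age(int current_year) const
//
// Purpose: Compute the age from the stored birth year
//
// Inputs:
// current_year - the year for which the age is wanted
//
// Returns: current_year minus the birth year (month and day are ignored)
//
{
   return current_year - _birth_year;
}
```

Because the data field was private from the start, code that only calls the public operations does not need to know whether an age or a birth year is stored.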


4. Operations and functions

4.1. Header comment

Each operation and function has a comment of the following form.

//
// Purpose: explanation
//
// Inputs:
// argument 1 - explanation
// argument 2 - explanation
// ...
//
// Returns: explanation of return value
//
// Comments: pre- and postconditions, exceptions,
// etc.
//

The Purpose comment is required, except for constructors, destructors, assignment operators, field accessors and very trivial functions that are adequately described by their return value. The Inputs comment is omitted if the function takes no arguments. The Returns comment is omitted for void functions. No purpose and return value comments are necessary for main. Explain the command line arguments in the Inputs section.

static long dat2jul(int d, int m, int y)
//
// Purpose: Convert calendar date into Julian day
//
// Inputs:
// d - the day
// m - the month
// y - the year
//
// Returns: The Julian day number that begins at
// noon of the given calendar date.
//
// Comments: This algorithm is from Press et al.,
// Numerical Recipes in C, 2nd ed.,
// Cambridge University Press 1992
//
{
// ...
}

Reference arguments that have no well-defined incoming value but are used to hold a result must be tagged as OUT.

istream& operator>>(istream& is, Date& date)
//
// Purpose: Read a date from a stream
//
// Inputs:
// is - the stream
// date (OUT) - the date that is read in
// Returns: is
//
{
// ...
}

4.2. Declaration attributes

Every function must have a return type. Use void for procedures.

Every accessor operation must be declared as const.

Every global function must be declared static unless its declaration is exported to the header file.

4.3. Parameters

Parameter names should be explicit, especially if they are integers or boolean.

Customer Bank::remove(int i, bool b); // huh?
Customer Bank::remove(int teller, bool display); // OK

Of course, for very generic functions, short names may be very appropriate.

For each array, pointer or reference argument, use const if the argument is not modified by the function. The function is presumed to modify all non-const pointer, reference and C array arguments.

void Mailbox::add(Message& m); // will modify m
void Mailbox::add(const Message& m); // won't modify m

Never use pointers to denote C arrays. Pointer parameters are presumed to point to a single object.

// e points to a single object (maybe derived from Employee)
void find(const Employee* e, int idnum);
// e is a C array of objects (of exact type Employee)
void find(const Employee e[], int idnum);

Do not use pointers for reference arguments. Use references instead.

void swap(int* px, int* py); // NO
void swap(int& px, int& py); // OK

Do not use pointers to avoid the cost of call by value. If

int Bank::find(Customer c);

is considered too inefficient, use a constant reference instead.

int Bank::find(const Customer& c);

Of course, if all objects of the type are located on the heap, and you always access them with pointers, then a pointer parameter is ok. We only object to taking the address of a stack object in the call.

Do not write procedures (void functions) that return exactly one answer through a pointer or reference. Instead, make it into a return value.

void Bank::find(Customer c, bool& found); // NO
bool Bank::find(Customer c); // OK

Of course, if the function computes more than one value, some must be returned through reference arguments.
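
The two cases can be put side by side in a small sketch. The function names sum and min_max are hypothetical. The first computes exactly one answer and returns it; the second computes two and therefore uses reference arguments tagged as OUT.

```cpp
int sum(const int a[], int n)
//
// Purpose: Add up the elements of an array
//
// Inputs:
// a - a C array of integers
// n - the number of elements in a
//
// Returns: the sum of a[0] ... a[n - 1]
//
{
   int total = 0;
   for (int i = 0; i < n; i++) {
      total += a[i];
   }
   return total;
}

void min_max(const int a[], int n, int &smallest, int &largest)
//
// Purpose: Find the smallest and the largest element of an array
//
// Inputs:
// a - a C array of integers
// n - the number of elements in a, at least 1
// smallest (OUT) - the smallest element
// largest (OUT) - the largest element
//
{
   smallest = a[0];
   largest = a[0];
   for (int i = 1; i < n; i++) {
      if (a[i] < smallest) {
         smallest = a[i];
      }
      if (a[i] > largest) {
         largest = a[i];
      }
   }
}
```

Note that the array is passed as const int a[] in both functions, since neither modifies it.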

4.4. Function length

As a rule of thumb, functions should not be longer than 30 lines of code. The function header, comments, blank lines and lines containing only braces are not included in this count. For operations of classes, this limitation is rarely a problem. Functions that parse input with a long switch statement or if/else may end up being much longer, but then keep each branch to 10 lines or less.

4.5. Constructors

It is considered good style to write constructors so that all data members are constructed outside the {...} and only non-trivial actions are placed inside. In other words, use colon initialization whenever possible. For example,

Date::Date(int d, int m, int y)
//
// Purpose: Constructor for the date class which accepts
// day, month and year as arguments.
//
// Inputs:
// d - the day
// m - the month
// y - the year
//
: _day(d), _month(m), _year(y)
{
assert_precond(valid());
}

5. Variables

5.1. Comments

Every local variable, with the exception of really self-explanatory names and boring loop counters, must be commented when declared. Every global variable must be commented, without exception.

int nfont = 0; // the number of fonts currently loaded

5.2. Initialization

Every variable must be explicitly initialized whenever possible.

int nfont = ft_load(); //the number of fonts currently loaded

It is almost always possible to do this in C++, by declaring the variable just before it is to be used for the first time. Remember that variable declarations and statements can be freely mixed.

Move variables to the innermost block in which they are needed.

while( ... ) {
int b = f();
// ...
}

This is considered good style, much better than declaring all variables at the beginning of the function.

5.3. Pointers & References

We recommend placing the * or the & with the variable, not the type e.g.

Shape *p;
Shape &r;

not

Shape* p;
Shape& r;

5.4. Global variables

Global variables are those declared outside functions. When declared as static, they can be read and modified by all functions in the current module (hundreds of lines of code). When exported, they can be read and modified by all functions in the program (many thousands of lines of code). This is unreasonable in practice. There is a simple strategy for minimizing global variables: Group related variables into classes.

Don't use

int win_top, win_left, win_bot, win_right, cur_row, cur_col;

Instead, use

class Window {
// ...
private:
int win_top, win_left, win_bot, win_right;
int cur_row, cur_col;
};

Window display_win;

Prime candidates for grouping are arrays of the same length. Don't use

int charwidth[NCHAR];
char* pixels[NCHAR];
unsigned char charcode[NCHAR];

Instead, use

class BitmapChar {
// ...
private:
int _width;
char* _pixels;
unsigned char _code;
};

BitmapChar display_font[FT_NCHAR];

In fact, it makes sense to go one step further and declare a type BitmapFont.

class BitmapFont {
// ...
private:
String _name;
Array<BitmapChar> _characters;
};

BitmapFont display_font;

In C++, it is often possible to replace global variables by shared class variables.

class Date {
// ...
private:
int _day, _month, _year;
static const int _days_per_month[12];
};

const int Date::_days_per_month[12] = { 31, 28, ..., 31 };

This is considered good programming practice.
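
A complete, compilable variant of this idea is sketched below. The class name Calendar and the operation days_in_month are hypothetical, and the table ignores leap years.

```cpp
class Calendar
//
// Purpose: Look up the length of a month (leap years are ignored)
//
{
public:
   enum { MONTHS_PER_YEAR = 12 };

   static int days_in_month(int month);   // month runs from 1 to 12

private:
   // shared by all objects of the class, invisible outside it
   static const int _days_per_month[MONTHS_PER_YEAR];
};

const int Calendar::_days_per_month[Calendar::MONTHS_PER_YEAR] =
   { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };

int Calendar::days_in_month(int month)
//
// Purpose: Return the number of days in a month
//
// Inputs:
// month - the month, from 1 (January) to 12 (December)
//
// Returns: the number of days in that month of a non-leap year
//
{
   return _days_per_month[month - 1];
}
```

The table is still a single shared object, but only the operations of Calendar can read it.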


6. Constants

6.1. Constant definitions

In C++, do not use #define to define constants.

#define NFONT 20 // DON'T

Use const instead.

const int FT_NFONT = 20; // the maximum number of fonts

6.2. Zero

Should you use '\0' and NULL, or just plain 0? We leave this to you. It makes sense to use the "strongly typed" explicit symbols instead of 0. On the other hand, it is well within the spirit of C++ to use 0 as an overloaded symbol to denote the "right zero" value in various contexts.

6.3. Enumerations

Use enum to define a number of related constants.

enum Color {BLUE = 1, GREEN = 2, RED = 4};

Do not use a sequence of const or #define!

Avoid global enumerations. Whenever possible, make enumerations local to a class.

class Date {
public:
enum Weekday {SUN, MON, TUE, WED, THU, FRI, SAT};
// ...
private:
// ...
};

The programmer refers to these constants as Date::SUN, Date::MON etc. The names SUN, MON etc. don't clutter up the global name space. That avoids nasty conflicts with other enumerations.

enum Workstation {SUN, DEC, HP, IBM};
enum Exam {SAT, GRE, GMAT};
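
The following sketch shows the three enumerations living side by side once each is local to a class. The enclosing classes Host and Admission are hypothetical and exist only to provide a scope, as is the function is_weekend.

```cpp
class Date {
public:
   enum Weekday { SUN, MON, TUE, WED, THU, FRI, SAT };
};

class Host {          // hypothetical class
public:
   enum Workstation { SUN, DEC, HP, IBM };
};

class Admission {     // hypothetical class
public:
   enum Exam { SAT, GRE, GMAT };
};

bool is_weekend(Date::Weekday day)
//
// Purpose: Test whether a weekday falls on the weekend
//
// Inputs:
// day - the weekday
//
// Returns: true for Date::SAT and Date::SUN
//
{
   return day == Date::SAT || day == Date::SUN;
}
```

Here Date::SUN and Host::SUN, as well as Date::SAT and Admission::SAT, are distinct names, so none of the declarations conflict.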

It is often useful to have the compiler keep track of the total number of items:

enum Tokentype {
NIL,
// ...
FRACTION,
ROOT,
MATRIX,
TABLE,
// add new token types above this line
TK_NTOKENTYPE
};

Now TK_NTOKENTYPE keeps track of the number of token types defined. Be sure to include the comment, directing the person maintaining the code to insert new types before the counter.
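
One common use of such a counter is to size a table with one entry per enumeration constant. The sketch below uses a shortened, hypothetical version of the enumeration; the table token_name and the function name_of are assumptions made for the example.

```cpp
enum Tokentype {
   NIL,
   FRACTION,
   ROOT,
   MATRIX,
   TABLE,
   // add new token types above this line
   TK_NTOKENTYPE
};

// printable name of every token type, indexed by Tokentype
const char *const token_name[TK_NTOKENTYPE] = {
   "nil", "fraction", "root", "matrix", "table"
};

const char *name_of(Tokentype type)
//
// Purpose: Look up the printable name of a token type
//
// Inputs:
// type - a token type other than TK_NTOKENTYPE
//
// Returns: a zero-terminated name
//
{
   return token_name[type];
}
```

When a new token type is inserted above the counter, the array grows with it, and the person maintaining the code only needs to add the matching name.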

In C++, it is common to use enum to define constants used inside classes. For example,

class Stack {
// ...
private:
enum { SSIZE = 20 };
int _item[SSIZE];
int _stackptr;
//...
};

The seemingly more rational

class Stack {
//
private:
const int SSIZE = 20; // NO
int _item[SSIZE];
int _stackptr;
//...
};

is a syntax error.

6.4. Magic Numbers

A magic number is an integer constant embedded in code, without a constant definition. Using magic numbers makes code amazingly difficult to maintain. They are strictly outlawed. Even the most reasonable cosmic constant is going to change one day. You think there are 365 days per year? Your customers on Mars are going to be pretty unhappy about your silly prejudice. Make a constant

const int DAYS_PER_YEAR = 365;

so you can easily cut a Martian version without trying to find all the 365's, 364's, 366's, 367's etc... in your code. By the way, the device

const int THREE_HUNDRED_AND_SIXTY_FIVE = 365;

is counterproductive and frowned upon.

You can take it for granted that all array lengths will change at least twice (oops, meant to say, at least N_BUFLEN_CHANGE times) during the lifetime of the code. Declare the length of each fixed size array as an individual constant. Don't use the same constant for two arrays unless for compelling logical reasons the two arrays have to have the same length.


7. Control flow

7.1. The if statement

Avoid the "if...if...else" trap. The code

if( ... )
if( ... ) ...;
else {
...;
...;
}

will not do what the indentation level suggests, and it can take hours to find such a bug. Always use an extra pair of {...} when dealing with "if...if...else":

if( ... ) {
if( ... ) ...;
else ...;
} // {...} not necessary but they keep you out of trouble

if( ... ) {
if( ... ) ...;
} // {...} are necessary
else ...;
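
To make the difference concrete, here are the two braced forms written out as complete functions. The function names and the result codes 0, 1 and 2 are hypothetical and only serve to show which branch was taken.

```cpp
int else_with_inner_if(bool outer, bool inner)
//
// Purpose: The else belongs to the inner if
//
// Returns: 1 if both tests hold, 2 if only outer holds, 0 otherwise
//
{
   int result = 0;
   if (outer) {
      if (inner) {
         result = 1;
      }
      else {
         result = 2;
      }
   }
   return result;
}

int else_with_outer_if(bool outer, bool inner)
//
// Purpose: The else belongs to the outer if
//
// Returns: 1 if both tests hold, 2 if outer fails, 0 otherwise
//
{
   int result = 0;
   if (outer) {
      if (inner) {
         result = 1;
      }
   }
   else {
      result = 2;
   }
   return result;
}
```

Without braces, the compiler always chooses the first reading, whatever the indentation suggests.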

7.2. The for statement

Do not use the for loop for weird constructs (even though [Kernighan & Ritchie] contains an occasional abuse). Constructs such as

for (r = i = 0; s[i]; r += s[i++] - '0') r *= 10;

are not tolerated. Make it into a while loop. That way, the sequence of instructions is much clearer.

r = i = 0;
while (s[i])
{
r *= 10;
r += s[ i++ ] - '0';
}
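
Wrapped into a complete function, the while version might read as follows. The function name digits_to_int is hypothetical, and the sketch assumes that the string contains nothing but decimal digits.

```cpp
int digits_to_int(const char s[])
//
// Purpose: Convert a string of decimal digits to an integer
//
// Inputs:
// s - a zero-terminated C array containing only '0' ... '9'
//
// Returns: the value of the digits, or 0 for an empty string
//
{
   int r = 0;
   int i = 0;
   while (s[i] != '\0') {
      r *= 10;
      r += s[i] - '0';
      i++;
   }
   return r;
}
```

Each statement in the loop body now does one thing, and the increment of i is no longer hidden inside an expression.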

Only use for loops when a variable runs from somewhere to somewhere with some constant increment/decrement.

for (i = a.low(); i <= a.high(); i++)
a[i].print();

A for loop traversing a linked list can be neat and intuitive:

for (l.reset(); !l.at_end(); l.next())
l.current().print();

7.3. The switch statement

The switch statement should be laid out as follows:

switch (x) {
case a:
...
...
break;
case b:
...;
return;
case c:
case d:
case e:
...;
...;
break;
default:
...;
...;
break;
}

Every branch must end in a break or return statement, even the last one. "Fall through" is not permitted. You don't have to use default, but if you do, put it at the end of the switch.
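
A small function laid out according to these rules is sketched below. The class Week and the function is_workday are hypothetical.

```cpp
class Week {
public:
   enum Day { MON, TUE, WED, THU, FRI, SAT, SUN };
};

bool is_workday(Week::Day day)
//
// Purpose: Test whether a day of the week is a working day
//
// Inputs:
// day - the day of the week
//
// Returns: false for Saturday and Sunday, true otherwise
//
{
   bool workday = false;
   switch (day) {
   case Week::SAT:
   case Week::SUN:
      workday = false;
      break;
   default:
      workday = true;
      break;
   }
   return workday;
}
```

The labels SAT and SUN share one branch in the same way as the stacked case labels in the layout above; every branch that contains statements ends in a break, including the last one.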

7.4. Nonlinear control flow

Do not use the break (except in switch), continue and goto statements under any circumstances. It is always possible to avoid a break in a loop by adding another boolean variable. It makes for a clearer loop because you can easily verify the loop invariant.

The loop

for (i = a.low(); i <= a.high(); i++)
if(a[i] == x) // found it
break; // DON'T
if (i > a.high()) // never found
...

can easily be rewritten as

bool found = false;
i = a.low();
while (!found && i <= a.high()) {
if( a[i] == x ) found = true;
else i++;
}
if (!found)
...

The break and continue commands are not stable. Adding more loops or statements at the end of a loop changes their meaning. goto is for yacc output, not for code produced by humans.

The only nonlinear control flow statements that you may use are return and throw.
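
The flag-based search above can be packaged as a self-contained function operating on a C array. The names find_index and NOT_FOUND are hypothetical.

```cpp
const int NOT_FOUND = -1;   // returned when no element matches

int find_index(const int a[], int n, int x)
//
// Purpose: Find the first occurrence of a value in an array
//
// Inputs:
// a - a C array of integers
// n - the number of elements in a
// x - the value to look for
//
// Returns: the index of the first element equal to x,
// or NOT_FOUND if there is none
//
{
   bool found = false;
   int i = 0;
   while (!found && i < n) {
      if (a[i] == x) {
         found = true;
      }
      else {
         i++;
      }
   }

   int result = NOT_FOUND;
   if (found) {
      result = i;
   }
   return result;
}
```

The loop condition states everything that can end the loop, so no break is needed and the loop invariant (x does not occur in a[0] ... a[i - 1]) is easy to check.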


8. Lexical issues

8.1. Naming convention

The following rules specify when to use upper- and lowercase letters in identifier names.

  1. All variable and function names and all structure members are in lowercase (maybe with an occasional upperCase letter or under_score in the middle). For example, font_cache.
  2. All #defined constants and macros, all enum constants and all inline const are in uppercase (maybe with an occasional UNDER_SCORE). For example, CACHE_SIZE.
  3. All typedefs and all C++ struct, class, union, enum names start with uppercase and are followed by lowercase (maybe with an occasional UpperCase letter or Under_score.) For example, FontCache.

Names should be explicit. Names should be reasonably long and descriptive. Use font_pointer instead of fp. No drppng f vwls. Local variables that are fairly routine can be short (ch, i) as long as they are really just boring holders for an input character, a loop counter etc... And, do not use ptr, p, pntr, pnt, p2 for five pointer variables in your function. Surely these variables all have a specific purpose and can be named to remind the reader of it (e.g. pcur, pnext, pprev, pnew, pret...).

8.2. Indentation and white space

Use tab stops every 2 or 3 columns. The default of 8 columns wastes screen real estate.

Use blank lines freely to separate logically separate parts of a function.

Separate functions by comments like // ....

Use a blank space around every binary operator and ?:.

ch = i < IO_BUF_SIZE ? curChar : 0; // GOOD
ch=i<IO_BUF_SIZE?curChar:0;//BAD

Leave a blank space after (and not before) each comma, semicolon, and keyword.

if (x == 0) ...

Every line must fit on 80 columns. If you must break a statement, add an indentation level for the continuation:

a[n] = ..................................................
+ .................;

Start the indented line with an operator (if possible). If this happens in an "if" or "while", be sure to brace in the next statement, even if there is only one.

if(.........................................................
&& ..................
|| .......... )
{
...
}

If it wasn't for the braces, it would be hard to visually separate the continuation of the condition from the statement to be executed.

8.3. Braces

Closing braces must line up vertically with the starting statement for a block of code.

while (i < n) {
a[i].print(cout);
i++;
}

or

while (i < n)
{
a[i].print(cout);
i++;
}

However, do not do any of the following as it just makes the code harder to read:

while (i < n) { a [i].print(cout); i++; }

or

while (i < n) { a [i].print(cout);
i++;
}

or

while (i < n)
{ a [i].print(cout);
i++;
}

Comments after the starting brace are okay.

8.4. The preprocessor

It has been said that all usage of the preprocessor points to deficiencies in the programming language. C++ fixes some deficiencies. Do not use #define for constants or macros--use const, enum and inline instead.
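
One reason for preferring inline over a macro is sketched below. SQUARE_MACRO is deliberately written without parentheses to show a classic pitfall of textual substitution; it, CACHE_SIZE and square are hypothetical names.

```cpp
// DON'T: the macro pastes its argument into the expression as text,
// so SQUARE_MACRO(1 + 2) expands to 1 + 2 * 1 + 2
#define SQUARE_MACRO(x) x * x

// a typed constant instead of "#define CACHE_SIZE 64"
const int CACHE_SIZE = 64;   // hypothetical cache size

inline int square(int x)
//
// Purpose: Compute the square of an integer
//
// Inputs:
// x - the number to square
//
// Returns: x times x
//
{
   return x * x;
}
```

With these definitions, SQUARE_MACRO(1 + 2) evaluates to 5, whereas square(1 + 2) evaluates its argument once and yields 9. The inline function also checks the type of its argument, which the macro cannot do.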

Do not use the #define facility to aid your transition from FORTRAN or Pascal, like

#define BEGIN {
#define END }

or

#define begin {
#define end }
#define repeat do {
#define until( x ) } while( !( x ) );

Neat as they may be, these constructs are strictly outlawed. Your fellow programmers have better things to do than play preprocessor with your code.

A legitimate use for the preprocessor is conditional compilation (e.g. #ifdef WIN31 ... #endif).

To comment out a block of code that may itself contain comments, use #if 0 ... #endif. (Recall that C++ comments do not nest.)