Not A Grammar: Design for Debug: An Overview

Software is never perfect, and it's never done. Bugs always creep in, and tend to reveal themselves towards the end of the development cycle when multiple modules are integrated together. In the inevitable debugging sessions that follow, do you expect a long hellish grind of guesswork, tweaks, and recompiles? Or, are there fast debugging methods, and ways to preemptively ease the troubleshooting?

This is where "Design for Debug" (DFD) comes in. The idea is to follow a set of coding practices that preemptively makes code easier to understand and debug. The analogous practice in the chip design world is Design for Testing (DFT). I strongly believe that DFD is a key component that separates 10x (or 100x) programmers from the pack. Not only does it make them more productive; it makes their colleagues more productive since anyone on the team can more easily debug issues in their code.

State Observation

A bug must first be observable. Given some inputs, the program must produce an incorrect or unexpected output (including crashing). The primary observation methods are:

inspecting program-visible inputs and outputs

includes: keyboard/mouse/microphone input, input/output files, graphical and audio output
Pros: no changes to the program needed
Cons: no intermediate state available

trace logging

Pros: shows the evolution of a narrow slice of intermediate state over time
Cons: requires modification to multiple locations in code; observation limited to state that was selected ahead of time

interactive debugger

Pros: shows the full program state at one instant in time
Cons: time consuming to observe evolution of state over time

Notice how logging and debugging complement each other, in a type of space-time tradeoff. It's up to us to choose the right tool for the job, depending on the situation.

Do you need to compare data values across 1000s of iterations?

add a logging statement
why? it would take forever to single-step 1000s of iterations

Are you trying to determine which codepath was taken through some extremely branchy code?

step through it with a debugger
why? it would be tedious to manually add a logging statement to 10s of code branches

Want to know why a function or call-chain is failing?

set a breakpoint on every suspected "return" and "throw" statement, see which one hits
why? it's faster to set a handful of breakpoints than to manually add logging and wait for a recompile

Do you need to observe control flow over large swaths of code over time?

set breakpoints and "binary search" the error, or fall back to adding logging statements
Breakpoints can save you recompiles, but if you keep missing the error location, then thorough logging may be the only recourse.

Of course all of the above presumes that the code is structured to permit all of the above easily.

Coding Guidelines

Each of the following deserve their own article with code samples, so this is more of a braindump.

Place intermediate values in named variables.

Avoid long chained calls and excessive function composition within a statement.
You can easily see variables in a debugger. You can't see compiler generated temporaries or return values.
If you want to log a value, make sure it's in a variable so that the computation and log statement don't go out-of-sync.

Calculate the return value into a local variable before returning it.

To set a breakpoint, you need a line of code to set it on.
To see the return value, you need a line where it's visible.

Clearly separate error and success paths.

Write separate return or throw statements for errors.
Try to end the success path at the bottom of the function.
Avoid single exit point style; in C++ it's an anti-pattern.

Historically the "Single Entry Single Exit" style was introduced to add structure to assembly, FORTRAN, and C, focused on resource cleanup. It's obsolete in C++, which has exceptions and generally requires RAII for cleanup. See here.
Herb Sutter would agree.
RVO is not a defensible argument for single-exit sytle. There are two cases:

The function is perf-critical, and thus made inline-able. Inline context renders RVO moot since the backend compiler can do far better.
The function is not perf-critical, and thus placed in a separate compilation unit. Now RVO is in play, but since we already assumed that micro-optimization is a non-goal, we can prioritize maintainability and debuggability over RVO.

Use consistent naming across the codebase.

If all variables of type Foo are written "Foo* pFoo" then a single watch expression of "pFoo" will reveal that state in every scope. Any sub-expressions you want to see in every context (like "pFoo->pBar->thing") will also re-evaluate cleanly in every context.
On the flip side, the use of different variable names in every context requires retyping every expression/sub-expression in every context. Wastefully inefficient with zero benefit. (the speed of thought reduced to repeated manual operations)

Make it easy to copy/paste sub-expressions from source code into the debugger's expression evaluator.

Avoid chain-calling accessor functions. Many debuggers cannot evaluate those functions, either as an inherent limitation, or because inline accessors have no out-of-line instance in the executable that is callable at debug time.
Pass raw pointers and references to functions whenever possible. (as opposed to smart pointer references)

Avoid function composition at callsites. to make it easier to step into any function call.

"Step in" must first go into every argument-producing call before finally going into the "actual function" of interest.
Anyone who's used a debugger has done the tedious "step in dance" -- "step in", "step out", "step in", "step out", "step in" ... until the function-of-interest is reached. Of course this is very error prone since one extra "step out" skips over the function-of-interest, which usually requires a program restart (i.e. a complete waste of time).
In Visual Studio there is a workaround that lets you mark functions as "NoStepInto". It's useful for leaf functions like getters/setters and conversion operators, but little else.
Strongly related to placing intermediate values in named variables.

Avoid excessive wrapper and forwarding functions, to make it easier to step into the "actual logic".

Basically, if you can eliminate wrapper layers, then do so. Kind of obvious.

Use data types that show up cleanly and easily in the debugger.

Prefer structs with named fields over std::pair and std::tuple. std::pair and std::tuple implementations use deep inheritance, either to process a type-list, or to implement the empty base class optimization. In a debugger, base classes are displayed similar to member variables by introducing a nesting level; to see all elements of a tuple requires excessive child expansion (i.e. many mouse-clicks).
Consider using virtual functions instead of "bags of function pointers" or std::function. Debuggers can deduce the derived-most type of any class/struct that has a vtable, and show the full contents automatically. Code browsing works better on virtuals too (e.g. "find all references" to jump between pure virtual declaration and its derived implementations). Note that superior implementations of std::function would obviate this choice, by storing internal state in a debuggable fashion (at least in debug builds!).
Other container types suffer this problem; even with custom debugger visualizers the result isn't always workable.

boost::optional and boost::flat_map

The problems aren't unique to Visual Studio; gdb has similar issues.
Debugger visualizers are not a substitute for debuggable data structures. Visualizers show values but not addresses, making it impossible to compare local pointers/references to elements, and impossible to plug into a memory viewer. Child elements of visualized data trees cannot be evaluated as watch expressions.

When iterating arrays and std::vectors, prefer integer indexing over alternatives.

The debugger shows array and vector contents by integer index.
Typical debug sessions involve setting a breakpoint in a loop, looking for a bad input element. Upon discovery of the bad element, you need to know its index to "work backwards" towards the code that generated it.
In the debugger, if you already know the bad array element (by inspecting array contents after a loop), you can then set a conditional breakpoint in the loop body on "index == BAD_INDEX" and rerun the program. Of course this requires an "index" variable in the code.
Iterators are not debuggable! You cannot trivially determine the array index from a vector iterator. Conditional breakpoints cannot perform iterator arithmetic like "itr - vec._Head" in most debuggers either.
Range-based for-loops use compiler-generated iterators, which reduces to the same problem. Only use range-based for-loops for trivially correct code.
Avoid std::for_each. In almost every case, it's a worse way of writing a range-based for-loop.
Cons: This approach results in slightly more verbose code.
Minor pro: index-based loops are friendlier to the optimizer by easing alias analysis. Index-based loops are auto-vectorizable, whereas iterator-based loops are not. This includes the use of pointers as iterators. See here and here.

Give functions unique names whenever possible.

Don't excessively overload on terse names. Or even reuse function names across namespaces.

It's difficult to set breakpoints when you don't know which function is going to be called.
Reused names are not grep-friendly.
Even sophisticated code-browsing tools often fail to find the right overload for a given callsite. They may resort to providing tons of options, causing trial & error navigation or missed breakpoints.

Of course the guideline doesn't apply when required by interface constraints (compile time, virtual, and meta-programming name requirements are all "interface requirements"). It only behooves us to choose semi-unique names per interface, to avoid collisions across different interfaces.

Concluding Remarks

If you've ever debugged through someone else's code and gotten upset, it's very likely because they violated any of the guidelines listed above. You may not have even realized why you were upset, other than "this code is sh*t and so difficult to work with". Hopefully by building up the list of patterns and anti-patterns, we can all consciously recognize and promote a software style that makes everyone more productive. Remember, software development is both art and science.

Not A Grammar

Monday, January 2, 2017

Design for Debug: An Overview

State Observation

Coding Guidelines

Concluding Remarks

No comments:

Post a Comment