C++ Coding Standards

Preface:

This post shares information about coding standards, compiled from the author’s study notes and other sources.
In principle, the coding standards follow Google’s coding standards.



Coding Standards
  • C++ code should be readable.
    • Good Comments:
      • Comments should tell you what’s being done at a high level.
  • Comments are important.
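    • Good comments must be maintained regularly.
    • Without comments, you may run into several problems:
      • It is hard to understand the programmer’s main intent.
      • When something in the code looks odd, is it a bug, or was it written that way on purpose? If I rewrite it, should I keep the odd part?
      • The next programmer will have a hard time taking over the code.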
  • Make the implementation of a function easy to understand when reading the code by:
    • Choosing good variable names. Don’t skimp on names. Make them meaningful.
    • Using concise code that’s easy to read (this also means reasonably sized functions, lambdas, if-blocks, loops, etc.).
    • Using standard library implementations when available.

Memory & Performance Basics
  • Explicitly delete copy constructors and assignment operators from your class when you first write it unless you know you need them.
  • Consider compilation costs when you have inline functions in headers.
    • Does the function require you to include another header? If the function won’t be a performance bottleneck, put it in the .cpp file instead, even if it’s small, and include the required headers there.
    • Every #include in your header file slows down compilation for everybody that uses your code.
    • Remove #include lines when you remove code that needed the header file.
      • Unless there is a real need, always put functions in the .cpp file.
      • This keeps compilation from slowing down.
  • Every .h file should stand on its own.
    • When you include a .h file, you shouldn’t need to include others to make it work.
    • The order that you include .h files in your source code shouldn’t matter.

Memory/Performance
  • Try to avoid defining a lambda or a container in a loop.
    • Container destructors will invoke delete at the end of every loop and then reallocate memory at the start of the next loop.
    • Lambdas are also destroyed at the end of each loop and recreated at the start of the next loop.
  • When assigning a variable from a function that returns a reference, accept it as a reference; otherwise, you’re making a copy.
    • Ex.
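The difference between accepting a returned reference as a reference and as a copy can be sketched as follows. The Netlist class and its members are hypothetical names chosen for illustration; they are not from the original post.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class, for illustration only.
class Netlist {
public:
    void addName(std::string name) { names_.push_back(std::move(name)); }

    // Returns a reference to the internal vector; nothing is copied here.
    const std::vector<std::string> &names() const { return names_; }

private:
    std::vector<std::string> names_;
};

// Shows which kind of declaration aliases the data inside `netlist`.
void compareReferenceAndCopy(const Netlist &netlist) {
    const auto &byReference = netlist.names();  // binds to the member: no copy
    const auto byCopy = netlist.names();        // plain auto: copies every string

    assert(&byReference == &netlist.names());   // same object
    assert(&byCopy != &netlist.names());        // separate object
    assert(byCopy == byReference);              // same contents, stored twice
}
```

The only difference between the two declarations is the `&`, which is why this copy is easy to introduce by accident.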
  • Initialize all class variables that don’t have a default constructor.
  • Use initializer lists in a constructor to set specific variables in the class, letting the rest be set to default.
    • Setting the value within the constructor body actually causes 2 initializations of the variable.
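A small sketch of the "2 initializations" point: the hypothetical Tracked type below counts default constructions and assignments so the two constructor styles can be compared. All names are assumptions made for this example.

```cpp
#include <cassert>

// Hypothetical member type that counts how it gets set up.
struct Tracked {
    static inline int defaultCtors = 0;
    static inline int assignments = 0;
    int value = 0;

    Tracked() { ++defaultCtors; }
    explicit Tracked(int v) : value(v) {}
    Tracked &operator=(int v) {
        value = v;
        ++assignments;
        return *this;
    }
};

class InitList {
public:
    explicit InitList(int v) : member_(v) {}  // one step: constructed with v
    int value() const { return member_.value; }

private:
    Tracked member_;
};

class BodyAssign {
public:
    explicit BodyAssign(int v) { member_ = v; }  // two steps: default-construct, then assign
    int value() const { return member_.value; }

private:
    Tracked member_;
};

// Compares the counters before and after constructing each class.
void compareInitialization() {
    const int defaultsBefore = Tracked::defaultCtors;
    const int assignsBefore = Tracked::assignments;

    const InitList a(1);  // neither counter moves
    assert(Tracked::defaultCtors == defaultsBefore);
    assert(Tracked::assignments == assignsBefore);

    const BodyAssign b(2);  // both counters move
    assert(Tracked::defaultCtors == defaultsBefore + 1);
    assert(Tracked::assignments == assignsBefore + 1);
    assert(a.value() == 1 && b.value() == 2);
}
```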
  • Don’t optimize prematurely.
    • If your code accounts for only 0.001% of the total run time, making it faster but ugly and hard to maintain is counterproductive.
    • Optimize when:
      • You have performance metrics that show that your code is a hot spot whose optimization will impact runtime.
      • You know that your code is complete (i.e. you won’t have to add a new feature that blows up all of the work that you put in place optimizing the code for a specific case).
    • Trust the compiler to optimize for you if you write clean, efficient code:
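      • Compilers keep getting better and better, which is why new versions keep coming out!
      • You won’t maintain the same piece of code forever, so avoid getting stuck in an optimization loop.
      • Overly exotic optimizations may keep future compilers from optimizing your code.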
  • Being correct but not necessarily optimal on all cases is better than getting the wrong answer faster.
    • But, you can still be smart/efficient while writing good code.
  • Prefer vectors over any other container *if it makes sense*.
    • Vectors are generally the fastest of the containers and use the least memory.
    • Use reserve() to avoid allocations and deallocations as the vector content grows.
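To illustrate the effect of reserve(), the sketch below counts how often a vector's buffer moves while elements are appended. The function name countReallocations is hypothetical, and the exact count without reserve() depends on the standard library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns how many times the vector's buffer changed address while
// appending `count` elements.
std::size_t countReallocations(std::size_t count, bool reserveFirst) {
    assert(count > 0);
    std::vector<int> values;
    if (reserveFirst) {
        values.reserve(count);  // a single allocation up front
    }

    std::size_t reallocations = 0;
    const int *previous = values.data();
    for (std::size_t i = 0; i < count; ++i) {
        values.push_back(static_cast<int>(i));
        if (values.data() != previous) {  // the buffer was reallocated
            ++reallocations;
            previous = values.data();
        }
    }
    return reallocations;
}
```

With reserveFirst set, the function returns 0; without it, the buffer is reallocated several times as the vector grows.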
  • Group related data structures together instead of a different container for each.
    • You save on container overhead and memory allocations, and may even gain from data packing.
  • Make your class’s move constructor noexcept if possible.
    • Use it *if* your move constructor will not throw an exception.
    • Exceptionally useful for complex classes stored in a vector.
    • When a vector grows, it moves all instances to a new memory location if the move constructor is noexcept.
    • If the move constructor is not noexcept, the copy constructor is used instead (when one is available), which can be a LOT more expensive depending on the class.
    • Further reading: “c++ 從vector擴容看noexcept應用場景” (noexcept use cases as seen through vector growth).
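The vector behavior described above can be observed with two hypothetical element types that differ only in whether the move constructor is noexcept. This is a sketch for illustration; the type and function names are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Move constructor is noexcept: the vector may move elements when it grows.
struct NoexceptMove {
    static inline int copies = 0;
    NoexceptMove() = default;
    NoexceptMove(const NoexceptMove &) { ++copies; }
    NoexceptMove(NoexceptMove &&) noexcept {}
};

// Move constructor may throw: the vector falls back to copying.
struct ThrowingMove {
    static inline int copies = 0;
    ThrowingMove() = default;
    ThrowingMove(const ThrowingMove &) { ++copies; }
    ThrowingMove(ThrowingMove &&) {}
};

// Appends `count` elements without reserving, forcing the vector to regrow.
template <typename T>
void growVector(std::size_t count) {
    assert(count > 1);
    std::vector<T> items;
    for (std::size_t i = 0; i < count; ++i) {
        items.emplace_back();  // constructs in place; only regrowth relocates
    }
}
```

After growing both vectors, NoexceptMove::copies stays at 0, while ThrowingMove::copies is greater than 0 because every regrowth copied the existing elements.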
  • Avoid conditionals in performance-sensitive code.
    • Sometimes you can rewrite your code with a little thought to avoid the conditional.
    • Sometimes the conditional doesn’t matter.
  • Avoid double searches in a STL container.
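    • With STL containers, supplying a default value or relying on a built-in member function can remove a lot of steps.
    • Ex1: supply a default value.
    • Ex2: .erase() already performs the existence check itself.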
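The two cases above might look like the following sketch. The original examples are not available, so the word-count scenario and all function names here are assumptions.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <unordered_map>

using WordCounts = std::unordered_map<std::string, int>;

// Double search: find() looks the key up, then operator[] looks it up again.
void countWordTwice(WordCounts &counts, const std::string &word) {
    if (counts.find(word) == counts.end()) {
        counts[word] = 0;
    }
    counts[word] += 1;
}

// Single search: operator[] value-initializes a missing entry to 0.
void countWord(WordCounts &counts, const std::string &word) {
    ++counts[word];
}

// Single search: erase(key) returns the number of elements removed (0 or 1),
// so no find() is needed beforehand.
bool forgetWord(WordCounts &counts, const std::string &word) {
    return counts.erase(word) > 0;
}

// Both counting styles produce the same map; erase reports what it did.
void compareLookups() {
    WordCounts twice;
    WordCounts once;
    for (const char *word : {"a", "b", "a"}) {
        countWordTwice(twice, word);
        countWord(once, word);
    }
    assert(twice == once);
    assert(forgetWord(once, "a"));
    assert(!forgetWord(once, "a"));
}
```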
  • Avoid duplicating work that the STL already does for you.
  • Use STL algorithms – don’t reinvent the wheel.
    • std::count_if(), std::find_if(), std::minmax(), etc
    • The compiler usually optimizes these algorithms extensively for maximum performance
    • Code is already debugged for you
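As a brief illustration of replacing hand-written loops, the sketch below uses std::count_if() and std::minmax_element() (the iterator-range relative of std::minmax()). The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Counts negative values without a hand-written loop.
std::size_t countNegatives(const std::vector<int> &values) {
    return static_cast<std::size_t>(
        std::count_if(values.begin(), values.end(), [](int v) { return v < 0; }));
}

// Returns max - min of a non-empty vector, finding both in a single pass.
int spread(const std::vector<int> &values) {
    assert(!values.empty());
    const auto [minIt, maxIt] = std::minmax_element(values.begin(), values.end());
    return *maxIt - *minIt;
}
```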

Readability
  • Avoid code duplication.
    • If you see a few lines of code that are duplicated except for minor changes, consider a lambda to have a single copy of the code.
    • For more than a few lines, use a member function or function in unnamed namespace.
    • Templates and lambdas can also be used to reduce code duplication.
    • Copying code means that any bug fixes/enhancements also need to be copied. Easy to miss something.
  • Declare variables close to the code that uses them. Use scoping to manage a variable’s lifetime unless it impacts performance (more on that later).
    • Declare a variable just before the code that uses it.
    • This shortens the variable’s lifetime.
  • Make your code clear when you have a pointer variable.
    • Consider adding “Ptr” to the name for both regular pointers and smart pointers.
      • Example: netlistPtr, nodePtr
    • When assigning to a raw pointer variable, use the “*” syntax for added clarity.
    • Accept smart pointers returned from a function as a reference or const reference (otherwise, copying a unique pointer is a compiler error).
    • Examples:
      • const auto &netlistPtr = netlist();   // Gets UniquePtr<Netlist> reference
      • const auto *netlistPtr = netlist();    // Gets actual pointer to a Netlist instance
      • const auto netlist = netlist();          // No idea what this is; maybe a copy
  • If the class gets too big, it’s probably doing too many things.
    • Classes should do a limited number of things well.
    • You should be able to write unit tests for every API in a class. If you can’t easily do this, it’s too big.
      • We will be switching at some point to a new way of building and testing code. Don’t assume that what exists now will exist forever.
      • How hard would it be for you to test each class exhaustively?
  • If a class gets too big, take a step back and consider refactoring.
    • Can you break big pieces out into their own class?
    • Are there discrete pieces of code that don’t really interact with other code in the class?
    • How many lines of code are there in the class?
    • Is some of the code just there as a good holding spot for something that just uses data from the class?
  • The more code in a class that’s outside the main purpose of the class, the harder it is to maintain, debug and, eventually, refactor.
  • Using std::tuple<> for small, isolated regions of code for fewer than 4 values is fine.
    • Tuples become difficult when used as a primary data structure. Use a class or struct instead with named variables.
    • If somebody has to search the code to find out what’s stored in a tuple, the tuple is the wrong choice.
  • However, tuples can be great for nested comparisons!
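One way tuples help with nested comparisons is std::tie(), which compares several fields lexicographically in one expression. The GridPoint class below is a hypothetical example, not something from the original post.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical class ordered by layer, then row, then column.
class GridPoint {
public:
    GridPoint(int layer, int row, int col) : layer_(layer), row_(row), col_(col) {}

    // std::tie builds tuples of references, so nothing is copied and the
    // comparison needs no nested if-statements.
    bool operator<(const GridPoint &other) const {
        return std::tie(layer_, row_, col_) <
               std::tie(other.layer_, other.row_, other.col_);
    }

private:
    int layer_;
    int row_;
    int col_;
};

void checkOrdering() {
    assert(GridPoint(1, 2, 3) < GridPoint(1, 2, 4));     // tie broken by column
    assert(GridPoint(1, 9, 9) < GridPoint(2, 0, 0));     // layer dominates
    assert(!(GridPoint(1, 2, 3) < GridPoint(1, 2, 3)));  // equal points
}
```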
  • Avoid struct unless it’s private within a class.
    • Breaks C++ encapsulation.
    • When used for private code within a class, you’ve limited any potential issues with maintenance.
  • Avoid highly nested if-statements.
    • Makes reading/debugging difficult.
    • If within a loop, consider using break/continue statements to get out of a nested-if situation.
    • Remember: If you have problems reading it, you’ll probably have problems debugging it.
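The continue-based rewrite can be sketched as follows, with a nested version kept for comparison. The Node record and both function names are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical record, for illustration only.
struct Node {
    bool enabled;
    bool visited;
    int weight;
};

// Nested version: the real work sits three levels deep.
int sumNested(const std::vector<Node> &nodes) {
    int total = 0;
    for (const auto &node : nodes) {
        if (node.enabled) {
            if (!node.visited) {
                if (node.weight > 0) {
                    total += node.weight;
                }
            }
        }
    }
    return total;
}

// Flat version: each uninteresting case exits early with continue.
int sumFlat(const std::vector<Node> &nodes) {
    int total = 0;
    for (const auto &node : nodes) {
        if (!node.enabled) continue;
        if (node.visited) continue;
        if (node.weight <= 0) continue;
        total += node.weight;
    }
    return total;
}

void compareStyles() {
    const std::vector<Node> nodes{
        {true, false, 5}, {false, false, 7}, {true, true, 3}, {true, false, -2}};
    assert(sumNested(nodes) == 5);
    assert(sumFlat(nodes) == 5);
}
```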
  • Avoid excessive parentheses and blank lines.
    • Counting parentheses is a good way to make a mistake.
    • Excessive blank lines make it hard to read code.

Maintenance/Writing Correct Code Quickly
  • Use the standard library or Boost library before writing your own algorithm.
    • Your code must be tested and maintained. The other libraries do that for you.
    • Other people will be familiar with the standard versions and not know why you used your own instead.
    • Read about the library every 6 months to refamiliarize yourself with what’s there.
  • Use const wherever possible.
    • Lots of people believe that this is for the compiler, but it’s really not. It’s for the programmer as a form of documentation and error avoidance.
  • Avoid code duplication.
    • Many times you’ll have the same code for a const and non-const version.
    • Don’t duplicate code. Use std::as_const() and const_cast<>.
    • There are 3 common situations when dealing with return values: pointers, references, and values. For pointers and references, you may need to perform a const_cast<T>() on the return value from the const version of the function. For returned values, however, no casting is required.
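A minimal sketch of the three return-value situations, assuming a hypothetical Samples class (the original snippet is not available). The non-const overloads forward to the const ones, which is safe here because the object they are called on is itself non-const.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical wrapper class, for illustration only.
class Samples {
public:
    explicit Samples(std::vector<int> values) : values_(std::move(values)) {}

    // Reference: the const version holds the logic...
    const int &at(std::size_t index) const {
        assert(index < values_.size());
        return values_[index];
    }
    // ...and the non-const version forwards to it, then casts const away.
    int &at(std::size_t index) {
        return const_cast<int &>(std::as_const(*this).at(index));
    }

    // Pointer: same pattern with a pointer cast.
    const int *find(int value) const {
        for (const auto &v : values_) {
            if (v == value) {
                return &v;
            }
        }
        return nullptr;
    }
    int *find(int value) {
        return const_cast<int *>(std::as_const(*this).find(value));
    }

    // Value: returned by copy, so a single const function serves both cases.
    std::size_t size() const { return values_.size(); }

private:
    std::vector<int> values_;
};
```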
  • Avoid non-deterministic code and know when it can happen.
    • Iterating over unordered_map or unordered_set.
    • Using pointer values for sorting or hashing operations.
    • Using the thread pool to store values or even perform math operations.
  • Keep functions small. They should do just one thing and do it well with no side effects.
  • Don’t rely on somebody calling your code in a certain order. That’s a recipe for a bug in the future.
    • Public APIs to a class should start in a known good state and end in a known good state – ALWAYS
    • Private APIs don’t have this restriction because they are private. Since they can’t be called externally, it’s OK for a class to be in an invalid state between private calls.
  • Know when to use a class member function vs a free function.
    • If a function doesn’t use private class data, it really doesn’t need to be a member function.
    • Example: std::find() vs a different find function for each container
  • Use RAII (Resource Acquisition Is Initialization – See Stroustrup).
    • Initialize a variable when it’s declared.
      • Types that have a constructor are auto initialized already.
      • Puts variables in a known, valid state
  • Consider using utils::valueRange() for counting loops.
    • Ex. for (const auto nodeId : utils::valueRange(numNodes())) { ... }
    • The type of nodeId is the same as the parameter to utils::valueRange().
    • The nodeId value can be const.
    • Can also specify a starting value instead of the default 0.
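The post does not show how utils::valueRange() is implemented. The following is a minimal sketch of what such a helper might look like, placed in a hypothetical sketch namespace; the real utility may differ, including in the order of its arguments.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Minimal half-open integer range [first, last) for range-based for loops.
template <typename T>
class ValueRange {
public:
    class Iterator {
    public:
        explicit Iterator(T value) : value_(value) {}
        T operator*() const { return value_; }
        Iterator &operator++() {
            ++value_;
            return *this;
        }
        bool operator!=(const Iterator &other) const { return value_ != other.value_; }

    private:
        T value_;
    };

    ValueRange(T first, T last) : first_(first), last_(last) {
        assert(!(last < first));  // an inverted range would never terminate
    }
    Iterator begin() const { return Iterator(first_); }
    Iterator end() const { return Iterator(last_); }

private:
    T first_;
    T last_;
};

// Counts from 0 up to (but not including) `count`; the loop variable has type T.
template <typename T>
ValueRange<T> valueRange(T count) {
    return ValueRange<T>(T{0}, count);
}

// Counts from `first` up to (but not including) `last`.
template <typename T>
ValueRange<T> valueRange(T first, T last) {
    return ValueRange<T>(first, last);
}

}  // namespace sketch
```

Because the loop variable takes its type from the argument, changing the type returned by a function such as numNodes() changes the loop type with it.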
  • Use named variables instead of passing literals into functions.
  • Think about how many places you’ll need to change something if required, and write your code to minimize those changes.
    • What are the odds that your assumptions on size/type become invalid over the next 20 years? How hard will it be to debug and fix?
    • Ex. You don’t think you’ll ever have more than 2 billion elements, so you use int32_t in your class variable declaration and throughout your code for loops.
    • Suddenly, every place you used int32_t has to be changed.
    • Possible solutions for this case:
      • Use decltype() and/or using to define a type.
      • Use utils::valueRange() in loops which maintains type correctness implicitly.
      • Use size_t for anything associated with counting (guaranteed to be large enough to hold the size of any object).
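The using/decltype() suggestion might be applied as in the sketch below, where the integer type is named once and derived everywhere else. NodeTable, NodeId, and lastNodeId are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical class, for illustration only.
class NodeTable {
public:
    // The single place to change if 32 bits ever stops being enough.
    using NodeId = std::int32_t;

    NodeId addNode() {
        weights_.push_back(0);
        return static_cast<NodeId>(weights_.size() - 1);
    }
    NodeId numNodes() const { return static_cast<NodeId>(weights_.size()); }

private:
    std::vector<int> weights_;
};

// Callers derive the type instead of repeating int32_t.
NodeTable::NodeId lastNodeId(const NodeTable &table) {
    assert(table.numNodes() > 0);
    using Id = decltype(table.numNodes());  // follows NodeTable automatically
    static_assert(std::is_same_v<Id, NodeTable::NodeId>);
    const Id last = table.numNodes() - 1;
    return last;
}
```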
  • Avoid friend classes and functions, except for:
    • Embedded classes/structs.
    • Embedded functions.
    • Counterexample: graph of nodes and edges.

Maintenance
  • Don’t Use Friends - Graph Counterexample
  • There are limited exceptions to every rule, but they should be exceptions that happen very infrequently, and only when no better solution exists
  • Consider a graph
    • A graph is composed of nodes and edges.
    • You can’t create a node or edge outside of a graph.
      • Removing an edge from a node may mean that housekeeping in a node also needs to be performed.
      • A graph may be the management mechanism needed to guarantee that the caller can’t create inconsistencies between nodes and edges
    • This type of situation is very rare in general
  • Don’t let code outside of the class modify class members directly
    • When you enter and exit a public member function of a class, the state should be valid.
    • If something outside of the class modifies the class instance’s data, that guarantee is broken.
    • Can be impossible to track down where that change might happen.
  • Pass const iterators and ranges back from member functions, not the actual data structure.
    • If you ever change the data structure within the class, passing an iterator will (probably) still work well.
    • If you pass a data structure, callers may become reliant on the type of data structure returned, making it impossible to change the code in the future due to these dependencies.
    • Return const iterators instead of non-const iterators (see previous bullet).
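A sketch of exposing const iterators rather than the container itself. NodeList and its members are hypothetical; callers only depend on the ConstIterator alias, so the underlying container could be swapped later.

```cpp
#include <cassert>
#include <vector>

// Hypothetical class, for illustration only.
class NodeList {
public:
    // Callers depend on this alias, not on std::vector itself.
    using ConstIterator = std::vector<int>::const_iterator;

    void addNode(int id) {
        assert(id >= 0);
        nodeIds_.push_back(id);
    }

    // Read-only access: callers can iterate but cannot modify the members.
    ConstIterator begin() const { return nodeIds_.cbegin(); }
    ConstIterator end() const { return nodeIds_.cend(); }

private:
    std::vector<int> nodeIds_;
};

// Works with any NodeList implementation that provides begin()/end().
int sumNodeIds(const NodeList &nodes) {
    int total = 0;
    for (const int id : nodes) {
        total += id;
    }
    return total;
}
```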

Taking Advantage of New C++ Features


Writing Code Optimally
  • Use auto for lambda parameters instead of declaring the type.
    • Lambdas should be small and localized.
    • Using auto makes it easy to maintain the lambda in the event that a parameter type changes OR if you want to use the same lambda on multiple types (just like a template).
  • Use structured bindings.
    • A really nice feature of C++17 that makes code more readable.
    • C++17 adds a new kind of declaration that can improve readability and productivity: structured bindings. This section simply provides a quick overview. The purpose of a structured binding is to declare and initialize multiple variables simultaneously.
    • Examples are shown below.
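The original examples are not available, so the following is a short sketch of three common uses of structured bindings: unpacking a returned pair, iterating over a map, and reading the result of emplace(). The function names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Returns the quotient and remainder of a / b as a pair.
std::pair<int, int> divide(int a, int b) {
    assert(b != 0);
    return {a / b, a % b};
}

// Names the key and value instead of using .first and .second.
int totalCount(const std::map<std::string, int> &counts) {
    int total = 0;
    for (const auto &[word, count] : counts) {
        total += count;
    }
    return total;
}

// emplace() returns {iterator, inserted}; both are bound in one declaration.
bool addIfMissing(std::map<std::string, int> &counts, const std::string &word) {
    const auto [position, inserted] = counts.emplace(word, 0);
    return inserted;
}
```

A caller can unpack the pair the same way, for example `const auto [quotient, remainder] = divide(17, 5);`.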
  • Use inline static or constexpr static variables to avoid having to declare them in a .cpp file.
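A sketch of both forms, assuming C++17 and a hypothetical Router class. Neither static member needs a separate definition in a .cpp file.

```cpp
#include <cassert>

// Hypothetical class; everything here can live entirely in the header.
class Router {
public:
    static constexpr int kMaxLayers = 16;  // compile-time constant, implicitly inline
    static inline int instanceCount = 0;   // C++17 inline variable, modifiable

    Router() { ++instanceCount; }
};

// Constructing two routers raises the shared counter by two.
void countRouters() {
    const int before = Router::instanceCount;
    const Router first;
    const Router second;
    assert(Router::instanceCount == before + 2);
    static_assert(Router::kMaxLayers == 16);
}
```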
  • Avoid new/delete. Use smart pointers and containers instead.
    • Eliminates most issues with double deletion and unreachable memory.
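As an illustration, the sketch below stores polymorphic objects in a vector of std::unique_ptr, so no delete appears anywhere. The Shape hierarchy and function names are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical class hierarchy, for illustration only.
class Shape {
public:
    virtual ~Shape() = default;
    virtual int sides() const = 0;
};

class Triangle : public Shape {
public:
    int sides() const override { return 3; }
};

class Square : public Shape {
public:
    int sides() const override { return 4; }
};

using ShapeList = std::vector<std::unique_ptr<Shape>>;

// Ownership moves to the caller; every Shape is freed when the vector dies.
ShapeList makeShapes() {
    ShapeList shapes;
    shapes.push_back(std::make_unique<Triangle>());
    shapes.push_back(std::make_unique<Square>());
    return shapes;
}

int totalSides(const ShapeList &shapes) {
    int total = 0;
    for (const auto &shapePtr : shapes) {
        assert(shapePtr != nullptr);
        total += shapePtr->sides();
    }
    return total;
}
```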

Writing Code Optimally – std::move()
  • Know how to use std::move() to move data structures instead of copying them.
    • Moving an instance does *not* destroy the instance.
    • The instance needs to be in a state that the destructor can operate on BUT that’s the only guarantee about the state of the instance after a move (the destructor will work on it).
    • Moving a class/struct of POD values just copies the values.
    • If you move a raw pointer value in your move constructor, set the original to nullptr; otherwise you’ll have 2 instances pointing to the same raw pointer.
    • If you’re writing a container, you must still destroy an instance after a move.
      • The instance may have side effects upon construction/destruction that must be accounted for.
      • Ex. counting active instances.
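The raw-pointer rule can be sketched with a hypothetical Buffer class. A raw pointer is used here only to demonstrate the rule; std::exchange() reads the old pointer and resets the source to nullptr in one step.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical owning buffer, for illustration only.
class Buffer {
public:
    explicit Buffer(std::size_t size) : data_(new int[size]()), size_(size) {}
    ~Buffer() { delete[] data_; }  // safe after a move: data_ is nullptr

    Buffer(const Buffer &) = delete;
    Buffer &operator=(const Buffer &) = delete;

    // Take the pointer and leave the source empty but destructible.
    Buffer(Buffer &&other) noexcept
        : data_(std::exchange(other.data_, nullptr)),
          size_(std::exchange(other.size_, 0)) {}

    Buffer &operator=(Buffer &&other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = std::exchange(other.data_, nullptr);
            size_ = std::exchange(other.size_, 0);
        }
        return *this;
    }

    int &at(std::size_t index) {
        assert(index < size_);
        return data_[index];
    }
    std::size_t size() const { return size_; }
    bool ownsData() const { return data_ != nullptr; }

private:
    int *data_;
    std::size_t size_;
};
```

Without the reset to nullptr, both the source and the destination would call delete[] on the same pointer when they are destroyed.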

Writing Code Optimally - Lambdas
  • Lambda functions are useful for:
    • Callbacks for standard algorithms or certain functions.
    • Eliminating code duplication.
  • Previously people would write a separate class, with captured variables, and a function call operator.
    • Internally, that’s what the compiler converts the lambda function into.
    • You can understand the cost if you think about it that way.
    • Every captured variable costs time/memory unless the compiler can optimize it away.
  • Putting a lambda in a loop means constructing/destroying it on each iteration.
  • Lambda functions were meant to be small.
    • The idea was that you didn’t need to write a separate class outside of the scope of your call so that you had to look at two places to see what’s going on.
    • If your code is more than a few lines, lambdas are NOT the right thing to use; use a function instead.
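To make the cost model concrete, the sketch below shows a hand-written function object next to the equivalent lambda. The compiler generates a closure class similar to GreaterThan for the lambda; all names here are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// What people used to write: a class holding the captured value,
// with a function call operator.
class GreaterThan {
public:
    explicit GreaterThan(int threshold) : threshold_(threshold) {}
    bool operator()(int value) const { return value > threshold_; }

private:
    int threshold_;  // corresponds to the lambda's captured variable
};

std::ptrdiff_t countAboveWithFunctor(const std::vector<int> &values, int threshold) {
    return std::count_if(values.begin(), values.end(), GreaterThan(threshold));
}

// The lambda version keeps the logic at the call site.
std::ptrdiff_t countAboveWithLambda(const std::vector<int> &values, int threshold) {
    return std::count_if(values.begin(), values.end(),
                         [threshold](int value) { return value > threshold; });
}

void compareCallables() {
    const std::vector<int> values{1, 5, 9};
    assert(countAboveWithFunctor(values, 3) == 2);
    assert(countAboveWithLambda(values, 3) == 2);
}
```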

Misc


General Rules of Thumb
  • When a parameter is to be changed, pass it as a pointer, not a reference.
    • You can look at a header file to see what the intent is with a pointer; for a reference, you need to read the code.
  • Don’t use protected inheritance, data, or member functions (from Stroustrup’s The C++ Programming Language, 3rd Edition).
  • Know when to use asserts and when to use exceptions.
    • Use bAssert() to validate incoming parameters or state of class instance at the *start* of a function.
    • Asserts are not part of the build delivered to a customer, but are now turned on in local optimized builds.
    • Use exceptions for issues in the body of your code.
    • You can catch exceptions (1) if you can recover from the error or (2) want to handle the error and then rethrow the exception.
  • Don’t use macros if you can avoid them.
    • Most macros can be replaced by inline or template code.
    • No need to worry about side effects or conflicting macros.
  • Use friend functions in a class instead of external APIs.
    • No namespace issues.
    • No G++9.3 vs G++12 compilation issues.
  • Consider a print() member function for a class (no parameters, void return), defined in .cpp file.
    • Can be used from within GDB via call command.
    • By putting it in the .cpp file, it won’t be optimized out.
  • The inline keyword doesn’t mean what you think it does.
    • It used to give the compiler a hint as to what should be compiled in place where the function was called.
    • Now compilers determine on their own which functions to inline, whether they are in the .cpp file or the .h file.
    • The inline keyword now tells the compiler and linker that a function may be defined in more than one translation unit, as long as the definitions are identical.
    • Template code does not need “inline” at all UNLESS there is a full (explicit) template specialization in a header file.
    • Small private functions that are only called from other functions in the .cpp file can also be put into the .cpp file without runtime penalty as a result.
  • Use static_assert() to capture error conditions at compile time instead of at runtime.
    • Obviously only applies for certain types of checks.
    • Extremely useful with templates and lambda functions that use auto parameters.
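A small sketch of a compile-time check in a template. The function roundUpToMultiple is a hypothetical example; it assumes `value` is non-negative.

```cpp
#include <cassert>
#include <type_traits>

// Rounds `value` up to the next multiple of `multiple`.
// Calling it with a non-integer type fails at compile time with a clear
// message instead of producing a wrong answer at runtime.
template <typename T>
T roundUpToMultiple(T value, T multiple) {
    static_assert(std::is_integral_v<T>, "roundUpToMultiple requires an integer type");
    assert(multiple > 0);  // runtime check: this depends on the argument value
    return ((value + multiple - 1) / multiple) * multiple;
}
```

A call such as roundUpToMultiple(2.5, 1.0) is rejected by the compiler, while the check on `multiple` has to stay a runtime assert because it depends on the value passed in.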
  • Try to avoid auto as a return type for a function (but lambdas are fine).
    • Not knowing the return type makes it difficult for others to use the return value in a container or to know how expensive your function may be.
  • Use an anonymous/unnamed namespace in a .cpp file to hold utility free functions used in a class.
    • Don’t use the static keyword (à la C).
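A sketch of a .cpp file that keeps a helper in an unnamed namespace. The file contents and function names are hypothetical.

```cpp
#include <cassert>
#include <string>

namespace {

// Visible only inside this .cpp file; no `static` keyword is needed.
std::string trimTrailingSlash(std::string path) {
    if (!path.empty() && path.back() == '/') {
        path.pop_back();
    }
    return path;
}

}  // namespace

// Part of the public interface; this one would be declared in the header.
std::string normalizePath(const std::string &path) {
    assert(!path.empty());
    return trimTrailingSlash(path);
}
```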

Misc

  • Learn, learn, learn
    • Stay up to date with C++ as the language evolves.
    • Understand existing standard library and Boost components that you can use instead of reinventing the wheel.
    • When doing code reviews, point out to others when they can use a better approach or newer approach.
    • Use new C++ constructs over old ones. Product could be around for decades, so no reason to use C++98 instead of C++17.
  • Write your code as if somebody else is going to have to maintain and/or enhance it (odds are, somebody else will at some point in time).
    • Is it easy to understand?
    • What can be done to make it better?
    • What do you want somebody else’s code to look like if *you* have to maintain/enhance it?

