C++ Coding Standards
Preface:
This post shares information on coding standards, compiled from the author's study notes and other sources.
In principle, these coding standards follow Google's coding standards.
Coding Standards
- C++ code should be readable.
- Good Comments:
- Comments should tell you what’s being done from a high-level.
- Comments are important.
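- Good comments must be maintained regularly.
- Without comments, you may run into several problems:
- It is hard to understand the programmer's main intent.
- When the code looks odd, is it a bug or is it intentional? If I rewrite it, should I keep the odd part?
- The next programmer will have a hard time taking over.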
- Understanding how a function is implemented when reading the code is done by:
- Choosing good variable names. Don’t skimp on names. Make them meaningful.
- Using concise code that’s easy to read (also means reasonably sized functions, lambdas, if-blocks, loops, etc).
- Using standard library implementations when available.
Memory & Performance Basics
- Explicitly delete copy constructors and assignment operators from your class when you first write it unless you know you need them.
- Having them often leads to copies without knowing it, impacting memory and performance.
- Explicitly declare the copy constructor and assignment operator as private and mark them = delete.
- Further reading: C++11 explicit control of defaulted functions: delete and default
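A minimal sketch of this rule, assuming a hypothetical NodeTable class that is not part of the original notes. The deleted members are declared public here because that tends to give clearer compiler diagnostics; placing them in the private section as described above behaves the same way.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// NodeTable is a hypothetical class used only for illustration.
class NodeTable {
public:
    NodeTable() = default;

    // Copying is disabled until there is a demonstrated need for it.
    NodeTable(const NodeTable &) = delete;
    NodeTable &operator=(const NodeTable &) = delete;

    // Moving remains available, so instances can still be handed off cheaply.
    NodeTable(NodeTable &&) noexcept = default;
    NodeTable &operator=(NodeTable &&) noexcept = default;

    void add(int id) { ids_.push_back(id); }
    std::size_t size() const { return ids_.size(); }

private:
    std::vector<int> ids_;
};

// An accidental copy is now a compile error instead of a silent cost.
static_assert(!std::is_copy_constructible_v<NodeTable>);
static_assert(!std::is_copy_assignable_v<NodeTable>);
static_assert(std::is_nothrow_move_constructible_v<NodeTable>);
```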
- Consider compilation costs when you have inline functions in headers.
- Does the function require you to include another header? If the function won’t be a performance bottleneck, put it in the .cpp file instead, even if it’s small, and include the required headers there.
- Every #include in your header file slows down compilation for everybody that uses your code.
- Remove #include lines when you remove code that needed the header file.
- Unless there is a specific need, put every function in the .cpp file.
- This avoids slowing down compilation.
- Every .h file should stand on its own.
- When you include a .h file, you shouldn’t need to include others to make it work.
- The order that you include .h files in your source code shouldn’t matter.
Memory/Performance
- Try to avoid defining a lambda or a container in a loop.
- Container destructors will invoke delete at the end of every iteration and then reallocate memory at the start of the next iteration.
- Lambdas are also destroyed at the end of each iteration and recreated at the start of the next iteration.
- When assigning a variable from a function that returns a reference, accept it as a reference; otherwise, you’re making a copy.
- Ex.
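A small sketch of the difference, assuming a hypothetical Netlist class with a names() accessor (neither appears in the original notes). It compares addresses to show which binding aliases the member and which one is a copy.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Netlist is a hypothetical class used only for illustration.
class Netlist {
public:
    explicit Netlist(std::vector<std::string> names) : names_(std::move(names)) {}

    // Returns a reference to internal storage; nothing is copied here.
    const std::vector<std::string> &names() const { return names_; }

private:
    std::vector<std::string> names_;
};

// Returns true when binding by reference aliases the member and plain auto does not.
inline bool referenceAvoidsCopy(const Netlist &netlist) {
    const auto &byReference = netlist.names();  // aliases the member: no copy
    const auto byValue = netlist.names();       // auto drops the reference: full copy
    return &byReference == &netlist.names() && &byValue != &netlist.names();
}

inline void example() {
    const Netlist netlist({"clk", "reset", "data"});
    assert(referenceAvoidsCopy(netlist));
}
```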
- Initialize all class variables that don’t have a default constructor.
- Use initializer lists in a constructor to set specific variables in the class, letting the rest be set to default.
- Setting the value within the constructor body actually causes 2 initializations of the variable.
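The double initialization can be made visible with a member type that counts what happens to it. This is only a sketch: Tracked, UsesInitializerList, and AssignsInBody are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Tracked is a hypothetical member type that records how it gets initialized.
struct Tracked {
    static inline int defaultConstructions = 0;
    static inline int assignments = 0;

    Tracked() { ++defaultConstructions; }
    explicit Tracked(std::string text) : value(std::move(text)) {}
    Tracked &operator=(const std::string &text) {
        value = text;
        ++assignments;
        return *this;
    }

    std::string value;
};

class UsesInitializerList {
public:
    // The member is constructed once, directly from the argument.
    explicit UsesInitializerList(const std::string &name) : name_(name) {}

private:
    Tracked name_;
};

class AssignsInBody {
public:
    // The member is default-constructed first, then assigned in the body.
    explicit AssignsInBody(const std::string &name) { name_ = name; }

private:
    Tracked name_;
};

inline void example() {
    const UsesInitializerList direct("n1");
    assert(Tracked::defaultConstructions == 0 && Tracked::assignments == 0);

    const AssignsInBody twoStep("n1");
    assert(Tracked::defaultConstructions == 1 && Tracked::assignments == 1);
}
```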
- Don’t optimize prematurely.
- If your code accounts for only 0.001% of total runtime, making it faster but ugly and hard to maintain is counterproductive.
- Optimize when:
- You have performance metrics that show that your code is a hot spot whose optimization will impact runtime.
- You know that your code is complete (i.e. you won’t have to add a new feature that blows up all of the work that you put in place optimizing the code for a specific case).
- Trust the compiler to optimize for you if you write clean, efficient code:
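- Compilers keep getting better and better, which is why new versions keep coming out!
- You will not maintain the same piece of code forever, so avoid getting trapped in an optimization loop.
- Overly exotic optimizations may prevent future compilers from optimizing your code.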
- Being correct but not necessarily optimal on all cases is better than getting the wrong answer faster.
- But, you can still be smart/efficient while writing good code.
- Prefer vectors over any other container *if it makes sense*.
- Vectors are generally the fastest container and use the least memory.
- Use reserve() to avoid allocations and deallocations as the vector content grows.
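To illustrate the effect of reserve(), the sketch below counts how often a vector's capacity changes while values are appended. The function name countReallocations is hypothetical and exists only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the capacity changes while appending n values.
inline std::size_t countReallocations(std::size_t n, bool reserveFirst) {
    std::vector<int> values;
    if (reserveFirst) {
        values.reserve(n);  // one allocation up front
        assert(values.capacity() >= n);
    }

    std::size_t reallocations = 0;
    std::size_t lastCapacity = values.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(static_cast<int>(i));
        if (values.capacity() != lastCapacity) {  // the vector had to grow
            ++reallocations;
            lastCapacity = values.capacity();
        }
    }
    return reallocations;
}
```

With the size known in advance, the reserved version never grows during the loop, while the unreserved version grows at least once.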
- Group related data structures together instead of a different container for each.
- You save memory on container overhead and memory allocation, and may even benefit from better data packing.
- Make your class’s move constructor noexcept if possible.
- Use *if* your move constructor will not throw an exception
- Exceptionally useful for complex classes stored in a vector
- When a vector grows, it moves all instances to a new memory location if the move constructor is noexcept
- If the move constructor is not noexcept, the copy constructor is used instead (when one is available), which can be a LOT more expensive depending on the class
- Further reading: C++: noexcept use cases as seen through vector growth
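The following sketch shows the behavior described above by counting copies and moves during one forced reallocation. SafeMove, RiskyMove, Counters, and growOnce are hypothetical names; the observed counts assume a mainstream standard library that relocates elements with move-if-noexcept semantics.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Counters {
    int copies = 0;
    int moves = 0;
};

// Hypothetical element type whose move constructor is declared noexcept.
struct SafeMove {
    explicit SafeMove(Counters *counters) : c(counters) {}
    SafeMove(const SafeMove &other) : c(other.c) { ++c->copies; }
    SafeMove(SafeMove &&other) noexcept : c(other.c) { ++c->moves; }
    Counters *c;
};

// Same type, except that the move constructor is not declared noexcept.
struct RiskyMove {
    explicit RiskyMove(Counters *counters) : c(counters) {}
    RiskyMove(const RiskyMove &other) : c(other.c) { ++c->copies; }
    RiskyMove(RiskyMove &&other) : c(other.c) { ++c->moves; }
    Counters *c;
};

// Fills a vector without reallocating, then forces one reallocation
// and reports how the existing elements were transferred.
template <typename T>
Counters growOnce(std::size_t count) {
    Counters counters;
    std::vector<T> items;
    items.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        items.emplace_back(&counters);  // constructed in place
    }
    assert(counters.copies == 0 && counters.moves == 0);
    items.reserve(items.capacity() + 1);  // forces relocation of existing elements
    return counters;
}
```

Calling growOnce<SafeMove>(4) is expected to report four moves, while growOnce<RiskyMove>(4) is expected to report four copies.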
- Avoid conditionals in performance-sensitive code.
- Sometimes you can rewrite your code with a little thought to avoid the conditional.
- Sometimes the conditional doesn’t matter.
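- Avoid double searches in an STL container.
- With STL containers, supplying a default value or using a built-in member function can eliminate many steps.
- Ex1: supply a default value.
- Ex2: .erase() already checks whether the key exists.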
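Both examples can be sketched with a word-count map. The names CountMap, countWord, removeWord, and countOf are hypothetical and serve only to illustrate the single-lookup pattern.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using CountMap = std::unordered_map<std::string, int>;

// Ex1: operator[] value-initializes a missing int to 0, so no find() is needed first.
inline void countWord(CountMap &counts, const std::string &word) {
    ++counts[word];
}

// Ex2: erase(key) reports how many entries it removed, so no find() is needed first.
inline bool removeWord(CountMap &counts, const std::string &word) {
    return counts.erase(word) > 0;
}

// When the value itself is needed, search once and reuse the iterator.
inline int countOf(const CountMap &counts, const std::string &word) {
    const auto found = counts.find(word);
    return found != counts.end() ? found->second : 0;
}

inline void example() {
    CountMap counts;
    countWord(counts, "node");
    countWord(counts, "node");
    assert(countOf(counts, "node") == 2);
    assert(countOf(counts, "edge") == 0);
    assert(removeWord(counts, "node"));
    assert(!removeWord(counts, "node"));
}
```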
- Avoid duplicating work that STL already does for you.
- Use STL algorithms – don’t reinvent the wheel.
- std::count_if(), std::find_if(), std::minmax(), etc
- The compiler usually optimizes these algorithms extensively for maximum performance
- Code is already debugged for you
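A short sketch of what this looks like in practice. The helper names are hypothetical, and std::minmax_element is used here as the range-based relative of std::minmax.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Counts the negative values without a hand-written loop.
inline std::ptrdiff_t countNegatives(const std::vector<int> &values) {
    return std::count_if(values.begin(), values.end(), [](int v) { return v < 0; });
}

// Returns the first value above the threshold, or fallback if there is none.
inline int firstAbove(const std::vector<int> &values, int threshold, int fallback) {
    const auto found = std::find_if(values.begin(), values.end(),
                                    [threshold](int v) { return v > threshold; });
    return found != values.end() ? *found : fallback;
}

// Returns {smallest, largest}; the input must not be empty.
inline std::pair<int, int> smallestAndLargest(const std::vector<int> &values) {
    assert(!values.empty());
    const auto [smallest, largest] = std::minmax_element(values.begin(), values.end());
    return {*smallest, *largest};
}
```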
- Avoid code duplication.
- If you see a few lines of code that are duplicated except for minor changes, consider a lambda to have a single copy of the code.
- For more than a few lines, use a member function or function in unnamed namespace.
- Templates and lambdas can also be used to reduce code duplication.
- Copying code means that any bug fixes/enhancements also need to be copied. Easy to miss something.
- Declare variables close to the code that uses the variable. Use scoping to manage a variable’s lifetime unless it impacts performance (more on that later).
- Declare a variable immediately before its first use.
- This shortens the variable's lifetime.
- Make your code clear when you have a pointer variable.
- Consider adding “Ptr” to the name for both regular pointers and smart pointers.
- Example: netlistPtr, nodePtr
- When assigning to a raw pointer variable, use the “*” syntax for added clarity.
- A smart pointer that a function returns by reference should be bound to a reference or constant reference (copying a unique pointer is a compiler error).
- Examples:
- const auto &netlistPtr = netlist(); // Gets UniquePtr<Netlist> reference
- const auto *netlistPtr = netlist(); // Gets actual pointer to a Netlist instance
- const auto netlist = netlist(); // No idea what this is; maybe a copy
- If the class gets too big, it’s probably doing too many things.
- Classes should do a limited number of things well.
- You should be able to write unit tests for every API in a class. If you can’t easily do this, it’s too big.
- We will be switching at some point to a new way of building and testing code. Don’t assume that what exists now will exist forever.
- How hard would it be for you to test each class exhaustively?
- If a class gets too big, take a step back and consider refactoring.
- Can you break big pieces out into their own class?
- Are there discrete pieces of code that don’t really interact with other code in the class?
- How many lines of code are there in the class?
- Is some of the code just there as a good holding spot for something that just uses data from the class?
- The more code in a class that’s outside the main purpose of the class, the harder it is to maintain, debug and, eventually, refactor.
- Using std::tuple<> for small, isolated regions of code for fewer than 4 values is fine.
- Tuples become difficult when used as a primary data structure. Use a class or struct instead with named variables.
- If somebody has to search the code to find out what’s stored in a tuple, the tuple is the wrong choice.
- However, tuples can be great for nested comparisons!
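As an illustration of the nested-comparison case, std::tie builds a tuple of references so that a multi-key ordering becomes a single expression. Pin and its fields are hypothetical names chosen for the sketch.

```cpp
#include <tuple>

// Pin is a hypothetical class; it is ordered by layer, then x, then y.
class Pin {
public:
    constexpr Pin(int layer, int x, int y) : layer_(layer), x_(x), y_(y) {}

    // Tuple comparison is lexicographic, which replaces a chain of nested ifs.
    friend constexpr bool operator<(const Pin &a, const Pin &b) {
        return std::tie(a.layer_, a.x_, a.y_) < std::tie(b.layer_, b.x_, b.y_);
    }

private:
    int layer_;
    int x_;
    int y_;
};

static_assert(Pin(1, 2, 3) < Pin(1, 2, 4));     // decided by y
static_assert(Pin(1, 9, 9) < Pin(2, 0, 0));     // decided by layer
static_assert(!(Pin(1, 2, 3) < Pin(1, 2, 3)));  // equal pins are not less
```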
- Avoid struct unless it’s private within a class.
- Breaks C++ encapsulation.
- When used for private code within a class, you’ve limited any potential issues with maintenance.
- Avoid highly nested if-statements.
- Makes reading/debugging difficult.
- If within a loop, consider using break/continue statements to get out of a nested-if situation.
- Remember: If you have problems reading it, you’ll probably have problems debugging it.
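One way to flatten a nested-if situation inside a loop is to reject uninteresting cases early with continue. The function below is a hypothetical example of that shape.

```cpp
#include <cassert>
#include <vector>

// Sums the positive, even values that do not exceed limit.
inline int sumSmallEvenPositives(const std::vector<int> &values, int limit) {
    int total = 0;
    for (const int value : values) {
        if (value <= 0) continue;      // skip non-positive values
        if (value % 2 != 0) continue;  // skip odd values
        if (value > limit) continue;   // skip values above the limit
        total += value;                // the real work stays at one indent level
    }
    return total;
}

inline void example() {
    const std::vector<int> values{-2, 3, 4, 10, 6};
    assert(sumSmallEvenPositives(values, 8) == 10);  // 4 + 6
}
```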
- Avoid excessive parentheses and blank lines.
- Counting parentheses is a good way to make a mistake.
- Excessive blank lines make it hard to read code.
Maintenance/Writing Correct Code Quickly
- Use the standard library or Boost library before writing your own algorithm.
- Your code must be tested and maintained. The other libraries do that for you.
- Other people will be familiar with the standard versions and not know why you used your own instead.
- Read about the library every 6 months to refamiliarize yourself with what’s there
- Use const wherever possible.
- Lots of people believe that this is for the compiler, but it’s really not. It’s for the programmer as a form of documentation and error avoidance.
- Avoid code duplication.
- Many times you’ll have the same code for a const and non-const version.
- Don’t duplicate code. Use std::as_const<> and const_cast<>
- This code snippet shows how to handle 3 common situations when dealing with return values: pointers, references, and values. For pointers and references, you may need to perform a const_cast<T>() on the return value from the const version of the function. For returned values, however, no casting is required.
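One possible shape for such code, assuming a hypothetical Netlist class; this is a sketch of the pattern, not the author's original snippet. The cast is safe here only because the non-const overload is invoked on an object that is known to be non-const.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Netlist is a hypothetical class used only for illustration.
class Netlist {
public:
    explicit Netlist(std::vector<int> ids) : ids_(std::move(ids)) {}

    // Reference: the const overload holds the logic; the non-const one forwards and casts.
    const int &idAt(std::size_t index) const { return ids_.at(index); }
    int &idAt(std::size_t index) {
        return const_cast<int &>(std::as_const(*this).idAt(index));
    }

    // Pointer: same pattern, casting the returned pointer.
    const int *find(int id) const {
        for (const int &candidate : ids_) {
            if (candidate == id) return &candidate;
        }
        return nullptr;
    }
    int *find(int id) { return const_cast<int *>(std::as_const(*this).find(id)); }

    // Value: one const overload serves both callers, so no cast is required.
    std::size_t size() const { return ids_.size(); }

private:
    std::vector<int> ids_;
};

inline void example() {
    Netlist netlist(std::vector<int>{1, 2, 3});
    netlist.idAt(0) = 10;  // non-const overload
    int *found = netlist.find(2);
    assert(found != nullptr);
    *found = 20;
    assert(std::as_const(netlist).idAt(0) == 10);
    assert(netlist.find(2) == nullptr && netlist.size() == 3);
}
```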
- Avoid non-deterministic code and know when it can happen.
- Iterating over unordered_map or unordered_set.
- Using pointer values for sorting or hashing operations.
- Using the thread pool to store values or even perform math operations.
- Keep functions small. They should do just one thing and do it well with no side effects.
- Don’t rely on somebody calling your code in a certain order. That’s a recipe for a bug in the future.
- Public APIs to a class should start in a known good state and end in a known good state – ALWAYS
- Private APIs don’t have this restriction because they are private. Since they can’t be called externally, it’s OK for a class to be in an invalid state between private calls.
- Know when to use a class member function vs a free function.
- If a function doesn’t use private class data, it really doesn’t need to be a member function.
- Example: std::find() vs a different find function for each container
- Use RAII (Resource Acquisition Is Initialization – See Stroustrup).
- Initialize a variable when it’s declared.
- Types that have a constructor are auto initialized already.
- Puts variables in a known, valid state
- Consider using utils::valueRange() for counting loops.
- Ex. for (const auto nodeId : utils::valueRange(numNodes())) { ... }
- The type of nodeId is the same as the parameter to utils::valueRange().
- The nodeId value can be const.
- Can also specify a starting value instead of the default 0.
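The notes do not show how utils::valueRange() is implemented, so the following is only a guess at a minimal equivalent, placed in a hypothetical sketch namespace to avoid suggesting it is the real utility. It illustrates how the loop variable can inherit its type from the argument.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {

// Half-open integer range [first, last) usable in a range-based for loop.
template <typename T>
class ValueRange {
public:
    class Iterator {
    public:
        constexpr explicit Iterator(T value) : value_(value) {}
        constexpr T operator*() const { return value_; }
        constexpr Iterator &operator++() { ++value_; return *this; }
        constexpr bool operator!=(const Iterator &other) const { return value_ != other.value_; }

    private:
        T value_;
    };

    // An inverted range is treated as empty so the loop always terminates.
    constexpr ValueRange(T first, T last) : first_(first), last_(last < first ? first : last) {}
    constexpr Iterator begin() const { return Iterator(first_); }
    constexpr Iterator end() const { return Iterator(last_); }

private:
    T first_;
    T last_;
};

template <typename T>
constexpr ValueRange<T> valueRange(T count) { return ValueRange<T>(T{0}, count); }

template <typename T>
constexpr ValueRange<T> valueRange(T first, T last) { return ValueRange<T>(first, last); }

}  // namespace sketch

// Sums the ids in [first, last); the loop variable takes its type from the arguments.
constexpr std::uint64_t sumIds(std::uint32_t first, std::uint32_t last) {
    std::uint64_t total = 0;
    for (const auto nodeId : sketch::valueRange(first, last)) {
        static_assert(std::is_same_v<decltype(nodeId), const std::uint32_t>);
        total += nodeId;
    }
    return total;
}

static_assert(sumIds(0, 5) == 10);  // 0 + 1 + 2 + 3 + 4
static_assert(sumIds(2, 5) == 9);   // 2 + 3 + 4
```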
- Use named variables instead of passing literals into functions.
- Think about how many places you’ll need to change something if required, and write your code to minimize those changes.
- What are the odds that your assumptions on size/type become invalid over the next 20 years? How hard will it be to debug and fix?
- Ex. You don’t think you’ll ever have more than 2 billion elements, so you use int32_t in your class variable declaration and throughout your code for loops.
- Suddenly, every place you used int32_t has to be changed.
- Possible solutions for this case:
- Use decltype() and/or using to define a type.
- Use utils::valueRange() in loops which maintains type correctness implicitly.
- Use size_t for anything associated with counting (it is an unsigned type large enough to hold the size of any object, though not necessarily the largest unsigned integer type).
- Avoid friend classes and functions except for:
- Embedded classes/structs.
- Embedded functions.
- Counter example: graph of nodes and edges
- Don’t Use Friends - Graph Counterexample
- There are limited exceptions to every rule, but they should be exceptions that happen very infrequently, and only when no better solution exists
- Consider a graph
- A graph is composed of nodes and edges.
- You can’t create a node or edge outside of a graph.
- Removing an edge from a node may mean that housekeeping in a node also needs to be performed.
- A graph may be the management mechanism needed to guarantee that the caller can’t create inconsistencies between nodes and edges
- This type of situation is very rare in general
- Don’t let code outside of the class modify class members directly
- When you enter and exit a public member function of a class, the state should be valid.
- If something outside of the class modifies the class instance’s data, that guarantee is broken.
- Can be impossible to track down where that change might happen.
- Pass const iterators and ranges back from member functions, not the actual data structure.
- If you ever change the data structure within the class, passing an iterator will (probably) still work well.
- If you pass a data structure, callers may become reliant on the type of data structure returned, making it impossible to change the code in the future due to these dependencies.
- Return const iterators instead of non-const iterators (see previous bullet).
Taking Advantage of New C++ Features
Writing Code Optimally
- Use auto for lambda parameters instead of declaring the type.
- Lambdas should be small and localized.
- Using auto makes it easy to maintain the lambda in the event that a parameter type changes OR if you want to use the same lambda on multiple types (just like a template).
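A brief sketch of the "same lambda on multiple types" point; bothDescending is a hypothetical function written for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns true when both sequences are sorted from largest to smallest.
inline bool bothDescending(const std::vector<int> &ids, const std::vector<std::string> &names) {
    // One generic lambda serves both element types, much like a small template.
    const auto descending = [](const auto &left, const auto &right) { return right < left; };
    return std::is_sorted(ids.begin(), ids.end(), descending) &&
           std::is_sorted(names.begin(), names.end(), descending);
}

inline void example() {
    const std::vector<int> ids{3, 2, 1};
    const std::vector<std::string> names{"c", "b", "a"};
    assert(bothDescending(ids, names));
}
```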
- Use structured bindings.
- A really nice feature of C++17 that makes code more readable.
- C++17 adds a new form of declaration that can improve readability and productivity: structured binding. An excellent overview of structured binding can be found here, so this section simply provides a quick overview. The purpose of structured binding is to be able to declare and initialize multiple variables simultaneously.
- Examples are shown below.
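Two common uses, sketched with a hypothetical WeightMap: unpacking the pair returned by try_emplace, and naming the key and value while iterating over a map.

```cpp
#include <cassert>
#include <map>
#include <string>

using WeightMap = std::map<std::string, int>;

// Binds the iterator and the success flag returned by try_emplace in one declaration.
inline bool addWeight(WeightMap &weights, const std::string &name, int weight) {
    const auto [position, inserted] = weights.try_emplace(name, weight);
    return inserted && position->second == weight;
}

// Binds key and value by name instead of using .first and .second.
inline std::string heaviest(const WeightMap &weights) {
    std::string bestName;
    int bestWeight = 0;
    for (const auto &[name, weight] : weights) {
        if (bestName.empty() || weight > bestWeight) {
            bestName = name;
            bestWeight = weight;
        }
    }
    return bestName;
}

inline void example() {
    WeightMap weights;
    assert(addWeight(weights, "a", 1));
    assert(!addWeight(weights, "a", 5));  // key already present, value untouched
    assert(addWeight(weights, "b", 3));
    assert(heaviest(weights) == "b");
}
```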
- Use inline static or constexpr static variables to avoid having to define them in a .cpp file.
- Avoid new/delete. Use smart pointers and containers instead.
- Eliminates most issues with double deletion and unreachable memory.
Writing Code Optimally – std::move()
- Know how to use std::move() to move data structures instead of copying them.
- Moving an instance does *not* destroy the instance.
- The instance needs to be in a state that the destructor can operate on BUT that’s the only guarantee about the state of the instance after a move (the destructor will work on it).
- Moving a class/struct of POD values just copies the values.
- If you move a raw pointer value in your move constructor, set the original to nullptr; otherwise you’ll have 2 instances pointing to the same raw pointer.
- If you’re writing a container, you must still destroy an instance after a move.
- The instance may have side effects upon construction/destruction that must be accounted for.
- Ex. counting active instances.
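The raw-pointer case can be sketched as follows. Buffer is a hypothetical class that owns a raw array purely to demonstrate the move rules above; real code would normally prefer a container or smart pointer, as recommended earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Buffer is a hypothetical class that owns a raw array for illustration only.
class Buffer {
public:
    explicit Buffer(std::size_t size) : data_(new int[size]()), size_(size) {}
    ~Buffer() { delete[] data_; }  // deleting nullptr is a no-op

    Buffer(const Buffer &) = delete;
    Buffer &operator=(const Buffer &) = delete;

    // Take the pointer and null out the source so only one instance owns it.
    Buffer(Buffer &&other) noexcept
        : data_(std::exchange(other.data_, nullptr)), size_(std::exchange(other.size_, 0)) {}

    Buffer &operator=(Buffer &&other) noexcept {
        if (this != &other) {
            delete[] data_;
            data_ = std::exchange(other.data_, nullptr);
            size_ = std::exchange(other.size_, 0);
        }
        return *this;
    }

    std::size_t size() const { return size_; }
    bool empty() const { return data_ == nullptr; }

private:
    int *data_;
    std::size_t size_;
};

inline void example() {
    Buffer source(8);
    Buffer target(std::move(source));
    assert(target.size() == 8);
    // The moved-from instance still exists and will still be destroyed safely.
    assert(source.empty() && source.size() == 0);
}
```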
Writing Code Optimally - Lambdas
- Lambda functions are useful for:
- Callbacks for standard algorithms or certain functions.
- Eliminating code duplication.
- Previously people would write a separate class, with captured variables, and a function call operator.
- Internally, that’s what the compiler converts the lambda function into.
- You can understand the cost if you think about it that way.
- Every captured variable costs time/memory unless the compiler can optimize it away.
- Putting a lambda in a loop means constructing/destroying it each iteration.
- Lambda functions were meant to be small.
- The idea was that you didn’t need to write a separate class outside of the scope of your call so that you had to look at two places to see what’s going on.
- If your code is more than a few lines, a lambda is NOT the right thing to use – use a function instead.
General Rules of Thumb
- When a parameter is to be changed, pass it as a pointer, not a reference.
- You can look at a header file to see what the intent is with a pointer; for a reference, you need to read the code.
- Don’t use protected inheritance, data, or member functions. From Stroustrup’s C++ Programming Language, 3rd Edition.
- Know when to use asserts and when to use exceptions.
- Use bAssert() to validate incoming parameters or state of class instance at the *start* of a function.
- Asserts are not part of the build delivered to a customer, but are now turned on in local optimized builds.
- Use exceptions for issues in the body of your code.
- You can catch exceptions (1) if you can recover from the error or (2) want to handle the error and then rethrow the exception.
- Don’t use macros if you can avoid them.
- Most macros can be replaced by inline or template code.
- No need to worry about side effects or conflicting macros.
- Use friend functions in a class instead of external APIs.
- No namespace issues.
- No G++9.3 vs G++12 compilation issues.
- Consider a print() member function for a class (no parameters, void return), defined in .cpp file.
- Can be used from within GDB via call command.
- By putting it in the .cpp file, it won’t be optimized out.
- The inline keyword doesn’t mean what you think it does.
- It used to give the compiler a hint as to what should be compiled in place where the function was called.
- Now compilers determine which functions are to be inline’ed on their own, whether in the .cpp file or the .h file.
- The inline keyword now tells the compiler and linker that a function may be defined in more than one translation unit (for example, in a header included from several .cpp files), as long as every definition is identical.
- Template code does not need “inline” at all UNLESS there is a full (explicit) template specialization in a header file.
- Small private functions that are only called from other functions in the .cpp file can also be put into the .cpp file w/o runtime penalty as a result.
- Use static_assert() to capture error conditions at compile time instead of at runtime.
- Obviously only applies for certain types of checks.
- Extremely useful with templates and lambda functions that use auto parameters.
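A minimal sketch of a compile-time check inside a template; halve is a hypothetical function. Instantiating it with a non-integral type such as double would stop the build with the message shown, rather than failing at runtime.

```cpp
#include <cstdint>
#include <type_traits>

// Divides an integral value by two; other types are rejected at compile time.
template <typename T>
constexpr T halve(T value) {
    static_assert(std::is_integral_v<T>, "halve() expects an integral type");
    return static_cast<T>(value / 2);
}

static_assert(halve(10) == 5);
static_assert(halve(std::uint8_t{9}) == 4);
```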
- Try to avoid auto as a return type for a function (but lambdas are fine).
- Not knowing the return type makes it difficult for others to use the return value in a container or to know how expensive your function may be.
- Use an anonymous/unnamed namespace in a .cpp file to hold utility free functions used in a class.
- Don’t use the static keyword (à la C).
Misc
- Learn, learn, learn
- Stay up to date with C++ as the language evolves.
- Understand existing standard library and Boost components that you can use instead of reinventing the wheel.
- When doing code reviews, point out to others when they can use a better approach or newer approach.
- Use new C++ constructs over old ones. Product could be around for decades, so no reason to use C++98 instead of C++17.
- Write your code as if somebody else is going to have to maintain and/or enhance it (odds are, somebody else will at some point in time).
- Is it easy to understand?
- What can be done to make it better?
- What do you want somebody else’s code to look like if *you* have to maintain/enhance it?