The PR branch HEAD was 021f86953e at the time of this review club meeting.
Notes
This week’s review club will focus on C++ patterns that separate
the implementation details of a class from its external interface.
PR 22950 is a fully
implemented example, but we’ll take the scenic route there to understand the
context and alternative approaches.
In a normal C++ class, changing any implementation detail usually requires all
client code to be recompiled. This is because private data members can change the
size of the object, and private member functions participate in overload
resolution. Those members are included in the header
file, and any translation unit which #includes the header file will need to be
recompiled if that file changes.
There are multiple ways to get around this compilation dependency. This week,
we will focus on two: the pimpl pattern & a pure abstract class. I’ve
included annotated versions of toy programs that are minimal implementations
of both patterns. I highly recommend using them as a guide and creating your
own.
PeerManager & PeerManagerImpl use a pure abstract class to reduce the
compilation dependencies. PeerManager defines a set of virtual functions
that provides callers with an interface. PeerManagerImpl inherits from
PeerManager and fulfills those functions, supplementing with any private
data members or helper functions. Instead of callers constructing a
PeerManager object, they call a static make function which returns an
instantiated PeerManagerImpl object. This provides callers with exactly
enough information to utilize the component, but nothing more.
Here is a toy program to help understand the material. You can find the
code in abstract_class.h and
abstract_class.cpp. Below is an
annotated version of the program that dives into the specific mechanisms.
TxRequestTracker & TxRequestTracker::Impl use the pimpl pattern to reduce
the compilation dependencies. The interface object, TxRequestTracker owns a
pointer to an implementation object, TxRequestTracker::Impl. When a caller
invokes a function on the interface object, it gets forwarded to the
implementation object to fulfill, and any return values are routed back
through the same call path. Along with a forward declaration of the Impl,
all the member functions of TxRequestTracker are declared in the header.
However, the functions must be defined in the .cpp where the definition of
a TxRequestTracker::Impl is complete.
Here is a toy program to help understand the material. You can find the
code in pimpl.h and
pimpl.cpp. Below is an annotated version
of the program that dives into the specific mechanisms.
Just for fun, one more diagram that puts it all together (and then some).
Why does changing implementation details of a class cause recompilation of
all its callers?
Describe how the pimpl pattern & a pure abstract class work to reduce
compilation dependencies. What are advantages & disadvantages of each?
Did you create your own toy programs? What were some challenges or
learnings you had along the way?
PR 22950 implements a pimpl
pattern for AddrMan. What does this enable for the code organization? What
restrictions still remain, and how does the PR address it?
<sipa> michaelfolkson: more modern languages generally solve that problem by having different notions of compilation, and more explicit modules - fundamentally this problem always exists
<larryruane> private member functions are also in the header file, so if they change (which users of the class shouldn't care about), recompilation is needed
<amiti> so the accessibility of what callers can access can be restricted, but the compiler gets to know about private members & functions at all times, including when it compiles the calling code.
<sipa> also all the type arguments and many more things in the header determine the symbol name it is compiled to; if you change the type of an argument to a function, all callers need to update the symbol name they link to
<lightlike> does the need for recompiling everything actually depend on what is changed (variable, function)? i.e. if you just change a comment in the header, wouldn't there be a lengthy recompile even then?
<sipa> lightlike: but yes, in general, because c++ has no real modules, it has no idea when interfaces change, and has to assume that whenever headers change, everything needs to be rebuilt
<zakinator123> amiti: So even if the public interface is the same, but new private members/functions are added to a class, recompilation of users of that class is required?
<larryruane> sometimes _just for testing_ (obviously), i'll run a command like `touch -d '2 weeks ago' file.h` after changing file.h to speed up compile (but also need to update the timestamp on file.cpp)
<amiti> zakinator123: yes, that's true. but because of this design, there's also the breakup of header & cpp files, so even other changes to that file can cause recompilation of callers
<amiti> since the notes start with the pattern of a pure abstract class- can anyone describe what the pattern is & how it can be used to reduce compilation dependencies?
<amiti> since its an abstract class, it can't be instantiated. so there is another class that inherits from the base class & adds any private functions / members to fulfill the interface
<larryruane> amiti: "how can we utilize that?" ... you can then define an "implementation" class that inherits from the abstract class, fills out those functions' implementations, but then you can write a function whose argument is the abstract type, but the caller can pass it an object of the derived type --- looks like a type mismatch but it's not! (sorry probably didn't explain that well)
<zakinator123> So since the private and public members of the abstract class remain unchanged since all changes to implementation will be happening in the impl class, recompilation of dependencies is not needed?
<amiti> I'm sure there are other ways to do it too, but this is the pattern that is currently used in peermanager & was one I came across in other places too
<larryruane> sipa: "a static method?" I think this just means a function that can't access the `this` pointer because there is no object (it can be called with the double-colon syntax)
<amiti> larryruane: I think they are just two ways to achieve the aim. the pimpl pattern seems to be more explicit / common in C++ literature. each one of these have lots of names for them.
<jnewbery> amiti: it's not required to have that static method. For example, look at how PeerManager implements the NetEventsInterface interface class: class PeerManager : public CValidationInterface, public NetEventsInterface
<sipa> i don't think pimpl really matches any of those common design patterns; it's more a workaround for a limitation in the language, than a design i'd say
<jnewbery> gene: yes, pimpl involves one extra pointer indirection. I don't know exactly how compilers implement vtables and dynamic dispatch, but I guess that the overhead is roughly equivalent
<sipa> gene: for such statements it's always useful to wonder how much it actually matters; the cost of a predictable function call is in the order of nanoseconds; that may matter for some use cases, but nothing in addrman needs anything near that kind of performance
<larryruane> I just thought of something, couldn't abstract classes (not sure about pimpl) be very useful for testing, for mocking? The test framework could create a derived class that does very different stuff from the production derived class, and you pass the abstract object to the production code and it doesn't know the difference
<sipa> jnewbery: the overhead is way less, because the linker resolves the function call, so the binary code actually has the memory address that's invokved; with vtables you have an indirection to even find out what code to call
<amiti> sipa: oh interesting. I'm guessing that's why the pimpl pattern is more common? using the abstract class thing made sense in PeerManager because there was already the virtual table lookups present.
<jnewbery> so even though the addrman.h header is only exposing a very limited interface to the rest of the code, the tests can still access test functions and members