Add pool based memory resource (utxo db and indexes)

Mar 8, 2023

https://github.com/bitcoin/bitcoin/pull/25325

Host: larryruane - PR author: martinus

The PR branch HEAD was d87cb99bb37637e26a9e00b9f7de4bc6f44cb79d at the time of this review club meeting.

Notes

A pool-based resource is a resource that is pre-allocated and managed by a pool.
A pool-based resource is typically a limited or expensive resource, such as a database connection, thread, or memory buffer. The pool keeps a number of pre-allocated resources ready for use, so that when a program requests a resource, it can be quickly and efficiently provided from the pool.
When a resource user is finished using a pool-based resource, it returns it to the pool rather than releasing it back to the system. This means that the resource can be reused by another request without the overhead of allocating a new resource.
(From the PR description:) This PR implements a memory resource similar to std::pmr::unsynchronized_pool_resource, but optimized for node-based containers. The goal is to be able to cache more coins with the same memory usage, and allocate/deallocate faster.
The Bitcoin Core client, bitcoind, spends a large fraction of its time validating transactions.
We validate a transaction when we receive it over the P2P network, before adding it to our mempool; and also when we receive it in a new block, if we haven’t already validated it (not already in our mempool).
An important part of validating a transaction is ensuring that each of its inputs refers (using a COutPoint) to an existing transaction output, and that this output is unspent.
The speed of validating transactions critically depends on how quickly we can locate the UTXOs referred to by their inputs.
When a transaction is confirmed, the UTXOs its inputs refer to become spent, and must be removed from the UTXO set. The transaction’s outputs create new UTXOs which must be added to the UTXO set. So the UTXO set is modified very often, especially during Initial Block Download (IBD), when almost all we’re doing is validating transactions as we receive each historical block.
The entire UTXO set is stored on disk in the chainstate subdirectory of the data directory, in LevelDB format. Since reading from disk is orders of magnitude slower than reading from memory, we would like to have the UTXOs that we’re most likely to need to access cached in memory.
This UTXO cache, also called the coins cache, is one of the most memory-intensive data structures in Bitcoin Core. Its CPU performance is also important, both for lookup, and for modification since entries are added to and deleted from this cache at a high rate.
The size of this memory cache can be controlled using the -dbcache configuration setting.
A memory-resident unspent transaction output, UTXO, is represented in the codebase as a Coin object. The key information in a Coin is a CTxOut, which contains an amount (nValue) and a scriptPubKey, sometimes referred to as the locking script because it prevents the coin from being spent without the authorizing witness as an input in a later transaction.
For inclusion in the coins cache, each Coin is wrapped within a CCoinsCacheEntry
The coins cache itself, containing the individual CCoinsCacheEntry items, is implemented as a std::unordered_map container with the alias name CCoinsMap. This map is part of the CCoinsViewCache class.
To summarize, we’d like this CCoinsMap (unordered map) to have the minimum memory requirement and the highest possible performance. Reducing its memory requirement means that it can store more entries (UTXOs) within a given amount of physical memory, which increases the cache hit rate.

Questions

Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?
At a high level, what is the goal of this PR?
Why is CCoinsMap implemented as an std::unordered_map instead of a std::map? What are the advantages of each?
How does the PR change the implementation of CCoinsMap?
Regarding class PoolResource
- What does the final keyword do? Why is it needed here?
- What does static_assert do? Why is it used here instead of a regular assert?
- What does the expression (ALIGN_BYTES & (ALIGN_BYTES - 1)) == 0 do?
- Why is this class templated by MAX_BLOCK_SIZE_BYTES and ALIGN_BYTES,
- Why do some methods specify the [[nodiscard]] attribute?
The PoolResource takes one argument, chunk_size_bytes. What does this argument do, and what are the tradeoffs when deciding on this value?
Why is this allocator faster (for the coins cache) than the standard allocator?
Why does this allocator use less memory (for the coins cache) than the standard allocator?
Can you think of a usage pattern (perhaps outside of the coins cache) that might cause this allocator to use more memory than the standard allocator?
Are there other data structures that can take advantage of this new allocator?

Meeting Log

17:00

<LarryRuane> #startmeeting

17:00

<pablomartin> hello all!

17:00

<effexzi> Hello every1

17:00

<glozow> hi

17:00

<DaveBeer> hi

17:00

<martinus> hi!

17:00

<abubakar> hi

17:00

<hernanmarino> Helloooo

17:00

<svav> Hi

17:00

<LarryRuane> hi everyone, please say hi so we know who's here!

17:00

<glozow> hi martinus, great to have you here!

17:00

<codo> hi

17:00

<LarryRuane> yes, thank you for being here, @martinus !

17:00

<LarryRuane> any first-time attendees here today?

17:00

<jonatack1> hi

17:00

<martinus> thanks, and thanks LarryRuane for having my PR here!

17:01

<DaveBeer> yes, I'm here for first time

17:01

<LarryRuane> here's the write-up for today's review club session: https://bitcoincore.reviews/25325

17:02

<LarryRuane> welcome, @DaveBeer ! Feel free to ask away, any questions, even if we've moved on, continuing with previous threads is fine!

17:02

<pakaro> hi

17:03

<LarryRuane> first question, our usual: Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?

17:03

<DaveBeer> y, read notes, briefly looked at code

17:03

<abubakar> Yes, Approach ACK, read through the PR commits and have an understanding of the PR code

17:03

<LarryRuane> Also, does anyone have any questions on the Notes (intended to be background material)? We can discuss those first if there are any

17:03

<hernanmarino> 70% yes

17:04

<martinus> I'd ACK it because I wrote it ;)

17:04

<pakaro> I was wondering where the data, if any, for cache hit rate is

17:04

<LarryRuane> *whew* +1 @martinus

17:04

<pakaro> ie. how often we access utxo from cache and how often we reach to disk

17:05

<LarryRuane> that's a good question, I don't know if there's any tracking of that, does anyone know?

17:05

<pablomartin> yes, read pr, notes, answered some questions, pending running some benchs & testing before commenting on the pr

17:06

<jonatack> Reviewed both the original PR (https://github.com/bitcoin/bitcoin/pull/22702) and then this newer one. Have also been running them for a long time on my nodes.

17:06

<Steve68> Hello, this is my first time here, I may be a little lost. Is there a Bitcoin core meeting on wednesdays?

17:07

<LarryRuane> I was looking over the notes just before the meeting, and even though there are lots of notes, there may be a few concepts that people aren't too familiar with... maybe we can take a couple minutes to cover some basics...

17:07

<LarryRuane> were most of you aware that C++ standard library contains allow you to use a custom memory allocator?

17:07

<hernanmarino> jonatack: great, this deserves a long testing

17:07

<LarryRuane> for example, take a look at the `Allocator` template argument here https://en.cppreference.com/w/cpp/container/unordered_map

17:07

<jonatack> Steve68: https://github.com/jonatack/bitcoin-development/blob/master/bitcoin-core-dev-irc-meetings.txt

17:08

<LarryRuane> the default memory allocator is the standard system allocator (which basically just does `new` and `delete`), but you can override the default

17:09

<Steve68> thank you!

17:09

<LarryRuane> (note that in c++ if you override a template argument, you must specify all preceding template arguments, even if you just want the defaults for those)

17:09

<LarryRuane> when a container object (like an `std::unordered_map` instance) wants to allocate memory, it uses its configured allocator (standard or custom)

17:10

<LarryRuane> (@martinus correct me if I'm wrong on any of this!)

17:10

<LarryRuane> this PR changes the allocator for the `unordered_map` used in the coins cache, and _only_ that one (not all unordered maps)

17:10

<martinus> all good from my side :)

17:11

<LarryRuane> it changes it to one implemented by this PR (so the PR has two main parts, implementing a new allocator, and making this one particular unordered_map use it)

17:11

<LarryRuane> Any of that unclear, or want to discuss?

17:12

<pakaro> very well described

17:13

<jonatack> for info, martinus has a blog at https://martin.ankerl.com and wrote the benchmarking library that we use since a couple of years now in bitcoin core

17:13

<LarryRuane> Okay we can get into the questions then ... one question I forgot to ask, why is there no mutex locking in this PR?

17:13

<martinus> Maybe let me add the main reasons for this custom allocator: unordered_map does one allocation per entry, and this can be costly: malloc/free can be relatively slow, and it also has a memory overhead. This PR makes it faster and reduces that overhead.

17:13

<LarryRuane> jonatack: TIL -- thanks for the link!!

17:14

<jonatack> (on the blog you can also see his extensive work on hashmap data structures in C++)

17:14

<LarryRuane> is that like standard c++ (not just bitcoin core)?

17:15

<martinus> It's per the standard that std::unordered_map has to perform at least one allocation per node (entry), and there's no way around this. Other hashmap implementations do this differently so they can be faster

17:16

<martinus> But changing the whole hashmap of such an integral part can be dangerous (hashmaps are hard to get right), so this PR just changes the allocator to get most of the benefits

17:17

<LarryRuane> martinus: thanks, and you mentioned that malloc/free (the lowest-level dynamic memory allocation primitives) can be slow, this is question 7 (ok if we go out of order!) ... can you give us reasons why?

17:18

<pablomartin> perhaps it's worth mentioning that originally, the focus was on the hashmap (to make std::unordered_map faster...) - https://github.com/bitcoin/bitcoin/pull/16718 - by jamesob

17:19

<martinus> sure, I mean malloc is really well optimized, but it has to be generically implemented for whatever size whatever thread currently wants to allocate. But with the unordered_map we know that we need a lot of nodes, and we know that each of them has exactly the same size; so we can use that to optimize for this specific case.

17:19

<abubakar> The Allocator implementation avoids memory fragmentation. This ensures no gaps between allocated blocks of memory, also avoids the overhead of calling malloc() and free() for every allocation and deallocation. Instead, the allocator manages its own pool of memory and allocates and deallocates blocks from this pool as needed. reduces the number of system calls required for memory

17:19

<abubakar> allocation and deallocation

17:21

<LarryRuane> martinus: abubakar: yes, thanks ... I'm not sure about reducing the number of system calls, though... aren't most malloc and free just within the runtime library?

17:22

<DaveBeer> yes they manage some kind of tree like structure for memory chunks, as long as large enough chunk is available, system call should not be needed

17:22

<DaveBeer> in runtime library that is

17:22

<LarryRuane> what I was thinking about the performance improvement was that no mutex locking is required ... i mean, there may be locking at a higher level (around the entire map, if multiple threads can access it), but not for each individual memory alloc and dealloc

17:24

<abubakar> LarryRuane: yes thanks.

17:24

<LarryRuane> and for question 8, "Why does this allocator use less memory (for the coins cache) than the standard allocator?", yes, I think @abubakar hit on the main reason, all the allocations are tightly packed together ... as opposed to individual allocations, where there's some (hidden to us) overhead

17:25

<LarryRuane> so I think martinus answered question 2, the goal of this PR ... shall we go to question 3?

17:25

<jonatack> LarryRuane: re locking: that said, my understanding is that locking itself isn't usually expensive, but lock contention is

17:26

<LarryRuane> yes, and there would be a lot of contention for memory allocation and deallocation, is that right?

17:27

<jonatack> I don't know offhand

17:27

<jonatack> just making the distinction

17:27

<LarryRuane> i know there's a build that measures lock contention (`--enable-debug` i think), but i don't know if it measures dynamic memory lock contention

17:28

<LarryRuane> Here's question 3, might be kind of easy: Why is CCoinsMap implemented as an std::unordered_map used instead of a std::map? What are the advantages of each?

17:28

<hernanmarino> it's faster

17:28

<abubakar> for question 3.. Because Coins are not ordered, we want fast read time, std::unordered_map uses hash table while std::map uses BST.

17:28

<hernanmarino> unordered_map, I mean

17:29

<abubakar> Std::unordered is good to access element fast that dont care about order, whereas std::map. Is good for sorted entries.

17:29

<DaveBeer> unordered has O(1) access time while regular map has O(log(N)) access time and Coins don't need to be visited in any particular order, so unordered_map is enough

17:29

<pakaro> there's no need for order, since each utxo in the utxo is unique. +1 davebeer abubakar

17:30

<jonatack> yes, DEBUG_LOCKCONTENTION is now built into the debug build and you can enable the lock logging category to see it

17:30

<jonatack> see https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#debug_lockcontention for more

17:30

<LarryRuane> hernanmarino: abubakar: DaveBeer: pakaro: +1 - good answers, those are my understandings too

17:31

<LarryRuane> i think i've noticed quite a few maps that could be unordered_maps ... but if they're not performance-critical, i guess it doesn't matter

17:31

<pablomartin> faster lookup, customizable hashing, faster insertion & removal, less memory usage...

17:31

<LarryRuane> i think someone said that when bitcoin core was first written, there was no `std::unordered_map` yet, only added later ... does anyone know if that's true?

17:31

<LarryRuane> so (if that's true), some are just leftover from the early days

17:32

<martinus> When maps are very small (say 20 elements or so) sometimes std::map can be faster, but in this case the CCoinMap is really huge so std::unordered_map is definitely faster

17:32

<DaveBeer> unordered_map was introduced only in C++11 I think

17:32

<hernanmarino> LarryRuane, I believe it s true

17:32

<hernanmarino> Davebeer: +1

17:32

<jonatack> (but our lock contention logging is only related to the locks we call in src/sync.h)

17:33

<LarryRuane> jonatack: i see, not standard library locks

17:33

<hernanmarino> Also, the custom allocators where used in this PR where introduced in C++17

17:33

<hernanmarino> that*

17:33

<LarryRuane> continue discussing but let me put question 6 out there... The PoolResource takes one argument, chunk_size_bytes. What does this argument do, and what are the tradeoffs when deciding on this value?

17:35

<martinus> hernanmarino: I think the std::allocators were introduced in C++11, in C++17 the pmr allocators were introduced. But they are still not implemented everywhere...

17:35

<hernanmarino> martinus: thanks for clarifying

17:35

<pakaro> the number of bytes to give to each request for memory-space?

100

17:36

<LarryRuane> martinus: oh that's actually something i was wondering, the PR description says "A memory resource similar to std::pmr::unsynchronized_pool_resource, but optimized for node-based containers" ... I'm unsure if `pmr` could have been used, but would not have been as efficient? Or not possible?

101

17:36

<pakaro> if it's too large, we are wasting memory, if it is too small, an entry would not fit, and i assume the program would crash...this is why we do the *4 iirc...i may be off here

102

17:37

<jonatack> Bitcoin Core began with C++98, so yes, unordered_map didn't arrive until later IIRC

103

17:38

<LarryRuane> pakaro: +1 on the first part, there's always some unused space in the most recent chunk ... but if it's too small, the PoolAllocator just does a regular new (and later delete)

104

17:38

<LarryRuane> won't crash

105

17:38

<martinus> LarryRuane: Actually I first implemented this PR by using the pmr allocators and that way it was quite a bit simpler, but unfortunately that did not compile everywhere because libc++ still doesn't implement it. I think on MacOS that didn't work. It worked locally though with my libstdc++

106

17:39

<LarryRuane> oh i see... that must have been a frustrating day, if you had it all working on your machine!

107

17:40

<LarryRuane> so on the question of choosing the chunk size, if it's too large there's wasted memory, and if it's too small, then there are more system allocations (which we know are somewhat slow)

108

17:42

<LarryRuane> martinus: I was wondering, did you consider making `chunk_size_bytes` a template argument? What were the considerations there?

109

17:44

<martinus> LarryRuane: yes I thought about making it either a template or an argument in the constructor, but then I though I should try to keep the PR as simple as possible. It's already complex enough and requires quite a lot of knowledge of internals. It might be a future extension if the allocator is used more widely

110

17:44

<LarryRuane> I guess also, the more stuff you template, the more actual code memory usage there can be!

111

17:45

<LarryRuane> since each combination of template values creates a whole new instance of all the code (IIUC)

112

17:46

<LarryRuane> (but probably code memory usage is not important)

113

17:47

<martinus> Also, allocating a big chunk of memory is actually quite cheap, at least in Linux this is done lazily: the malloc of the chunk doesn't really do much, only when the memory is actually used (read / write) the operating system goes and makes that page available.

114

17:47

<Murch> Is that why you needed the two different variants of new?

115

17:47

<LarryRuane> As long as you mentioned "if the allocator is used more widely" -- question 10 is: Are there other data structures that can take advantage of this new allocator?

116

17:48

<Murch> To make sure it actually is made available?

117

17:48

<martinus> Murch: no this is completely hidden and done from the kernel as far as I know, this has nothing to do with the different versions of new

118

17:48

<Murch> Oh okay

119

17:49

<LarryRuane> yes, there's the operator new: https://en.cppreference.com/w/cpp/memory/new/operator_new and the expression new: https://en.cppreference.com/w/cpp/language/new ... which i never knew about before!

120

17:49

<Murch> TBH, this all seems like higher arcane mysteries to me ^^

121

17:49

<LarryRuane> It's like the expression new can be given _already allocated_ memory, right? (they call it "placement-params")

122

17:50

<hernanmarino> LarryRuane: On q. 10 Mempool is the first that comes to mind , but i don't think it can benefit a lot from this... perhaps something more heavily used during IBD? Blockindexes ?? I am not sure if they really use a lot of memory ...

123

17:50

<LarryRuane> yes, the uses of `new` in this PR were unfamiliar

124

17:50

<jonatack> Murch: once I tweeted about enjoying C++, and a long-time Ruby friend replied "Jon, are you OK? Blink twice if under duress" :)

125

17:51

<martinus> ah yes, allocating with operator new and in-place construction with new(...) are completely different things, but they have the same name... It's C++ being as obfuscated as it can be

126

17:51

<LarryRuane> hernanmarino: +1 - I was thinking mempool also, but I'm unsure

127

17:51

<hernanmarino> I'm just guessing, didn thoroughly reviewd the code to answer this one

128

17:51

<Murch> jonatack: Yeah, I’m definitely still not in it because of C++ 😅

129

17:51

<hernanmarino> :)

130

17:52

<Murch> 😀😆😀😆

131

17:52

<LarryRuane> martinus: kudos for figuring out some of this very weird (but needed) c++!

132

17:53

<Murch> hernanmarino: I’m not sure. A lot of transactions have pretty similar sizes, and the big ones could just be allocated separately. Might be useful to cover the ~24% one input one output txs, and the 46% one input two output txs

133

17:54

<LarryRuane> I think this is a great PR, I really want it to merge ... it was really interesting to read about people's concerns with safety (i think that's why the earlier attempt didn't happen)

134

17:55

<LarryRuane> Okay only 5 minutes left, anyone want to take on question 9: Can you think of a usage pattern (perhaps outside of the coins cache) that might cause this allocator to use more memory than the standard allocator?

135

17:55

<LarryRuane> Also, if anyone would like to throw out any answers to the sub-questions in question 5... I think those are interesting c++ questions

136

17:56

<hernanmarino> Murch: Agree. I was also thinking about something more heavily used during IBD, and perhaps it's not the case with the mempool

137

17:56

<pakaro> my understanding of static_assert is that it will throw a compilation error in debug mode, when an 'assert' will not throw an error in debug mode.

138

17:57

<pakaro> re. q5

139

17:57

<LarryRuane> pakaro: no actually, assert crashes the node even in a non-debug build

140

17:58

<pakaro> then im not sure of the difference between static_assert & assert

141

17:58

<hernanmarino> re q9 I'm just guessing, but perhaps allocating only a few big chunks of memory, instead of several small ones ...

142

17:58

<LarryRuane> it's interesting to look at *where* those `static_assert`s are ... they're not inside functions! (for fun I tried changing one of them to a regular assert, and it wouldn't compile)

143

17:58

<abubakar> pre allocated memory more than necessary might result to alot of unused memory

144

17:58

<jonatack> pakaro: static assert is checked at compilation time

145

17:58

<jonatack> vs runtime for assert

146

17:58

<pablomartin> q9: perhaps in scenarios where the data structure contains a mixture of large and small objects

147

17:58

<LarryRuane> yes, static_assert triggers for any kind of build

148

17:59

<LarryRuane> pablomartin: yes, that's what i was thinking ... like if you allocate a bunch of one size of objects, then free them all ... then allocate a bunch of other sizes but never that first size, then they're sort of stuck in their freelist

149

18:00

<LarryRuane> ok we're at time, thanks to all of you!! and especially @martinus ! Please go review the PR!

150

18:00

<martinus> static_asserts are great because it immediately gives you a compile error. So if it compiles you can be sure that all static_assert's are correct. assert() are only evaluated when that code is actually run, so ideally use static_assert when possible

151

18:00

<abubakar> final ensures that the PoolResource class can be modified outside it's scope

152

18:00

<pablomartin> thanks all!

153

18:00

<LarryRuane> #endmeeting

154

18:00

<effexzi> Thanks every1

155

18:00

<martinus> Thanks a lot LarryRuane for moderating this meeting!

156

18:00

<abubakar> LarryRuane: thanks for hosting

157

18:01

<Murch> Thanks Larry

158

18:01

<hernanmarino> thanks Larry for hosting, and martinus for great improvement

159

18:01

<LarryRuane> yes, and a regular assert might not be hit ... until post-release, on user machines!

160

18:01

<Murch> Also thanks to martinus for joining and giving more details!

161

18:02

<DaveBeer> thanks Larry and martinus

162

18:02

<pakaro> computer crashed. assuming end has been called. thanks everyone!

163

18:02

<LarryRuane> pakaro: yes, thanks for being here!

164

18:03

<jonatack> LarryRuane: (ALIGN_BYTES & (ALIGN_BYTES - 1)) == 0 tests for being a multiple of two IIRC, and the reason to use it over modulo could be performance. If I'm not confused, I reviewed it a while ago.

165

18:03

<LarryRuane> abubakar: I think of `final` as meaning that you can't create a class that uses this one as its base class ... but I'm not really sure if there's any special reason it's used here

166

18:04

<Murch> jonatack: Yeah, that was pretty nifty

167

18:04

<jonatack> (IIRC modulo is a fairly slow operation)

168

18:04

<Amirreza> Hi everyone

169

18:04

<pakaro> +1 murch very nifty

170

18:04

<LarryRuane> jonatack: yes that's correct ... only a power of 2, or zero, will make that expression zero

171

18:04

<LarryRuane> but you can't even use modulo, right?

172

18:05

<jonatack> LarryRuane: was referring to the comment at https://github.com/bitcoin/bitcoin/pull/25325/files#r1124914676

173

18:05

<Murch> Amirreza: Welcome, however the Bitcoin Core PR Review Club just finished

174

18:05

<LarryRuane> you essentially want to check of just 1 bit is set ... there are some instruction sets that have a "popcount" (number of bits set) instruction, but obviously that's pretty low-level

175

18:05

<jonatack> I recall messing around with adding static asserts with modulo as you suggested and checking

176

18:06

<martinus> yeah the "final" here is not really necessary, it's just that no one can use it as a base class. Actually I'm not sure why I have it here, but it doesn't hurt

177

18:06

<Amirreza> Murch, Oh, I'm too late. I just did a mistake in timing!

178

18:06

<LarryRuane> martinus: ok cool ... is there any special reason for the `[[nodiscard]]`? I think it's nice, but any special reason?

179

18:07

<jonatack> LarryRuane: nodiscard is good to add for getters to ensure that the result is in fact used

180

18:08

<jonatack> LarryRuane: TIL wrt to your popcount comment, thanks!

181

18:08

<LarryRuane> ok, makes sense, because it's a pure function i think it's called (no side-effects), so if you're not using the result why are you calling it??

182

18:08

<martinus> I usually use [[nodiscard]] everywhere, because when I return something I usually want that to be used; and if one doesn't use the return value it usually means that someone called the method but didn't have to. Especially all const methods should be [[nodiscard]]

183

18:09

<LarryRuane> martinus: i see, good to know... almost seems like `const` functions can be nodiscard by default?

184

18:09

<martinus> LarryRuane: yes I think it would be better to have that as default, but It's C++ so it won't change :)

185

18:09

<jonatack> martinus: good points

186

18:10

<LarryRuane> but it is interesting i guess that even `const` methods can have side-effects -- the `mutable` variables ... but callers shouldn't know about those side-effects

187

18:10

<LarryRuane> martinus: thanks.. I think your code is really good for all of us to use as a model!

188

18:11

<martinus> thanks!

189

18:12

<jonatack> Thanks for your long-standing work and patience to improve the performance here martinus, and thank you LarryRuane for choosing to review this pull and host a review club about it.

190

18:13

<pakaro> +1 jonatack - reading the comments from 2019 puts in perspective the amount of deliberation that goes into a pr like this

191

18:13

<DaveBeer> yeah this one was hard for me to parse '=D void* storage = ::operator new (m_chunk_size_bytes, std::align_val_t{ELEM_ALIGN_BYTES});

192

18:13

<DaveBeer> m_available_memory_it = new (storage) std::byte[m_chunk_size_bytes];

193

18:14

<LarryRuane> jonatack: +1 about martinus, and you're very welcome! I really enjoyed learning about this one!

194

18:15

<pakaro> +larryruane i learnt a lot from your notes and comments in the pr

195

18:16

<LarryRuane> DaveBeer: yes me too, but i think it's *almost* the same as a normal new, like `void* storage = new std::bytes[m_chunk_size_bytes]` except that we want it aligned (the default new with a one-byte size wouldn't be aligned by 8 bytes)

196

18:16

<abubakar> same really good notes Larry :)

197

18:16

<LarryRuane> pakaro: abubakar: thanks

198

18:18

<jonatack> +1 excellent notes. Oh and whoever PRs the meeting log could maybe also drop the word "used" from Question 3.

199

18:19

<martinus> right, I want it to be aligned correctly. Also note that I'm always using ::operator new (the initial ::), because otherwise it could be that we accidentally use an overloaded version of operator new

200

18:19

<DaveBeer> LarryRuane: I see, thanks, I wasn't even aware of the new expression until today

201

18:20

<LarryRuane> oh good catch @jonatack yes I'll fix that!

202

18:21

<LarryRuane> martinus: oh that's really interesting, i thought it was only so you could specify the alignment ... can you specify alignment without the `::operator`?

203

18:22

<DaveBeer> * new expression with placement params

204

18:23

<LarryRuane> DaveBeer: i wasn't either, until a few days ago!

205

18:25

<martinus> You can use void* storage = operator new (m_chunk_size_bytes, std::align_val_t{ELEM_ALIGN_BYTES}); just fine, but then you risk of using some other operator new. It's safer to always use ::operator new, unless you really want other operator new to work. So far I never needed that...