Once a block is fully validated, it is saved to disk in one of the blk<nFile>.dat files in your datadir.
Blocks are received, validated and stored in an unpredictable order (not sequentially by block height), so we need to keep track of which file each block is stored in to be able to access it quickly. This is tracked in CBlockIndex by its members nFile, nDataPos and nUndoPos. In master, all of these members are guarded by the ::cs_main mutex. We have discussed how blocks are downloaded and stored in previous meetings #24858 and #25880.
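The relevant members look roughly like this in master (a simplified excerpt-style sketch, not the full class; `GUARDED_BY` is the Clang thread-safety annotation from Bitcoin Core's `threadsafety.h`):

```cpp
// Simplified sketch of the pre-PR state in chain.h (not the full class).
class CBlockIndex
{
public:
    //! Which # file this block is stored in (blk?????.dat)
    int nFile GUARDED_BY(::cs_main){0};

    //! Byte offset within blk?????.dat where this block's data is stored
    unsigned int nDataPos GUARDED_BY(::cs_main){0};

    //! Byte offset within rev?????.dat where this block's undo data is stored
    unsigned int nUndoPos GUARDED_BY(::cs_main){0};

    // ...
};
```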
::cs_main is a recursive mutex which is used to ensure that validation is carried out atomically. Although a lot of effort has been made in recent years to reduce usage of ::cs_main, it is still heavily used across the codebase.
Having a single (global) mutex can allow for reduced code complexity and simplify reasoning about race conditions. However, it often also leads to (sometimes significant) performance issues when multiple threads are waiting for the same mutex even though they don’t need synchronization and are not accessing any of the same variables.
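As an illustration of that contention cost, here is a minimal, hypothetical standard-library sketch (all names invented): the two threads below never touch the same data, yet they still serialize each other because everything hangs off one global mutex.

```cpp
#include <mutex>
#include <thread>

std::mutex g_global_mutex;   // stand-in for a cs_main-style global mutex
int g_stat_a{0};             // two completely unrelated pieces of state
int g_stat_b{0};

void TouchStatA()
{
    // Only needs g_stat_a, but must wait for anyone holding the
    // global mutex for any reason.
    std::lock_guard<std::mutex> lock{g_global_mutex};
    ++g_stat_a;
}

void TouchStatB()
{
    std::lock_guard<std::mutex> lock{g_global_mutex};
    ++g_stat_b;
}

int main()
{
    std::thread t1{TouchStatA};
    std::thread t2{TouchStatB};
    t1.join();
    t2.join();
}
```

With one mutex per independent piece of state (or a shared mutex for read-mostly data), these two threads would not contend at all.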
SharedLock is added as a new lock type to complement the UniqueLock we already have. Why does a UniqueLock not suffice here? How are the implementations of UniqueLock and SharedLock different? (See the sketch after these questions for a standard-library approximation of the two.)
Do you expect this PR to have any visible impact on performance? If so, for which process(es) (in a very general sense) and by how much (order of magnitude)? Were you able to verify/benchmark this in practice?
This PR changes CBlockIndex::nFile to default to -1 instead of 0. How can/did you verify that this change is safe?
nFile, nDataPos and nUndoPos change from being guarded by ::cs_main to being guarded by g_cs_blockindex_data. Why is it that we lock exactly these 3 variables with g_cs_blockindex_data? What would be the concerns and benefits of using a different mutex for each of those 3 variables?
Are there any other ways to ensure the data integrity of nFile, nDataPos and nUndoPos?
With this PR, does the number of times that a mutex is acquired increase, stay constant, or decrease - or does it depend on the program flow?
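For the SharedLock question above, a rough standard-library analogue (Bitcoin Core's UniqueLock and SharedLock add debugging and thread-safety-annotation support on top of std::unique_lock and std::shared_lock, so this is only an approximation): any number of shared locks can be held concurrently, but an exclusive lock excludes everything else.

```cpp
#include <shared_mutex>

std::shared_mutex g_mutex;  // stand-in for the PR's g_cs_blockindex_data
int g_data{0};              // hypothetical guarded value

int ReadData()
{
    // Shared ("read") lock: many readers may hold this at once.
    std::shared_lock lock{g_mutex};
    return g_data;
}

void WriteData(int value)
{
    // Exclusive ("write") lock: waits until all shared and exclusive
    // locks are released, then excludes everyone else.
    std::unique_lock lock{g_mutex};
    g_data = value;
}
```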
<stickies-v> welcome everyone! Today we're looking at #27006, authored by furszy. The notes and questions are available on https://bitcoincore.reviews/27006
<pakaro> tACK - compiled, ran bitcoind, unit tests, functional tests without bdb or usdt-trace. All passed. I got a setup error when I tried to run sharedlock_tests on its own, i didn't spend much/any time figuring that part out though.
<stickies-v> LarryRuane_: yes definitely quite a few layers of complexity. It looks very verbose at first but I suppose it does make using the mutexes quite a bit more straightforward
<DaveBeer> concept ACK, for code ACK I would be concerned with the global variable usage, but I'm not very familiar with the bitcoin core codebase yet and maybe global variables are reasonable here (I understand the PR doesn't change this approach and is as good as the prior solution)
<stickies-v> DaveBeer: if I understand the PR comment history correctly, the initial implementation actually didn't use a global mutex variable, but because there are so many CBlockIndex objects this was the new approach chosen: https://github.com/bitcoin/bitcoin/pull/27006#discussion_r1092197196
<LarryRuane_> a mutex is the lowest-level synchronization primitive, just prevents two threads from running the protected ranges of code at the same time ... locks are higher-level constructs that _use_ a mutex
<stickies-v> in simpler terms: a mutex is an object that we use to help control access to one or multiple resources. a mutex can have one or multiple locks, and whoever has the lock(s) can access the underlying resources
<DaveBeer> LarryRuane_: actually your link also includes the shared_mutex, which is in its own header (I linked the <mutex> header, which does not include shared_mutex)
<LarryRuane_> DaveBeer: "mutex can have any number of locks associated with it" -- I think you're right, multiple threads can be inside a `LOCK(::cs_main)` for example, and all of those are separate locks, but all refer to a single mutex
<stickies-v> "multiple threads can be inside a `LOCK(::cs_main)`" isnt' the whole point of LOCK that there can't be multiple threads accessing it at the same time?
<LarryRuane_> i may be confused here, but yes, you're correct, but multiple threads can be waiting at the same time (and each wait is a separate lock (?))
<stickies-v> the interesting thing about a shared mutex is that it can have both shared and exclusive locks, whereas an exclusive mutex can have just exclusive locks
<stickies-v> so then the next conceptual question is: if the whole point of a mutex is to prevent multiple threads from accessing the same data and messing everything up for everyone... what's the point of allowing a mutex to have shared locks?!
<pakaro> If a read-lock takes hold of a resource, it can allow other read-locks to also hold the resource, but it will deny a write-lock, is that right?
<LarryRuane_> one thing other projects have is the ability to "upgrade" a lock from shared to exclusive, but i don't think we have that in bitcoin core
<LarryRuane_> you might wonder, why not just drop the shared lock and acquire the exclusive lock ... you can but things could change during that interval, so anything you've cached would be invalid
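(A standard-library sketch of the pitfall described above: between dropping the shared lock and acquiring the exclusive one, another writer can slip in, so anything read under the shared lock must be treated as potentially stale.)

```cpp
#include <shared_mutex>

std::shared_mutex g_mutex;
int g_value{0};  // hypothetical guarded value

void UpdateIfNeeded()
{
    int cached;
    {
        std::shared_lock lock{g_mutex};  // read under a shared lock
        cached = g_value;
    }
    // The mutex is unlocked here: another thread may take an exclusive
    // lock and modify g_value before we re-lock below.
    std::unique_lock lock{g_mutex};
    if (g_value == cached) {  // must re-check; 'cached' may be stale
        g_value = cached + 1;
    }
}
```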
<LarryRuane_> stickies-v: for performance? multiple threads may want to read the block (and undo) files concurrently, and there's no reason to prevent this
<stickies-v> LarryRuane_: yeah accessing block data concurrently is what I was looking for. i/o operations typically benefit most from concurrency, so this seems like a pretty nice win at first sight
<LarryRuane_> stickies-v: yes i was thinking that also, this PR really combines two things that could have been done separately: having a separate lock (from cs_main), and making it a shared lock ... but it's good to do both
<DaveBeer> yes stickies-v, that's my understanding as well, first improvement is reducing ::cs_main contention, second (after splitting this usecase to new dedicated mutex) is to also utilize RW locks
<stickies-v> `nFile`, `nDataPos` and `nUndoPos` change from being guarded by `::cs_main` to being guarded by `g_cs_blockindex_data`. Why is it that we lock exactly these 3 variables with `g_cs_blockindex_data`? What would be the concerns and benefits of using a different mutex for each of those 3 variables?
<LarryRuane_> DaveBeer: +1, if they were separately locked, one might be updated from one thread while another updated from another thread, into an inconsistent state
<stickies-v> Do you expect this PR to have any visible impact on performance? If so, for which process(es) (in a very general sense) and by how much (order of magnitude)? Were you able to verify/benchmark this in practice?
<LarryRuane_> and in case anyone's not aware, the `nFile` integer relates to the block file name, so if `nFile` is 19, that corresponds to `datadir/blocks/blk00019.dat`
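(That mapping is just zero-padded decimal formatting. A hypothetical helper showing the idea; the real code lives in Bitcoin Core's flat-file storage helpers and uses its own string utilities.)

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: nFile=19 -> "blk00019.dat"
std::string BlockFileName(int nFile)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "blk%05d.dat", nFile);
    return buf;
}
```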
<stickies-v> DaveBeer: the blk<nnnnn>.dat files store the serialized blocks, but we also store the undo data to "reverse" the impact of a block onto the UTXO set in case of a reorg etc
<LarryRuane_> there are certain RPCs that read the block files (if the node isn't pruned), like if `txindex` is enabled, and the RPC server is multi-threaded, so those could proceed in parallel (better performance)?
<stickies-v> pakaro: no, it is not - if you inspect the blk and rev files you'll see they are different sizes, so that wouldn't quite work (and also the reason we have separate variables for them)
<LarryRuane_> pakaro: but one thing to note is that if a particular block is in `blk00123.dat`, then its undo data is in `rev00123.dat` (for example) .. and I think the blocks are in the same order across those 2 files (but the block and undo data are different sizes, blocks are much bigger usually)
<stickies-v> LarryRuane_: yeah RPC (well, and REST) is the first thing that came to my mind too because the HTTP server uses multiple workers. for example, I think the `getblock` method could now really benefit from batch RPC requests since *most* (not all) of that logic is now no longer exclusively locked to cs_main
<stickies-v> jbes: this PR does not introduce any new threads, I think! but it does allow existing multithreaded processes that were previously contended with cs_main to become more efficient. you weren't wrong, just wanted to challenge that a bit :-)
<stickies-v> (the challenge being that using a shared mutex does not magically introduce more multithreading - it just can make existing multithreading more efficient)
<stickies-v> +1, some tests are easier to understand than others but i found these to be particularly helpful, thanks for pointing that out LarryRuane_
<stickies-v> re performance, there's also this PR (also by PR author) that aims to introduce multiple threads to construct blockfilter index: https://github.com/bitcoin/bitcoin/pull/26966
<LarryRuane_> stickies-v: i did have a question if you don't mind... I noticed that there are "getters" for these `CBlockIndex` variables you mentioned, but they are still public instead of private.. just wondering why the PR doesn't make them private?
<LarryRuane_> i actually tried that, but got compile errors because some code, such as in `txdb.cpp`, doesn't use the getters, maybe that could be a follow-up PR?
<stickies-v> With this PR, does the number of times that a mutex is acquired increase, stay constant, or decrease - or does it depend on the program flow?
<pakaro> if the question were about the maximum number of locks held simultaneously, that would increase, but the total number of acquisitions should remain constant
<stickies-v> I didn't count, but my impression is that it goes up? previously we could (and often did) just acquire a single cs_main lock for a bunch of operations, whereas now for example in `MakeBlockInfo()` we acquire a lock twice, first by calling `index->GetFileNum();` and then again by calling `index->GetDataPos();` directly after
<stickies-v> of course, even though there's a cost to acquiring a lock on a mutex, that cost can very quickly pale in comparison to the benefit of having multiple threads run concurrently, so this need not be a huge concern
<LarryRuane_> you know, just thinking, if those variables really are all related, maybe there should be a getter that returns all three atomically (fetched under one lock operation)
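(A sketch of what such a combined getter might look like; this is a hypothetical follow-up idea from the discussion, not code from the PR, and the locking primitive is the standard-library stand-in used in the earlier sketches.)

```cpp
#include <shared_mutex>

std::shared_mutex g_cs_blockindex_data;  // stand-in for the PR's mutex

struct BlockFilePos {
    int file{-1};
    unsigned int data_pos{0};
    unsigned int undo_pos{0};
};

class CBlockIndex
{
public:
    int nFile{-1};
    unsigned int nDataPos{0};
    unsigned int nUndoPos{0};

    BlockFilePos GetFilePositions() const
    {
        // One shared lock covering all three reads: the caller gets a
        // consistent snapshot with a single lock acquisition.
        std::shared_lock lock{g_cs_blockindex_data};
        return {nFile, nDataPos, nUndoPos};
    }
};
```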
<stickies-v> thank you everyone for attending and for the discussion - hope our collective understanding about locks, mutexes, concurrency and cs_main has improved a bit - and hope that furszy gets some helpful feedback on his PR!