Add Fee rate Forecaster Manager (tx fees and policy)

https://github.com/bitcoin/bitcoin/pull/31664

Host: ismaelsadeeq  -  PR author: ismaelsadeeq

Notes

Motivation and Background

The current fee estimator in Bitcoin Core, used by the wallet and exposed to clients via RPC, is CBlockPolicyEstimator.

This estimator maintains a vector of fee rate buckets ranging from 1,000 sats/kvB to 10,000,000 sats/kvB. These buckets are spaced exponentially by a factor of 1.05. For example, if the first bucket is 1,000 sats/kvB, the next would be 1,050 sats/kvB, and so on. The estimator works by:

  • Tracking each transaction’s fee rate and assigning it to the appropriate bucket.
  • Monitoring transactions as they leave the mempool. If a transaction is confirmed in a block, the estimator records success in the bucket, along with the number of blocks it took to confirm (i.e., the difference between the confirmation block height and the height at which it was first seen in the mempool).
  • If a transaction is removed for any reason other than inclusion in a block, it is recorded as a failure.
  • This data is aggregated using an exponentially decaying moving average, so old data points become less relevant over time.
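
The bucket layout and the decaying average can be illustrated with a minimal sketch. It is not Bitcoin Core's code: `MakeBuckets`, `BucketIndex`, and `DecayingCount` are hypothetical names, the sketch treats each value as a bucket's lower bound, and the decay factor is left as a parameter, all of which are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Build exponentially spaced bucket bounds in sats/kvB, using the range and
// spacing quoted in the notes (1,000 to 10,000,000, factor 1.05).
std::vector<double> MakeBuckets(double min_rate = 1000.0, double max_rate = 1e7, double spacing = 1.05)
{
    std::vector<double> buckets;
    for (double rate = min_rate; rate <= max_rate; rate *= spacing) {
        buckets.push_back(rate);
    }
    return buckets;
}

// Index of the highest bucket whose bound is <= fee_rate.
// Fee rates below the first bucket are clamped to bucket 0 (an assumption).
std::size_t BucketIndex(const std::vector<double>& buckets, double fee_rate)
{
    assert(!buckets.empty());
    std::size_t idx = 0;
    while (idx + 1 < buckets.size() && buckets[idx + 1] <= fee_rate) ++idx;
    return idx;
}

// Hypothetical per-bucket counter: every new block multiplies the stored value
// by a decay factor below 1, so older observations gradually lose weight.
struct DecayingCount {
    double value{0.0};
    void Decay(double decay) { value *= decay; } // call once per connected block
    void Record() { value += 1.0; }              // call once per observed transaction
};
```

With the default arguments, a transaction paying 1,060 sats/kvB lands in the second bucket (1,050), since the third bucket starts at 1,102.5.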

A naive estimate for a confirmation target n would scan these buckets from the lowest to the highest and return the lowest fee rate bucket in which more than 80% of the tracked transactions were confirmed within n blocks of first being seen.
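
As a rough sketch of that naive scan, assuming each bucket already carries the number of transactions confirmed within the target and the total number tracked (`BucketStats` and `NaiveEstimate` are hypothetical names, not Bitcoin Core identifiers):

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical per-bucket statistics for one confirmation target.
struct BucketStats {
    double fee_rate;  // bucket fee rate in sats/kvB
    double confirmed; // transactions confirmed within the target
    double total;     // all transactions tracked in this bucket
};

// Scan buckets from lowest to highest fee rate (the input is assumed to be
// sorted ascending) and return the first one whose success ratio exceeds the
// threshold. Returns std::nullopt if no bucket qualifies.
std::optional<double> NaiveEstimate(const std::vector<BucketStats>& buckets, double threshold = 0.8)
{
    assert(threshold >= 0.0 && threshold <= 1.0);
    for (const auto& bucket : buckets) {
        if (bucket.total > 0 && bucket.confirmed / bucket.total > threshold) {
            return bucket.fee_rate;
        }
    }
    return std::nullopt;
}
```

The real estimator is more involved, as the following paragraphs describe; this sketch only captures the single-pass idea.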

However, the fee estimator is more complex than this. It maintains three sets of these buckets: Short-term, Medium-term, and Long-term, each decaying at a different rate.

These allow the estimator to provide fee estimates in two modes:

  • Conservative – Uses long-term data for more cautious estimates and is tailored toward users who do not plan to fee-bump. These fee rates tend to be higher.
  • Economical – Relies on more recent short-term data for making fee rate estimates, potentially resulting in lower fees.

For a more detailed explanation of this fee estimator’s design, see John Newbery’s Introduction to Bitcoin Core Fee Estimation and the high-level design overview by the original author.

Key advantage: This approach is highly resistant to manipulation because all transactions must be:

  • Relayed (ensuring visibility to miners)
  • Confirmed (ensuring they follow consensus rules)

Key limitations for users:

  1. Mempool unaware: Does not consider the unconfirmed transactions currently in the mempool when estimating fees. As a result, it remains oblivious to sudden changes. When high fee rate transactions are cleared and the mempool becomes sparse, it keeps returning high estimates, leading to overpayment. Likewise, during sudden congestion, it fails to adjust, resulting in underpayment and missed confirmation targets.

  2. Package unaware: Only considers individual transactions, ignoring parent-child relationships. As a result, it fails to account for CPFP’d transactions, which can lead to lower fee rate estimates, resulting in underpayment and missed confirmation targets.

These limitations cause practical problems for:

  • Bitcoin Core wallet users (many now rely on third-party providers) due to the inaccuracies mentioned above.
  • Lightning implementations (e.g., C-lightning) that use Bitcoin Core’s fee estimator may fail to confirm transactions before their timelocks expire if it underestimates. If it overestimates, they may construct commitment transactions that pay more than necessary.
  • Bitcoin clients (e.g., BTCPay Server) have switched to external fee estimators due to Bitcoin Core’s mempool unawareness.
  • Others (e.g., Decred DEX) continue to see fee rate estimates that are higher than necessary, even after v28 switched the default estimateSmartFee mode to “economical”.

Bitcoin Core needs to provide reliable fee rate estimates to uphold the trustlessness and self-sovereignty of node operators and the services that rely on it. Relying on external estimators undermines these principles. Bitcoin Core’s fee estimator should be as reliable and cost effective as external alternatives.

This PR is part of the Fee Estimation via Fee rate Forecasters project (#30392), which aims to address these limitations. A detailed design document of the project is available: Fee Rate Forecasting Design.

Implementation Details

PR #31664 introduces a framework for improved fee estimation with the following key components:

1. Core Utility Structures

  • Forecaster abstract class: Defines the interface for all fee rate forecasters, establishing a consistent API for different fee rate forecasting strategies (Commit a2e3326).
  • ForecastResult struct: Provides an output format containing fee estimates and associated metadata (Commit 1e6ce06)
  • ConfirmationTarget struct: Implements a flexible input format supporting block-based targets with extensibility for future time-based targets (Commit df7ffc9)
  • ForecastType enum: Identifies different forecaster implementations, enabling appropriate routing of fee rate requests (Commit 0745dd7)
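
To make these input and output types more concrete, here is a minimal sketch. The type names come from the list above, but every field name, the enumerator names, and the use of plain integers for fee rates are assumptions made for illustration; the PR's actual definitions may differ. The comparison operator looks only at the high-priority estimate, which is the behaviour the review questions refer to.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Identifies which forecaster produced a result (enumerator names assumed).
enum class ForecastType { BLOCK_POLICY, MEMPOOL_FORECAST };

// Only block-based targets exist in this sketch; a time-based kind could be
// added to the enum later without changing callers.
enum class ConfirmationTargetType { BLOCKS };

struct ConfirmationTarget {
    unsigned int value; // e.g. "confirm within 2 blocks"
    ConfirmationTargetType type{ConfirmationTargetType::BLOCKS};
};

// Hypothetical output format: two fee rates plus metadata.
struct ForecastResult {
    ForecastType forecaster;
    std::int64_t low_priority;  // sats/kvB
    std::int64_t high_priority; // sats/kvB
    unsigned int height;        // chain height when the forecast was made
    std::string error;          // empty on success

    bool Ok() const { return error.empty(); }

    // Results are ordered by their high-priority fee rate only.
    bool operator<(const ForecastResult& other) const
    {
        return high_priority < other.high_priority;
    }
};
```

A caller holding one result per forecaster can then pick the smaller one with `std::min`, without needing to know which strategy produced it.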

2. MempoolForecaster Implementation

  • MempoolForecaster class: Inherits from Forecaster, generates block templates, and extracts the 50th and 75th percentile fee rates to produce the high and low priority fee rate estimates, respectively (Commit c7cdeaf)
  • Performance optimization: Implements a 30-second caching mechanism to prevent excessive template generation and mitigate potential DoS vectors (Commit 5bd2220)
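
The two ideas above, percentile extraction and short-lived caching, might look roughly like the sketch below. It is an illustration under stated assumptions rather than the PR's code: `Package`, `Percentiles`, `CalculatePercentiles`, and `CachedEstimates` are hypothetical names, percentiles are measured against a caller-supplied total weight, and a running minimum is used as one possible way to keep the results monotonic when package fee rates in a template are not strictly decreasing.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>
#include <mutex>
#include <optional>
#include <vector>

struct Package { double fee_rate; std::uint64_t weight; }; // in template order
struct Percentiles { double p50{0.0}; double p75{0.0}; };

// Walk the template in inclusion order, accumulating weight. The fee rate
// reported at each percentile is the lowest seen so far, so p50 >= p75 holds
// even if a later package has a higher fee rate than an earlier one.
Percentiles CalculatePercentiles(const std::vector<Package>& packages, std::uint64_t total_weight)
{
    assert(total_weight > 0);
    Percentiles result;
    std::uint64_t cumulative = 0;
    double lowest_seen = std::numeric_limits<double>::max();
    bool have_p50 = false;
    for (const auto& pkg : packages) {
        lowest_seen = std::min(lowest_seen, pkg.fee_rate);
        cumulative += pkg.weight;
        if (!have_p50 && cumulative * 2 >= total_weight) {
            result.p50 = lowest_seen;
            have_p50 = true;
        }
        if (cumulative * 4 >= total_weight * 3) {
            result.p75 = lowest_seen;
            break;
        }
    }
    return result;
}

// Cache that serves a stored value for 30 seconds before it must be rebuilt.
// The clock reading is passed in so the behaviour is easy to test.
class CachedEstimates {
    static constexpr std::chrono::seconds CACHE_LIFE{30};
    mutable std::mutex m_mutex;
    std::optional<Percentiles> m_cached;
    std::chrono::steady_clock::time_point m_last_updated{};

public:
    std::optional<Percentiles> Get(std::chrono::steady_clock::time_point now) const
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_cached && now - m_last_updated < CACHE_LIFE) return m_cached;
        return std::nullopt; // missing or stale: caller should regenerate
    }

    void Update(const Percentiles& value, std::chrono::steady_clock::time_point now)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_cached = value;
        m_last_updated = now;
    }
};
```

A forecaster built this way would call `Get` first and only generate a new block template when it returns `std::nullopt`, which bounds the template-building work to roughly once per 30 seconds regardless of how often estimates are requested.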

3. Introducing FeeRateForecasterManager

  • FeeRateForecasterManager class: Serves as the central coordinator for all fee rate forecasters, maintaining shared pointers to registered forecasters (Commit df16b70).
  • Node Interface: The PR updates the node context to hold a unique pointer to FeeRateForecasterManager (Commit e8f5eb5).
  • Backward compatibility: Exposes a raw pointer to CBlockPolicyEstimator for compatibility with existing estimateSmartFee calls and related functions
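
The coordinating role of the manager can be sketched as follows. This is a simplified illustration, not the PR's implementation: `Result`, `FixedForecaster`, `Register`, and the method signatures are hypothetical, and the selection rule (return the available forecast with the lowest high-priority fee rate) is an assumption based on the review discussion. The sketch does not model the PR's rule of withholding a mempool estimate when CBlockPolicyEstimator lacks sufficient data.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <utility>
#include <vector>

struct Result { std::int64_t low_priority; std::int64_t high_priority; }; // sats/kvB

// Common interface every forecasting strategy implements.
class Forecaster {
public:
    virtual ~Forecaster() = default;
    // Returns std::nullopt when the strategy has no data for the target.
    virtual std::optional<Result> ForecastFeeRate(unsigned int target_blocks) const = 0;
};

// Stand-in strategy that always returns a fixed answer (for illustration only).
class FixedForecaster : public Forecaster {
    std::optional<Result> m_value;

public:
    explicit FixedForecaster(std::optional<Result> value) : m_value(std::move(value)) {}
    std::optional<Result> ForecastFeeRate(unsigned int) const override { return m_value; }
};

// Holds shared pointers to registered forecasters and polls each of them.
class ForecasterManager {
    std::vector<std::shared_ptr<Forecaster>> m_forecasters;

public:
    void Register(std::shared_ptr<Forecaster> forecaster)
    {
        m_forecasters.push_back(std::move(forecaster));
    }

    // Poll every forecaster and keep the result with the lowest
    // high-priority fee rate among those that returned data.
    std::optional<Result> ForecastFeeRate(unsigned int target_blocks) const
    {
        std::optional<Result> best;
        for (const auto& forecaster : m_forecasters) {
            const auto current = forecaster->ForecastFeeRate(target_blocks);
            if (!current) continue;
            if (!best || current->high_priority < best->high_priority) best = current;
        }
        return best;
    }
};
```

Because callers only see the `Forecaster` interface, adding another strategy amounts to registering one more object with the manager.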

4. Integration with CBlockPolicyEstimator

  • CBlockPolicyEstimator adaptation: Refactors the existing estimator to inherit from the Forecaster base class, adapting it to the new architecture while preserving existing functionality (Commit 9355da6).
  • Validation Interface: Now maintains a shared pointer to CBlockPolicyEstimator.

5. Files restructuring

  • The PR renames fees.{h,cpp} to block_policy_estimator.{h,cpp} to better reflect component responsibility (Commit 85dce07)
  • It also renames fees_args.{h,cpp} to block_policy_estimator_args.{h,cpp} for consistent terminology (Commit ec92584)
  • The PR renames policy_fee_tests.{h,cpp} to feerounder_tests.{h,cpp} to align with tested functionality (Commit 3d9a393)

Component Relationships

The architecture establishes a clear hierarchy:

  1. FeeRateForecasterManager sits at the top level, coordinating all fee rate forecasters.
  2. Both CBlockPolicyEstimator and MempoolForecaster implement the Forecaster interface, providing different approaches to fee rate forecasts.
                           ┌─────────────────┐
                           │FeeRateForecaster│
                           │    Manager      │
                           └───────┬─────────┘
                                   │ holds a shared_ptr copy of
                     ┌─────────────┴───────────────┐
                     │                             │
            ┌────────▼─────────┐         ┌─────────▼────────┐
            │CBlockPolicy      │         │Mempool           │
            │Estimator         │         │Forecaster        │
            └──────────────────┘         └──────────────────┘
            (Uses historical mempool data)    (Uses current mempool data)
  3. ForecastResult and ConfirmationTarget standardize the input and output formats across all forecasters.
  4. The node context maintains the lifecycle of these components, with ValidationInterface ensuring they receive blockchain updates.

Design Goals of #31664

  1. Creates a pluggable architecture for multiple fee estimation strategies.
  2. Creates the pathway for making fee estimation in Bitcoin Core both mempool-aware and package-aware.
  3. Maintains backward compatibility with existing code.

Previous work and discussions to dig deep:

Questions

Conceptual & Approach

  1. Did you review the PR? Concept ACK, approach ACK, tested ACK, or NACK? What was your review approach?
  2. Why is the new system called a “Forecaster” and “ForecasterManager” rather than an “Estimator” and “Fee Estimation Manager”?
  3. Why is CBlockPolicyEstimator not modified to hold the mempool reference, similar to the approach in PR #12966? What is the current approach and why is it better than holding a reference to mempool? (Hint: see #28368)
  4. What are the trade-offs between the new architecture and a direct modification of CBlockPolicyEstimator?

Code Review & Implementation

  1. Why does Commit 1e6ce06 compare against only the high-priority estimate?
  2. What other methods might be useful in the Forecaster interface (Commit a2e3326)?
  3. Why does Commit 143a301 return the current height, and where is nBestSeenHeight set in the code?
  4. Why is it important to maintain monotonicity when iterating through the package fee rates in Commit 61e2842?
  5. Why were the 75th and 50th percentile fee rates chosen as the MempoolForecaster fee rate estimates in Commit c7cdeaf?
  6. In what way do we also benefit from making CBlockPolicyEstimator a shared_ptr in (Commit e8f5eb514)?
  7. Why are MempoolForecaster estimates cached for 30 seconds? Could a different duration be better (Commit 5bd2220)?
  8. Should caching and locking be managed within MempoolForecaster instead of CachedMempoolEstimates? Why is CachedMempoolEstimates declared mutable in MempoolForecaster?
  9. Why does ForecasterManager avoid returning a mempool estimate when the CBlockPolicyEstimator lacks sufficient data (Commit c6b9440)? When will this scenario occur?

Meeting Log

17:00 <abubakarsadiq> #startmeeting
17:00 <glozow> hi
17:00 <dzxzg> hi
17:00 <Musa> hi
17:00 <sipa> hi
17:01 <abubakarsadiq> hi everyone, welcome to this edition of the bitcoin core PR review club.
17:01 <abubakarsadiq> thanks for joining
17:01 <abubakarsadiq> We are looking at Bitcoin core PR #31664 which I authored, lets dive in
17:01 <adys> hi
17:02 <monlovesmango> heyy
17:02 <effexzi> hello every1
17:02 <abubakarsadiq> who got the chance to review the PR or read the notes? (y/n)
17:02 <Musa> Yes, I was able to read the notes
17:03 <sliv3r__> y, not deep dive into the code but could check the notes and and took a fast look into the pr
17:03 <dzxzg> y (light)
17:03 <abubakarsadiq> Nice also feel free to ask question anytime, dont ask to ask :)
17:03 <oxfrank> y
17:04 <monlovesmango> y a bit
17:04 <abubakarsadiq> 1. Why is the new system called a “Forecaster” and “ForecasterManager” rather than an “Estimator” and “Fee Estimation Manager”?
17:04 <sliv3r__> I guess bc estimators rely on past information (blockchain info), forecaster uses data from the future
17:04 <monlovesmango> bc its trying to predict future fees?
17:05 <Musa> I think because it manages both historical data and current mempool data
17:05 <oxfrank> forecaster implies predicting future fee rates
17:06 <abubakarsadiq> yes IMO think estimator is a misnomer, The system predicts future outcomes based on current and past data. Unlike an estimator, which approximates present conditions with some randomization, a forecaster projects future events, which aligns with this system’s predictive nature and its output of uncertainty/risk levels.
17:06 <Musa> Which enables to predict future rates more accurate you both
17:08 <abubakarsadiq> 2. Why is CBlockPolicyEstimator not modified to hold the mempool reference, similar to the approach in PR #12966? What is the current approach and why is it better than holding a reference to mempool? (Hint: see #28368)
17:09 <abubakarsadiq> @sliv3r ForecasterManager also rely on past information.
17:11 <glozow> conceptually, `CBlockPolicyEstimator` doesn't really need to interact with mempool. it can get all the data it needs from the validation interface events
17:11 <abubakarsadiq> yes :100
17:11 <sliv3r__> @glozow bc it doesn't need to do any change on it you mean?
17:11 <abubakarsadiq> We had already refactored CBlockPolicyEstimator to not hold a mempool reference, instead updating through the validation interface.
17:12 <glozow> it doesn't need to make any changes to mempool certainly. but it doesn't even really need to know the mempool's contents
17:12 <monlovesmango> from some of the hints it seems that fee estimator was blocking mempool updates when txs were removed, which isnt ideal, especially if you are trying to improve fee estimation functionality
17:13 <glozow> that was the other direction - mempool used to also own the cblockpolicyestimator
17:13 <abubakarsadiq> Also I think it's cleaner to have a separate class (MempoolForecaster) to hold the mempool reference and generate mempool forecast.
17:13 <monlovesmango> glozow: ah ok!
17:14 <sliv3r__> @glozow oh right, only some data you extract from it (feerates in this case)
17:14 <glozow> but MempoolForecaster and CBlockPolicyEstimator don't talk to each other, do they?
17:14 <abubakarsadiq> They don't
17:15 <willcl-ark> hi
17:15 <abubakarsadiq> this is the reason for the design of the forecaster manager to separate concern
17:16 <abubakarsadiq> This brings us to the next question
17:16 <abubakarsadiq> 3. What are the trade-offs between the new architecture and a direct modification of CBlockPolicyEstimator?
17:17 <monlovesmango> direct modification would probably be easier short term, as you wouldn't need to alter the code where CBlockPolicyEstimator is called. but long term the new architecture allows for a lot more flexibility and upgradability.
17:18 <abubakarsadiq> yes the pro's of this is clean separation of logic, easy to plug in new forecasting strategies, modular & testable
17:18 <abubakarsadiq> what are the cons?
17:19 <abubakarsadiq> I think more code to maintain, slightly more complexity
17:19 <oxfrank> imo slightly more difficult to implement and minor possible performance overhead from additional abstraction
17:19 <monlovesmango> maintaining more methods of fee estimation? i'm actually not too sure why you wouldn't want to do this
17:20 <glozow> could be confusing for users who probably have no idea how the estimations work or which one is more suitable for them
17:21 <Leo82> Maintenance burden
17:21 <abubakarsadiq> yeah @willclark I think we discussed this, most users just want a value to use.
17:22 <abubakarsadiq> If we just add a new method to block policy estimator for getting mempool fee rate forecast it will be simple, fast to implement, reuses existing structure, minimal changes
17:22 <sliv3r__> @glozow I don't think that's a con. I guess a default value will be set and then users with more expertise will be able to choose depending on their needs
17:22 <glozow> a even having params is confusing - a lot of people were noticing overestimates, only 1 person realized that they could use "economical" instead of "conservative"
17:23 <glozow> (only 1 person that i know of, i mean)
17:23 <abubakarsadiq> Yes we should probably provide a value response and when you want verbose response you can get the information on which forecasting strategy was used and other details
17:24 <abubakarsadiq> Let's dive into the Code review section.
17:24 <abubakarsadiq> 1. Why does Commit 1e6ce06 compare against only the high-priority estimate?
17:25 <abubakarsadiq> @glozow I think AJ suggested it first, then a user also opened an issue about it.
17:25 <sliv3r__> This one I didn't get it. High priority will always be the higher number so the estimation with the biggest high-priority estimation will be the bigger one but we could also answer on the other way
17:25 <abubakarsadiq> yes @sliv3r it more than low priority
17:26 <abubakarsadiq> why is so for mempool forecaster? also why for block policy?
17:27 <monlovesmango> is it bc the high priority fee will usually have more difference in fee?
17:28 <monlovesmango> or said differently, that the high priority fee will usually be more differentiated?
17:28 <glozow> well what's the use case for having a comparison operator for `ForecastResult`s?
17:28 <abubakarsadiq> In Mempool forecaster, high priority and low priority are the 50th and 75th percentile fee rate of the generated block template
17:29 <abubakarsadiq> the use case is to compare fee rate forecast from block policy and mempool forecaster
17:30 <abubakarsadiq> In Block Policy the high and low priority are the conservative and economical fee rate estimate for a confirmation target.
17:30 <glozow> and do you want to compare them based on how much money it might cost the user?
17:31 <sliv3r__> what's the use case of that comparison between forecasters?
17:31 <monlovesmango> yes but why choose to compare the high priority fee as opposed to the low priority fee?
17:31 <glozow> presumably because you want to choose 1 to return to the user
17:31 <monlovesmango> what happens when user is wanting a low priority fee, how will this comparision be helpful?
17:31 <abubakarsadiq> @sliv3r see https://github.com/bitcoin/bitcoin/pull/31664#discussion_r2033340253
17:32 <abubakarsadiq> @monlovesmango because it is higher, you dont want to confirm with the lower value
17:33 <abubakarsadiq> compare*
17:33 <abubakarsadiq> 2. What other methods might be useful in the Forecaster interface (Commit a2e3326)?
17:34 <monlovesmango> i don't think i'm understanding. if a user wants a low priority fee, wouldn't we want to compare the low priority fee estimates from the two methods instead of the high priority fee estimates from the two methods?
17:35 <abubakarsadiq> @glozow yes you do. (you just want to know that what your mempool is suggesting for you to pay for high priority transactions is not higher than conservative fee rate estimate from block plicy)
17:35 <monlovesmango> might be useful to be able to specify using only one method to estimate
17:35 <sliv3r__> @abubakarsadiq fast checked, makes more sense with this :) so it's a protect mechanism against mempool manipulation
17:36 <abubakarsadiq> @monlovesmango good point.
17:37 <sliv3r__> but this can be a problem for example on LN. If you understimate during a spike when you have, let's say, 1 block time before the attacker can spend their coins (thus steal from you), you may not be able to get the penalty transaction mined on time
17:37 <sliv3r__> (The attack here would be to broadcast an old state)
17:39 <sliv3r__> I guess a solution to this is just give the option to specify that you want the higher fee-rate estimation and overpaid it a bit.
17:40 <abubakarsadiq> Yes I think LN has a better solution and than getting the state of the mempool when the deadline is that close, there is a great solution I like called deadline aware budget sweeper
17:40 <abubakarsadiq> https://delvingbitcoin.org/t/lnds-deadline-aware-budget-sweeper/1512
17:41 <sliv3r__> will take a look, thx
17:42 <abubakarsadiq> The aim is for solution like the one in the delving post below to call this fee rate forecaster with 1,2 confirmation target, if the mempool is sparse; this solotion will prevent them from paying more than necessary. if a block elapsed and it did not confirm they can compare the new 1,2 confirmation target with the fee function output and use the highest
17:42 <abubakarsadiq> https://delvingbitcoin.org/t/lnds-deadline-aware-budget-sweeper/1512/5?u=ismaelsadeeq
17:44 <abubakarsadiq> The mempool forecaster will be used to correct block policy feerate forecaster and prevent you from paying more than necessary after having rough certainty that your mempool is in sync with miners
17:44 <abubakarsadiq> This has been suggested a long time ago by Kalle Alm https://delvingbitcoin.org/t/mempool-based-fee-estimation-on-bitcoin-core/703/2?u=ismaelsadeeq
17:45 <abubakarsadiq> 3. Why does Commit 143a301 return the current height, and where is `nBestSeenHeight` set in the code?
17:47 <monlovesmango> I think bc it is part of the ForcastResult?
17:47 <sliv3r__> sorry I'm 1 question late :) - regarding 2. A reset function could be usefull if we have for differents things like benchmarking if we have some stateful model that takes into account history (e.g the cache)
17:48 <abubakarsadiq> 2. I also think `GetMaxTarget`: A method that returns the maximum confirmation target a forecaster can predict.
17:49 <abubakarsadiq> @sliv3r__ I don't follow
17:50 <abubakarsadiq> monlovesmango: Returning the current height can be helpful debugging and validation of forecast accuracy, helping us track at which height the forecast was made and assess effectiveness before the target elapsed.
17:50 <monlovesmango> it is set in https://github.com/bitcoin/bitcoin/blob/bb92bb36f211b88e4c1aa031a4364795cbd24767/src/policy/fees.cpp#L684..?
17:51 <abubakarsadiq> monlovesmango: YES
17:51 <monlovesmango> abubakaras: that makes sense
17:52 <abubakarsadiq> After connecting new block we update the current height, because new block was connected
17:52 <sliv3r__> @abubakarsadiq: we may have forecasters with "historic" data on memory. (E.g the cache). A reset function that returns it to the initial state can be usefull for testing and benchmarking. The initial idea on this was for chain reorgs but I think that in that case is not that important and just estimating again the fee would be good
17:52 <abubakarsadiq> monlovesmango: what if a reorg happen, how will you correct that?
17:52 <abubakarsadiq> 4. Why is it important to maintain monotonicity when iterating through the package fee rates in (Commit 61e2842)?
17:53 <sliv3r__> 4. For each percentile we should have a feerate equal or higher than the next one. So `90 >= 75 >= 50...`. If we choose based on priority we cannot let high priority pay less fees than low priority.
17:54 <monlovesmango> 4. bc the accumulated weight is somewhat disconnected from each percentage's weight, so ordering is very important
17:55 <abubakarsadiq> @liv3r__ how will the interface consumer; the forecaster manager benefit from getting the previous state?
17:56 <abubakarsadiq> sliv3r__: first part of your answer is correct; the reason is because the packages fee rate are not monotonically increasing, why?
17:58 <sliv3r__> @abubakarsadiq: not getting but reseting to default (0 or empty) value. For testing for example you may want to be able to call it multiple times fast without taking into account the cache.
17:59 <monlovesmango> abubakarsa: about the reorg, no idea. I would assume we would want to queue a reevaluation of any cached esitmates at the time of a reorg?
18:00 <abubakarsadiq> A transaction chunk fee rate may increase because it's parent was included previously in a sibling package. Hence the package fee rates of a block template are not monotonic; thanks to @sipa and @murch for pointing that out to me.
18:01 <monlovesmango> they may not be monotonically increasing bc " Outliers can occur when the mining score of a transaction increases due to the inclusion of its ancestors in different transaction packages."
18:01 <abubakarsadiq> monlovesmango: yes 💯
18:01 <sliv3r__> @abubakarsadiq do you have a visual example of that?
18:02 <monlovesmango> ok thanks for explaning that further! I saw the comment (which I quoted) but wanted to ask what that meant further
18:04 <abubakarsadiq> sliv3r__: Good point; I think it's important for us not to return the cached data in the case of reorg
18:04 <abubakarsadiq> #endmeeting