
Conversation

@czareko
Collaborator

@czareko czareko commented Jan 29, 2026

No description provided.

@czareko czareko marked this pull request as ready for review January 29, 2026 08:26
@czareko czareko changed the title feat: HS for Multisig feat: High-Security Integration for Multisig Pallet Jan 29, 2026
@n13
Collaborator

n13 commented Jan 29, 2026

Gemini Review:
Architecture: The pallet-multisig now has a hard dependency on pallet-reversible-transfers for the HighSecurityInspector trait. I recommended inverting this dependency by defining the trait within pallet-multisig to keep the pallet generic.

Security: The implementation includes good defense-in-depth measures, such as checking call size before decoding. The documentation regarding the "Risk Window" during migration is clear and helpful.

Code Quality: The refactoring in transaction_extensions.rs to share the whitelist logic with the multisig configuration is excellent.

1. Architecture & Dependency Management

Observation: pallet-multisig now depends on pallet-reversible-transfers to access the HighSecurityInspector trait.

  • pallets/multisig/Cargo.toml adds pallet-reversible-transfers.
  • pallets/multisig/src/lib.rs imports the trait from pallet-reversible-transfers.

Recommendation: Invert this dependency to keep pallet-multisig generic and decoupled from specific business logic pallets.

  • Action: Define the HighSecurityInspector trait directly inside pallet-multisig (e.g., in traits.rs or lib.rs).
  • Benefit: pallet-multisig will no longer depend on pallet-reversible-transfers. The Runtime will bridge the two by implementing the trait (as it already does in runtime/src/configs/mod.rs), but the trait definition will live in the consumer (multisig) or a shared primitives crate, not the provider (reversible-transfers).

It kinda makes sense ^^^
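The inversion Gemini recommends could be sketched roughly as below. This is a hypothetical sketch, not the PR's actual code: the method names (is_high_security, is_whitelisted) are taken from later comments in this thread, but the exact signatures and the () opt-out shape are assumptions.

```rust
// Hypothetical sketch: the trait lives in the consumer (pallet-multisig),
// and the runtime bridges it to pallet-reversible-transfers. Signatures
// are assumed, not copied from the PR.
pub trait HighSecurityInspector<AccountId, Call> {
    fn is_high_security(who: &AccountId) -> bool;
    fn is_whitelisted(call: &Call) -> bool;
}

// Opt-out default: no account is high-security, so the whitelist never bites.
impl<AccountId, Call> HighSecurityInspector<AccountId, Call> for () {
    fn is_high_security(_who: &AccountId) -> bool {
        false
    }
    fn is_whitelisted(_call: &Call) -> bool {
        true
    }
}
```

With this shape, pallet-multisig compiles without any reference to pallet-reversible-transfers; only the runtime's impl of the trait knows about the high-security storage.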

Contributor

@ethan-crypto ethan-crypto left a comment

One issue I noticed is that in the benchmarks we aren't actually computing the worst-case for the cleanup process because we aren't setting the call data to the max size for the expired proposals. This could lead to underweighting.

@czareko czareko marked this pull request as draft February 3, 2026 08:34
@czareko czareko marked this pull request as ready for review February 6, 2026 01:36
Contributor

@ethan-crypto ethan-crypto left a comment

  • Right now we aren't making effective use of the r regression coefficient. I suggested some changes that account for this by defining cleaned_target = i.min(r), which cleans only a subset of the total proposals, resulting in more accurate weighting and a greater refund for proposal calls.

  • The cleanup path remains expensive per proposal as long as the call lives inside the same storage value. We should make the cleanup iteration cost depend only on the small fields we care about. To fix this we could go with a split-storage approach:

 ProposalMeta (small: proposer/status/expiry/deposit/…) used by cleanup iteration

 ProposalCall (big) touched only on execute / remove

OR store the call as a preimage / hash (call_hash, call_len) and require the bytes only when executing.
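The split-storage idea can be illustrated with plain in-memory maps (a hypothetical sketch only; real pallet code would use StorageMap with bounded types, and the field names here are assumptions):

```rust
use std::collections::BTreeMap;

// Small metadata record: everything cleanup needs, no call bytes.
struct ProposalMeta {
    proposer: u64,
    expiry: u64,
    deposit: u128,
}

struct Store {
    meta: BTreeMap<u32, ProposalMeta>, // small, iterated by cleanup
    call: BTreeMap<u32, Vec<u8>>,      // big, touched only on execute / remove
}

impl Store {
    // Remove expired proposals by scanning metadata only; the large call
    // blobs are deleted by key without ever being loaded or decoded.
    fn cleanup_expired(&mut self, now: u64) -> u32 {
        let expired: Vec<u32> = self
            .meta
            .iter()
            .filter(|(_, m)| m.expiry <= now)
            .map(|(id, _)| *id)
            .collect();
        for id in &expired {
            self.meta.remove(id);
            self.call.remove(id);
        }
        expired.len() as u32
    }
}
```

The point of the split is visible in cleanup_expired: its per-item cost depends only on the size of ProposalMeta, so the weight of cleanup no longer scales with stored call length.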

@n13
Collaborator

n13 commented Feb 8, 2026

Proposal meta is a really good optimization @ethan-crypto

Basically we don't have to decode the call just to find the expiry date.

There are many ways to do this.

@czareko
Collaborator Author

czareko commented Feb 10, 2026

Finally, we decided to simplify the scope again, so benchmarks and weights are less sophisticated now.

@czareko czareko requested a review from ethan-crypto February 10, 2026 06:21
@n13
Collaborator

n13 commented Feb 10, 2026

Here's Claude 4.6's review of PR #365: High-Security Integration for Multisig Pallet.


Summary

This PR introduces a HighSecurityInspector trait in the reversible-transfers pallet and integrates it into the multisig propose() flow and transaction extensions. The goal is to enforce a call whitelist for high-security accounts (both EOAs and multisigs) from a single source of truth. Nice DRY improvement!


Architecture & Design (Looks Good)

The trait-based approach is clean:

  • Trait defined in pallet-reversible-transfers (storage owner)
  • Implemented in runtime (HighSecurityConfig) with access to RuntimeCall
  • Consumed by both multisig pallet and transaction extensions

The () default implementation is a nice touch — pallets can opt out of high-security support trivially.


Issues Found

1. Duplicate Call Size Check (Minor, DRY Violation)

The diff adds an early size check:

+ let call_size = call.len() as u32;
+ ensure!(call_size <= T::MaxCallSize::get(), Error::<T>::CallTooLarge);

But the original check at line 545 of lib.rs still exists:

			// Check call size
			ensure!(call.len() as u32 <= T::MaxCallSize::get(), Error::<T>::CallTooLarge);

The call is now validated twice. The early check is correctly motivated (prevent decode of oversized payloads), but the original should be removed.


2. Weight Anomaly — propose Base 10x Higher Than propose_high_security

From the generated weights:

fn propose(_c, e) -> Weight {
    Weight::from_parts(800_288_659, ...)   // ~800µs base
}

fn propose_high_security(_c, e) -> Weight {
    Weight::from_parts(77_592_134, ...)    // ~78µs base
}

This is counterintuitive — the high-security path (which adds decode + whitelist check) should be more expensive, not 10x cheaper. The old propose base was ~140M. Jumping to 800M suggests a benchmarking environment issue (noisy machine, cold caches, etc.). I'd recommend re-running benchmarks in a clean environment, as this directly affects fees charged to users.


3. _c (Call Size) Unused in propose_high_security Weight

+ fn propose_high_security(_c: u32, e: u32, ) -> Weight {

The call is decoded in the high-security path, which is O(n) in call size. The benchmark should capture this scaling, yet _c is unused (the benchmark framework determined it was statistically insignificant). This might undercharge large calls in the high-security path. Worth verifying with a wider range of c values, or explicitly adding a per-byte component as a safety margin.
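One way to add such a safety margin is an explicit per-byte component on top of the benchmarked base. This is a hypothetical sketch: PER_BYTE_REF_TIME is an assumed placeholder value, not a benchmarked number, and real Substrate weights would use the Weight type rather than a bare u64.

```rust
// Benchmarked base ref_time for propose_high_security (from the PR's weights).
const BASE_REF_TIME: u64 = 77_592_134;
// Assumed per-byte safety margin; NOT benchmarked, illustration only.
const PER_BYTE_REF_TIME: u64 = 1_000;

// Charge the base plus a linear term in call size `c`, so large calls in
// the high-security decode path are never undercharged even if the
// benchmark regression deemed `c` statistically insignificant.
fn propose_high_security_ref_time(c: u32) -> u64 {
    BASE_REF_TIME.saturating_add(PER_BYTE_REF_TIME.saturating_mul(c as u64))
}
```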


4. MockHighSecurity Defined in tests.rs, Referenced from mock.rs

+ type HighSecurity = crate::tests::MockHighSecurity;

This creates a mock → tests dependency, which is the reverse of the typical pattern (tests → mock). It works because both are #[cfg(test)], but it's fragile — if someone adds a non-test consumer of the mock, it would break. Consider moving MockHighSecurity into mock.rs instead.


5. Comment Verbosity

Given the project rule of minimal comments, the PR adds quite a few — especially in the benchmarking file (20+ lines of comments for a single benchmark). The code is fairly self-explanatory; trimming these would align with project conventions.


6. Unused guardian() in Multisig Context

The HighSecurityInspector trait includes fn guardian(...), but the multisig pallet only uses is_high_security and is_whitelisted. The trait forces implementors (including the () default and mock) to implement guardian() without a consumer. If it's for future use, that's fine — just noting it adds surface area with no current consumer in this PR.


Looks Good

  • Early call-size check before decode — smart DoS protection
  • Weight overcharge + refund pattern via DispatchResultWithPostInfo — correct approach
  • auto_cleanup_expired_proposals return value captured for actual weight calculation
  • Transaction extension refactored to use the shared HighSecurityConfig — eliminates duplicated whitelist logic
  • Benchmark for propose_high_security — proper setup with set_high_security_for_benchmarking
  • Whitelist is tight: only schedule_transfer, schedule_asset_transfer, and cancel
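The tight whitelist could look roughly like the following in the runtime. This is a hypothetical sketch: a flat enum stands in for the runtime's real nested RuntimeCall, and the variant names are taken from the call names listed above.

```rust
// Stand-in for the runtime's outer call enum (the real RuntimeCall nests
// per-pallet call enums; this flat version is for illustration only).
enum RuntimeCall {
    ScheduleTransfer,
    ScheduleAssetTransfer,
    Cancel,
    Other,
}

// Only the three reversible-transfer operations pass; everything else is
// rejected for high-security accounts.
fn is_whitelisted(call: &RuntimeCall) -> bool {
    matches!(
        call,
        RuntimeCall::ScheduleTransfer
            | RuntimeCall::ScheduleAssetTransfer
            | RuntimeCall::Cancel
    )
}
```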

Verdict

The architecture and security logic are sound. The main actionable items are:

| # | Severity | Issue |
|---|----------|-------|
| 1 | Low | Remove duplicate call size check |
| 2 | Medium | Re-run benchmarks; propose base weight looks anomalous |
| 3 | Low | Verify _c scaling for high-security decode path |
| 4 | Low | Move MockHighSecurity to mock.rs |
| 5 | Nit | Trim verbose comments |

Item #2 is the most impactful since it directly affects user fees. The rest are cleanup. Overall a solid PR — the unified HighSecurityInspector abstraction is a welcome improvement.

@n13
Collaborator

n13 commented Feb 10, 2026

Gemini

PR Review: High-Security Integration for Multisig Pallet

Summary

This PR integrates High-Security (HS) mode into the Multisig pallet via a new HighSecurityInspector trait. The architecture allows for flexible opt-in of HS checks (decode + whitelist) for specific accounts.

Critical Issues (Must Fix)

1. Weight Anomaly (Performance/Economics)

Severity: High
The generated weights show a massive discrepancy:

  • propose (standard): ~800,000,000 ps (~800µs)
  • propose_high_security: ~77,000,000 ps (~77µs)

The HS path does more work (decode + whitelist), so it should be more expensive. The 10x lower weight for HS suggests a flawed benchmarking environment or regression. This must be re-benchmarked to avoid undercharging HS calls or overcharging standard ones.

2. Unused c Parameter in HS Weight

Severity: Medium
The propose_high_security weight function (weights.rs) ignores the c (call size) component. Since HS requires decoding the call (an O(c) operation), the weight must scale with c. Ignoring it risks DoS vectors where large calls are processed cheaply.

Code Quality & Best Practices

3. Duplicate Call Size Check

Severity: Low
The PR adds an early call.len() check in propose (correctly, for DoS protection). However, the original check at line 545 (ensure!(call.len()...)) was not removed.

  • Action: Remove the redundant check at line 545.

4. Circular Test Dependency

Severity: Low
MockHighSecurity is defined in tests.rs but used in mock.rs. This inverts the standard dependency chain.

  • Action: Move MockHighSecurity to mock.rs.

5. Excessive Comments

Severity: Nit
benchmarking.rs contains large blocks of explanatory comments (20+ lines).

  • Action: Reduce comments to align with the "minimal comments" project rule.

6. Unused Trait Method

Severity: Nit
The HighSecurityInspector trait includes guardian(), which is unimplemented (returns None or dummy) and unused by the Multisig pallet.

  • Action: Remove guardian() if not immediately needed, or document it as future-proofing.

Architecture (Good)

  • Trait-based Abstraction: HighSecurityInspector correctly decouples the policy from the mechanism.
  • DoS Protection: The early call_size check in propose is a good defense-in-depth measure (despite the duplication issue).
  • Runtime Whitelist: Restricting HS multisigs to ReversibleTransfers calls is a sound security policy.

Conclusion

The architecture is solid, but the benchmarking/weight generation is flawed and needs immediate attention before merge. The code cleanup items are straightforward.

Contributor

@ethan-crypto ethan-crypto left a comment

Overall I really like the new design introducing the execute method and confining all clean up operations to extrinsic calls. However there are still some places where underweighting can occur:

  • cancel, remove_expired and claim_deposits are all benchmarked with a small fixed call size. This will inevitably lead to underweighting. We should parameterize the cancel weight by stored call length, as we do with the other methods that touch proposal storage. For claim_deposits we would need to compute the average call data size over the set of proposals in storage.

  • Even though the approve_dissolve weight function prices it as constant, it has two different execution paths: reached and not reached. We should adopt a pattern that accounts for this, as we did with propose: (1) charge worst-case up front, (2) refund to the cheaper path via actual_weight, and (3) benchmark both paths.

  • Even though the approvals field is much cheaper to decode than call, the decode cost still scales with the approvals length. The benchmarks currently don't parameterize the approvals length and don't cover the worst case (100 approvals), so this could lead to minor underweighting unless we fix approvals at 100 or parameterize.
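The charge-then-refund pattern described in the second bullet can be sketched as follows. This is a hypothetical illustration in plain Rust: a simple struct stands in for Substrate's PostDispatchInfo, and the weights are bare integers rather than the Weight type.

```rust
// Stand-in for PostDispatchInfo: None means "charge the declared worst
// case"; Some(w) reports the weight actually consumed so the fee
// difference is refunded to the caller.
struct PostInfo {
    actual_weight: Option<u64>,
}

// Both execution paths of a hypothetical approve_dissolve: the extrinsic
// declares the worst-case weight up front, then the cheap path refunds.
fn approve_dissolve(threshold_reached: bool, worst_case: u64, cheap_path: u64) -> PostInfo {
    if threshold_reached {
        // Expensive path (dissolution executes): keep the full pre-charge.
        PostInfo { actual_weight: None }
    } else {
        // Cheap path (just record the approval): report actual usage.
        PostInfo {
            actual_weight: Some(cheap_path.min(worst_case)),
        }
    }
}
```

Benchmarking both branches, as the comment suggests, gives the two constants this sketch takes as parameters.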
