Adding notion of a recovery owner for network recovery #6705

gaurav137 · 2024-12-13T08:11:58Z

Adds support for a member that can act as a recovery owner and is assigned the full recovery share. Such a member can recover the network single-handedly, without requiring any other recovery member to submit a recovery share.

This capability will be used in the confidential clean room scenario, where a confidential recovery service is created and can recover the CCF network on its own using a recovery owner identity. Only if the confidential recovery service is unable to perform recovery do the recovery members come together and perform a break-glass recovery by submitting the threshold number of recovery shares.

Changes:

- Added an optional recovery_role enum to MemberDetails to indicate whether a recovery member is a recovery owner.
- Added support for setting this role both via the cchost config file at network creation and via the set_member proposal.
- Recovery owners do not count towards the recovery threshold calculation, so the affected code paths were updated accordingly.
- Added test cases that recover a network using the recovery owner.
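The recovery behaviour described above can be illustrated with a minimal sketch. This is not the code in this PR: `RecoveryRole`, `SubmittedShare` and `can_recover` are hypothetical names, and the sketch assumes each member submits at most one share.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model; names do not mirror the real CCF types.
enum class RecoveryRole
{
  Participant, // holds one share that counts towards the threshold
  Owner // holds the full recovery share
};

struct SubmittedShare
{
  std::string member_id;
  RecoveryRole role;
};

// Returns true once the submitted shares are enough to recover the network:
// either a single owner share, or at least `threshold` participant shares.
// Assumes each member appears at most once in `submitted`.
inline bool can_recover(
  const std::vector<SubmittedShare>& submitted, std::size_t threshold)
{
  std::size_t participant_shares = 0;
  for (const auto& share : submitted)
  {
    if (share.role == RecoveryRole::Owner)
    {
      return true; // the full share is sufficient on its own
    }
    ++participant_shares;
  }
  return threshold > 0 && participant_shares >= threshold;
}
```

With a threshold of 2, a single owner share is enough, while a single participant share is not; two participant shares cover the break-glass path.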


achamayou · 2024-12-13T10:57:08Z

src/host/main.cpp

+      // owners as only these members will be handed a recovery share
+      // that accrues towards the recovery threshold.
+      // Note that it is acceptable to start a network without any member
+      // having a recovery share. The service will check that at least one


I believe we support opening deliberately unrecoverable services, and although I am not aware of current use cases, they have come up as potential use cases in the past, so I think we want to leave that open as a possibility.

@achamayou I'm not sure which lines you wanted me to change here. I added (a) a check that throws if recovery_owner has a value but enc_pub_key does not, and (b) logic that computes members_with_pubk_count while skipping the recovery_owner members.
(a) guards against a configuration error, while (b) only preserves the correctness of the existing check that the number of recovery members and the supplied (or default calculated) recovery threshold are sane; if they are not, the logic below already throws.
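To make (a) and (b) concrete, here is a rough sketch of the two checks. It is not the actual main.cpp code: `MemberConfig`, `count_threshold_members` and `resolve_recovery_threshold` are hypothetical names, and defaulting an unset threshold to the member count is an assumption. It deliberately does not reject a count of zero, since whether that should be allowed is the open question in this thread.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified view of a member entry in the start configuration.
struct MemberConfig
{
  std::string certificate_file;
  std::optional<std::string> encryption_public_key_file;
  bool is_recovery_owner = false;
};

// (a) Reject owners without an encryption key: they could never be handed a
//     share.
// (b) Count only key-holding members that are not owners, since only their
//     shares accrue towards the recovery threshold.
inline std::size_t count_threshold_members(
  const std::vector<MemberConfig>& members)
{
  std::size_t members_with_pubk_count = 0;
  for (const auto& m : members)
  {
    const bool has_key = m.encryption_public_key_file.has_value();
    if (m.is_recovery_owner && !has_key)
    {
      throw std::logic_error(
        "Recovery owner " + m.certificate_file +
        " has no encryption public key");
    }
    if (has_key && !m.is_recovery_owner)
    {
      ++members_with_pubk_count;
    }
  }
  return members_with_pubk_count;
}

// Assumed behaviour: an unset threshold defaults to the number of counted
// members; a configured threshold above that number is rejected.
inline std::size_t resolve_recovery_threshold(
  std::optional<std::size_t> configured, std::size_t members_with_pubk_count)
{
  if (!configured.has_value())
  {
    return members_with_pubk_count;
  }
  if (configured.value() > members_with_pubk_count)
  {
    throw std::logic_error(
      "Recovery threshold exceeds the number of recovery participants");
  }
  return configured.value();
}
```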

"The service will check that at least one..."
^ I don't believe this is true now, and I don't want it to become true, because it precludes creating un-recoverable systems, which we think may be desirable in some cases.

This is something that an operator can quite trivially preclude by modifying the transition_service_to_open() transition if they wish to do so; there is no reason to hardcode it outside the constitution.

@achamayou "The service will check that at least one...": this check, that there must be at least one recovery member (aka participant) and that the recovery threshold cannot exceed that number, existed before this PR and continues to work today. So as of now you cannot open a service that has 0 recovery participants. I preserved the check in main.cpp using members_with_pubk_count, and the checks in the internal tables access code (set_recovery_threshold, remove_member and open_service) continue to work as before. Having owners has not changed the checks around the recovery threshold and recovery members (participants).

achamayou · 2024-12-13T11:01:25Z

src/node/share_manager.h

-      std::copy_n(secret.begin(), secret.size(), data.begin());
-      OPENSSL_cleanse(secret.data(), secret.size());
+      // TBD (gsinha): How to set ccf::crypto::sharing::Share secret member
+      // variable here?


It's data (the raw secret) prefixed with an index of 0. If we do that, we might as well drop data altogether and avoid keeping the secret twice.

I wasn't sure how to instantiate a Share instance using the constructor inputs. If we can do that, wouldn't the wrap/unwrap methods also need to create a temporary secret.key(size) instance to replace the current use of data in those methods?

There is no way to do that in the current API. It's lines 153 to 162 of sharing.cpp, but it's deliberately not exposed, and the sampling is combined with the sharing to avoid sharing unsafe keys or unsafe sizes.

I think that is fine, because re-constructed keys never need to compute a zero share, and so all objects created with this ctor should never need this member attribute. This is why I was suggesting splitting this into two distinct types: SharedLedgerSecretWrappingKey and ReconstructedLedgerSecretWrappingKey.
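A rough sketch of what that split could look like, purely for illustration. The two class names come from the comment above; everything else (the `Bytes` alias, the constructors, `key_material()`, `full_share()`, and the one-byte index encoding) is an assumption and does not reflect the real serialisation or the eventual refactoring. It also returns the full share in plaintext only to keep the example small.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Key sampled when shares are issued. Only this type can produce the full
// ("zero") share.
class SharedLedgerSecretWrappingKey
{
  Bytes secret; // stand-in for the sampled sharing secret

public:
  explicit SharedLedgerSecretWrappingKey(Bytes sampled_secret) :
    secret(std::move(sampled_secret))
  {}

  const Bytes& key_material() const
  {
    return secret;
  }

  // Hypothetical encoding: a one-byte index 0 followed by the raw secret.
  Bytes full_share() const
  {
    Bytes out;
    out.reserve(secret.size() + 1);
    out.push_back(0);
    out.insert(out.end(), secret.begin(), secret.end());
    return out;
  }
};

// Key rebuilt from submitted shares during recovery. It deliberately offers
// no way to compute shares, so it never needs the sharing secret.
class ReconstructedLedgerSecretWrappingKey
{
  Bytes key;

public:
  explicit ReconstructedLedgerSecretWrappingKey(Bytes reconstructed) :
    key(std::move(reconstructed))
  {}

  const Bytes& key_material() const
  {
    return key;
  }
};
```

The point of the split is that calling `full_share()` on a reconstructed key becomes a compile error rather than a runtime concern.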

Ok @achamayou, can we take this comment when we get to the SharedLedgerSecretWrappingKey/ReconstructedLedgerSecretWrappingKey refactoring work?

achamayou · 2024-12-13T11:18:24Z

@gaurav137 in member_frontend.h, on line 602, there is a version number for the OpenAPI generation. If you bump it and rebuild, the next run of the schema test will fail but produce the updated schema. You can check that in, and the run after that should pass (it's just updating a golden file).

Co-authored-by: Amaury Chamayou <[email protected]>

achamayou · 2024-12-17T14:29:43Z

src/node/share_manager.h

@@ -96,6 +110,11 @@ namespace ccf
      return ret;
    }

+    std::vector<uint8_t> get_full_share_serialised() const


This is a problem because it will leave a vector with the full share on the stack. We need to call it with a member id instead, and encrypt (same as wrap()) so as to only return encrypted info.

I think it would be better to modify wrap() to return a WrappedShare, with a map of member_id to wrapped_shared, all encrypted.

@achamayou I'm not clear on exactly how to make this change. Perhaps you could show how?
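One possible shape for this, as a sketch rather than the actual CCF API: the key object keeps the full share private and only ever hands out per-member ciphertexts. `WrappingKeySketch`, `Recipient` and `EncryptForMember` are hypothetical names, and the encryption itself is left as a caller-supplied callback because the real key-wrapping call is not shown in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Bytes = std::vector<uint8_t>;
using MemberId = std::string;

// Hypothetical callback: encrypts `plaintext` to the given member's public
// encryption key and returns the ciphertext.
using EncryptForMember = std::function<Bytes(const MemberId&, const Bytes&)>;

struct Recipient
{
  MemberId id;
  bool is_owner;
};

class WrappingKeySketch
{
  Bytes full_share; // never returned in plaintext
  std::vector<Bytes> shares; // one per recovery participant

public:
  WrappingKeySketch(Bytes full, std::vector<Bytes> parts) :
    full_share(std::move(full)),
    shares(std::move(parts))
  {}

  // Returns member_id -> encrypted share. Owners receive the encrypted full
  // share; participants each receive one encrypted threshold share.
  std::map<MemberId, Bytes> wrap(
    const std::vector<Recipient>& recipients,
    const EncryptForMember& encrypt) const
  {
    std::map<MemberId, Bytes> wrapped;
    std::size_t next_share = 0;
    for (const auto& r : recipients)
    {
      if (r.is_owner)
      {
        wrapped[r.id] = encrypt(r.id, full_share);
        continue;
      }
      if (next_share >= shares.size())
      {
        throw std::logic_error("More participants than available shares");
      }
      wrapped[r.id] = encrypt(r.id, shares[next_share++]);
    }
    return wrapped;
  }
};
```

Because there is no accessor for `full_share`, callers can only obtain it in encrypted form, keyed by the member it was encrypted for.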

gaurav137 added 17 commits December 11, 2024 06:34

Update

07b152f

Update

7cd6011

Update

580670c

Update

4df08bd

Update

491aed2

Update

41e16fd

Update

90cbdfb

Update

1c1bda9

Update

2f7ff7f

working

7d0fe02

Update

607b8a4

Update

986138a

Update

0ae745c

Update

33b736b

Update

96838b7

Update

7104982

Update

7639bf9

gaurav137 requested a review from a team as a code owner December 13, 2024 08:11

gaurav137 added 2 commits December 13, 2024 08:23

Update

8bd2d95

Update

e5021f1

gaurav137 changed the title ~~Adding notion of a recovery owner notion for network recovery~~ Adding notion of a recovery owner for network recovery Dec 13, 2024