Primitive based interface #313

PhilippGackstatter · 2022-02-02T16:30:57Z

Feature Name: primitive-based-interface
Start Date: 2021-12-08

Summary

Implement an interface that is based on stronghold primitives on a type level, allows concurrent usage and requires less password management.

Motivation

The identity.rs and wallet.rs crates wrap stronghold for similar reasons. API-level ideas from these wrappers can flow back into the stronghold interface itself to reduce the amount of required wrapping. These wrappers primarily solve three issues with the current approach:

One important reason is to implement password management. Password management refers to the ability of users to provide the password once in a session and then not having to provide it again for further snapshot reads. However, in the wrappers this is currently done relatively insecurely, as passwords are stored in unguarded memory. The interface cannot fix this problem unless architectural changes are made, which is among the topics of other RFCs (such as Feature/implementation #2). This RFC makes the assumption that password management will be solved within stronghold and users do not have to provide the password repeatedly.
Another reason is to allow for race-condition-free usage of the interface. In the current interface, only one actor can be set as the current target for operations. Users who want to access stronghold with multiple clients concurrently need to wrap the Stronghold in a mutex, to ensure that switching the target and the subsequent operation happen atomically. If mutually exclusive access is not guaranteed, another concurrent caller may switch the target to some other actor between the two steps. Whether the actor system is removed or not, the new architecture should allow for concurrent usage, and the interface should expose that capability.
From a user perspective, there are multiple primitives to learn about and work with on a conceptual level: snapshots, clients, vaults, records and stores. Currently, the interface is made up of a single Stronghold type, which exposes all of these primitives. Using Rust's type system, these primitives could be exposed more directly, making it easier to map stronghold concepts onto the interface. It should also become easier to do multiple operations on a nested primitive, like a vault or store. Overall, this should make the interface easier to reason about because the mental model of stronghold maps more directly to the interface, and it should be easier to learn the interface in an exploratory manner.

Guide-level explanation

Introduce a type in the interface for each primitive mentioned above, so there is a (roughly) one-to-one relationship between the mental model of stronghold primitives and the types in the interface.

Local Example

async fn stronghold_interface_example() -> Result<(), Box<dyn std::error::Error>> {
    let vault_path: &[u8] = b"vault";
    let client_path: &[u8] = b"client";
    let key_data: Vec<u8> = [0xff; 32].to_vec();
    let store_loc: &[u8] = b"store-key";

    // Create a new stronghold with no clients or snapshot loaded.
    let stronghold: Stronghold = Stronghold::new();

    // Representation of a location of a snapshot. Can be a file path like here, or extended
    // in the future with more storage implementations. Serves as the identifier of a snapshot in `Stronghold`.
    let location: SnapshotLocation = SnapshotLocation::path("/path/to/file.stronghold");
    let key_provider: KeyProvider = KeyProvider::from_key("some-secure-key");

    // Retrieves the snapshot if it was already loaded, otherwise loads it.
    let snapshot: &Snapshot = stronghold.load_snapshot(location, key_provider).await?;

    let client: Client = stronghold.load_client(client_path);

    // Load the client's state from the snapshot.
    client.load_state(snapshot)?;

    // Interact with the store.
    let store: Store = client.store();
    // `Store` takes owned `Vec<u8>` keys, so the borrowed key is copied for each call.
    store.write(store_loc.to_vec(), b"test".to_vec(), None).await?;
    let _data: Option<Vec<u8>> = store.read(store_loc.to_vec()).await?;
    store.delete(store_loc.to_vec()).await?;

    // Interact with the vault.
    let slip10_generate = Slip10Generate::default().write_secret(
        Location::generic(vault_path.clone(), "record-1")
    );

    let _output: CollectedOutput = client.execute_procedure(slip10_generate).await??;

    let insert_vault = WriteToVault::new(vec![0; 64]).write_secret(
        Location::generic(vault_path.clone(), "record-2"),
    );

    let _output: _ = client.execute_procedure(insert_vault).await??;

    // Write the changes back into the snapshot.
    client.store_state(snapshot)?;

    // Persist the snapshot to disk.
    snapshot.write().await?;

    Ok(())
}

First we instantiate a new Stronghold. This is a collection of clients and snapshots and acts as a centralized supervisor type. A snapshot is identified by a SnapshotLocation which is a file path in this instance, but could be extended with other location types in the future. A KeyProvider stores the encryption key for a snapshot. Both together can be passed to a stronghold to load the snapshot from its storage location into memory. To get an already loaded snapshot, Stronghold::get_snapshot(location) could be used, without providing the KeyProvider again.

Clients are then created using the stronghold, and their state is loaded from a snapshot. To interact with the Client, we use the store (Store) and vault interface (Client::execute_procedure). We write, read and delete from a store, which has a hashmap-like interface. We interact with a vault on client-level. All operations on a vault are procedures, like generating a seed or writing data. To store the state from the client to storage, first the state is written back into the snapshot, and the snapshot is then persisted to storage.

P2P Example

async fn stronghold_p2p_receiver_example() -> Result<(), Box<dyn std::error::Error>> {
    let key_data: Vec<u8> = [0xff; 32].to_vec();

    let stronghold: Stronghold = Stronghold::new();

    let location: SnapshotLocation = SnapshotLocation::path("/path/to/file.stronghold");
    let key_provider: KeyProvider = KeyProvider::from_key("some-secure-key");

    let snapshot: &Snapshot = stronghold.load_snapshot(location, key_provider).await?;

    snapshot.spawn_p2p();
    snapshot.set_firewall_rule(Rule::AllowAll, Vec::new(), true).await?;
    snapshot.start_listening(None).await??;
    snapshot.join().await?;

    Ok(())
}

async fn stronghold_p2p_sender_example() -> Result<(), Box<dyn std::error::Error>> {
    let client_path: &[u8] = b"client";
    let store_loc: &[u8] = b"store-key";

    let snapshot: RemoteSnapshot = RemoteSnapshot::new(PeerId::from(...), Multiaddr::from(...)).await?;
    let client: RemoteClient = snapshot.client(client_path).await?;
    let store: RemoteStore = client.store();

    // `RemoteStore` takes owned `Vec<u8>` keys, so the borrowed key is copied for each call.
    store.write(store_loc.to_vec(), b"test".to_vec(), None).await?;
    let _data: Option<Vec<u8>> = store.read(store_loc.to_vec()).await?;
    store.delete(store_loc.to_vec()).await?;

    let slip10_generate = Slip10Generate::default().write_secret(
        Location::generic(b"vault", "record-1")
    );

    let _output: CollectedOutput = client.execute_procedure(slip10_generate).await??;

    Ok(())
}

Exposing a stronghold over libp2p is done on a snapshot-level, which means multiple snapshots can be exposed simultaneously. Exposing those interfaces on Stronghold instead would make it unclear which snapshot is network reachable.

The sender side has equivalent Remote-prefixed types for each of the local types. The RemoteSnapshot is used to connect to a network-reachable snapshot, identified by PeerId and Multiaddr. Then on the snapshot, a RemoteClient can be created by passing in the client_path. This client will be created on the remote stronghold, if the caller has permission to do so. Various policies can be used to customize the permissions, which are out of scope for this RFC. Once a RemoteClient has been obtained, a RemoteStore can be used to modify the store through an interface very similar to that of the regular Store. The RemoteClient also has the facilities to execute procedures over the network.

Reference-level explanation

SnapshotLocation

Stronghold currently has two concepts for providing the location of a stronghold file. These can be found in Stronghold::write_snapshot and Stronghold::read_snapshot. For example, the latter:

fn read_snapshot(..., filename: Option<String>, path: Option<PathBuf>) -> ...

If path is Some(p), then p is used. If it is None, filename is evaluated: if that is Some(name), a path is created in a default directory with name as the filename, and if it is None, the default name main is used in the default directory. So, if both filename and path are Some, filename is silently ignored. To make this last case unrepresentable and make the behavior more obvious to users, a SnapshotLocation type is introduced.

let named_file: SnapshotLocation = SnapshotLocation::named("test");
let path_file: SnapshotLocation = SnapshotLocation::path("~/.stronghold/my.stronghold");

Here, named uses the parameter as the name of the file in the default directory, and path is simply turned into a PathBuf. The read_snapshot function takes a single parameter of type SnapshotLocation, making the mentioned case unrepresentable. It implements Default, which is simply SnapshotLocation::named("main").
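One way to model this is as an enum with one variant per way of naming a snapshot. The following is only a sketch under assumptions: the variant names, the resolve helper and its default_dir parameter are hypothetical and not part of the proposal, and a named file is joined onto the default directory without any extension.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical sketch of where a snapshot lives. A caller picks exactly one
/// variant, so "both filename and path given" cannot be expressed.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum SnapshotLocation {
    /// A file name inside the default snapshot directory.
    Named(String),
    /// An explicit path chosen by the caller.
    Path(PathBuf),
}

impl SnapshotLocation {
    pub fn named(name: impl Into<String>) -> Self {
        SnapshotLocation::Named(name.into())
    }

    pub fn path(path: impl Into<PathBuf>) -> Self {
        SnapshotLocation::Path(path.into())
    }

    /// Resolve to a concrete path. `default_dir` is an assumption of this
    /// sketch; the real default directory is decided by stronghold.
    pub fn resolve(&self, default_dir: &Path) -> PathBuf {
        match self {
            SnapshotLocation::Named(name) => default_dir.join(name),
            SnapshotLocation::Path(path) => path.clone(),
        }
    }
}

impl Default for SnapshotLocation {
    /// The default snapshot is the file named `main` in the default directory.
    fn default() -> Self {
        SnapshotLocation::named("main")
    }
}
```

Deriving Eq and Hash lets the location double as the key under which a Stronghold keeps its loaded snapshots.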

It should be evaluated whether the default feature is used and makes sense. If multiple applications would use the default snapshot, they would also need to use the same password. If it is user provided, this might be fine. However, users might not be aware that applications write to the same snapshot and one of the applications could accidentally overwrite the state of the other one. Because of that, applications might simply avoid using the default snapshot, and if no application uses it, the feature can also be removed.

Snapshot

Ideally a Snapshot represents an open, read- and writable snapshot file, which means the password (i.e. the KeyProvider) is only required when creating it through Stronghold::load_snapshot.

Stronghold::load_snapshot reads the snapshot contents if the file exists, and loads a default state if it does not exist. An alternative would be not to load the state at construction-time and have a separate read method that restores the state of a snapshot from a file. That seems less desirable. The assumption is that in most cases one wants to restore the state from the file rather than starting with an empty state, so the default behavior should match that and Stronghold::load_snapshot should therefore restore the state. A clear_state method could be added, if deemed useful.

This approach through a Stronghold means no two snapshots can be created that point to the same non-existing file, but creating two Strongholds would allow for it. If both Snapshots write, the second write would overwrite the first. This is a race condition we choose not to prevent, because it is hard to do so consistently across OSs, and adds additional complexity. Note that the current interface also does not prevent this case.

The Snapshot::write method writes the current state to storage. This method does not require a password, because it is assumed that the password is stored securely through the KeyProvider within the Stronghold the snapshot was created from.

Ideally, the same Snapshot can be used from different threads. This can be facilitated through the Stronghold type, which would have to be cloneable. Then from different threads/tasks, Stronghold::load_snapshot can be called with the same SnapshotLocation to get access to the same snapshot. In identity.rs, for example, the typical use case is to open a single snapshot, in which multiple identities can be stored. However, each identity is an independent object, so those can be modified on different threads, which requires writing into stronghold from multiple threads.
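A minimal sketch of how a cloneable Stronghold could hand out the same snapshot to several threads is shown here. It makes several assumptions that are not part of the proposal: the snapshot is a plain byte buffer, the location is a string, the KeyProvider and file I/O are left out, everything is synchronous, and load_snapshot returns an Arc<Snapshot> rather than the &Snapshot used in the examples. The write_from_threads helper is hypothetical and only exists to exercise the sharing.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Stand-in for the real snapshot state; a byte buffer is enough here.
#[derive(Debug, Default)]
pub struct Snapshot {
    state: Mutex<Vec<u8>>,
}

impl Snapshot {
    pub fn append(&self, bytes: &[u8]) {
        self.state.lock().unwrap().extend_from_slice(bytes);
    }

    pub fn len(&self) -> usize {
        self.state.lock().unwrap().len()
    }
}

/// Cloning a `Stronghold` clones the handle, not the loaded snapshots.
#[derive(Clone, Default)]
pub struct Stronghold {
    snapshots: Arc<Mutex<HashMap<String, Arc<Snapshot>>>>,
}

impl Stronghold {
    pub fn new() -> Self {
        Self::default()
    }

    /// Returns the snapshot already loaded for `location`, or registers an
    /// empty default one (reading the file is omitted in this sketch).
    pub fn load_snapshot(&self, location: &str) -> Arc<Snapshot> {
        let mut snapshots = self.snapshots.lock().unwrap();
        snapshots.entry(location.to_owned()).or_default().clone()
    }
}

/// Spawns `workers` threads that each append one byte to the same snapshot
/// and returns the resulting snapshot length.
pub fn write_from_threads(stronghold: &Stronghold, location: &str, workers: usize) -> usize {
    let handles: Vec<_> = (0..workers)
        .map(|i| {
            let stronghold = stronghold.clone();
            let location = location.to_owned();
            thread::spawn(move || {
                stronghold.load_snapshot(&location).append(&[i as u8]);
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    stronghold.load_snapshot(location).len()
}
```

Because every clone shares the same map, two calls with the same location yield handles to the same snapshot, regardless of which thread makes the call.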

Client

A client is identified through a client_path, independent from a Snapshot and is created through Stronghold. Without writing the client to a snapshot (e.g. through Client::store_state), everything happens in-memory.

A client should be usable from different threads, i.e. it should be thread-safe. Its exact structure is omitted here, since it depends on the implementation of the underlying architecture which is the topic of RFC 1.

Ideally, Clients can easily be embedded in other structs, so they should be free of lifetime-bound references. That should make it easy to create bindings for other languages, too.

In the current interface, users don't have to explicitly save a client's state into the snapshot state before writing. The proposed interface makes this more explicit, at the cost of convenience.
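The explicit hand-over between client and snapshot can be illustrated with a small in-memory model. This is a sketch under assumptions, not the proposed implementation: ClientState, the write and read helpers and the synchronous, mutable-reference signatures are hypothetical, and a client's state is reduced to a key-value map.

```rust
use std::collections::HashMap;

/// Stand-in for a client's state; only a key-value store is modeled here.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ClientState {
    pub store: HashMap<Vec<u8>, Vec<u8>>,
}

/// Stand-in for a loaded snapshot: client states keyed by client path.
#[derive(Debug, Default)]
pub struct Snapshot {
    clients: HashMap<Vec<u8>, ClientState>,
}

#[derive(Debug)]
pub struct Client {
    path: Vec<u8>,
    state: ClientState,
}

impl Client {
    pub fn new(path: &[u8]) -> Self {
        Client { path: path.to_vec(), state: ClientState::default() }
    }

    /// Replace the in-memory state with the snapshot's copy, if one exists.
    pub fn load_state(&mut self, snapshot: &Snapshot) {
        if let Some(state) = snapshot.clients.get(&self.path) {
            self.state = state.clone();
        }
    }

    /// Copy the in-memory state into the snapshot. Until this is called,
    /// the snapshot does not see any of the client's changes.
    pub fn store_state(&self, snapshot: &mut Snapshot) {
        snapshot.clients.insert(self.path.clone(), self.state.clone());
    }

    pub fn write(&mut self, key: &[u8], value: &[u8]) {
        self.state.store.insert(key.to_vec(), value.to_vec());
    }

    pub fn read(&self, key: &[u8]) -> Option<&Vec<u8>> {
        self.state.store.get(key)
    }
}
```

In this model a second client with the same path only observes a write after the first client has called store_state and the second has called load_state, which is the explicitness the proposal trades against convenience.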

Store

Each client has a single store, which can be accessed with Client::store. The Store has a simple hashmap-like interface:

impl Store {
    pub async fn write(
        &self,
        key: Vec<u8>,
        payload: Vec<u8>,
        lifetime: Option<Duration>,
    ) -> StrongholdResult<Option<Vec<u8>>> { ... }
    pub async fn read(&self, key: Vec<u8>) -> StrongholdResult<Option<Vec<u8>>> { ... }
    pub async fn delete(&self, key: Vec<u8>) -> StrongholdResult<()> { ... }
}

The Store's lifetime could be constrained to the client:

pub struct Store<'client> {
    client: &'client Client,
}

The Store is just a conceptual representation of the interface, and does not hold any data itself. It uses the client to modify the actual store. Creating bindings for such a struct is basically impossible, so that struct would effectively have to be re-created in bindings, which is a downside.
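To make the borrowing relationship concrete, here is a simplified, synchronous sketch of a Store that holds nothing but a reference to its Client. The assumptions are labeled in the code: the client keeps its store in a Mutex-guarded HashMap, the lifetime parameter and StrongholdResult are left out, and write is assumed to return the previous value for the key, as HashMap::insert does.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical client that owns the actual store data.
#[derive(Default)]
pub struct Client {
    store: Mutex<HashMap<Vec<u8>, Vec<u8>>>,
}

/// A view onto the client's store; it holds no data of its own.
pub struct Store<'client> {
    client: &'client Client,
}

impl Client {
    pub fn store(&self) -> Store<'_> {
        Store { client: self }
    }
}

impl<'client> Store<'client> {
    /// Writes `payload` under `key`. Assumption: the return value is the
    /// previous payload for that key, if there was one.
    pub fn write(&self, key: Vec<u8>, payload: Vec<u8>) -> Option<Vec<u8>> {
        self.client.store.lock().unwrap().insert(key, payload)
    }

    pub fn read(&self, key: Vec<u8>) -> Option<Vec<u8>> {
        self.client.store.lock().unwrap().get(&key).cloned()
    }

    pub fn delete(&self, key: Vec<u8>) {
        self.client.store.lock().unwrap().remove(&key);
    }
}
```

The compiler rejects any use of a Store after its Client has been dropped, which is the guarantee the lifetime buys. The same lifetime is what makes the struct awkward to expose through bindings.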

Vault

While the vault is a stronghold primitive, it does not have its own type (being the only exception). Most interesting operations on a vault are procedures. Since a client can have multiple vaults, a procedure generally works on the client-level rather than the vault-level, which means a procedure can read and write across vaults. Changing procedures to only work in a single vault makes them less composable and more restricted, so this should be avoided. Hence, executing a procedure is done through Client::execute_procedure.

This leaves the question where the simpler vault operations are put:

1. write to vault
2. revoking a record
3. garbage collection
4. vault existence check
5. listing record ids and hints
6. copy record

1 through 3, as well as 6 could be turned into procedures themselves. 1, 2 and 6 have no useful return type, so this works with the existing return type of execute_procedure. This leaves the question whether the return type of garbage collection is useful. It is a bool to indicate whether the vault that was supposed to be cleaned was found. If that is important to check, users can use the vault existence check instead, meaning garbage collection can also be turned into a procedure with no output. However, since the return type of execute_procedure is fairly complex, many of the error paths it can represent cannot occur with these simple procedures. Because of that, an execute_single_procedure method could be added that simplifies the signature, but which can only run a single, non-chained procedure.
4 can be a method on the client, e.g. Client::vault_exists, or alternatively Client::is_vault_empty, which would be more consistent with naming in Rust's std library.
5 does not appear to be useful for users, because it returns RecordIds, but inserting a record requires a record path of type Vec<u8>. 5 is primarily useful for checking the number of records in a vault, or comparing vaults. No usage of that functionality was found in wallet.rs or identity.rs. Similarly, hints aren't used in the mentioned packages either. It seems to be only used for testing stronghold itself, so it should be feature-gated behind #[cfg(test)]. Moreover, this prevents leaking the internal RecordId type to the public interface. Whether obtaining the number of records in a vault or comparing vaults is useful is debatable. Overall, without exposing the *Id types to users, hints should be removed as users do not and cannot use them.
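The split described above, simple operations as output-less procedures plus an existence check on the client, could look roughly like the following sketch. Everything in it is a hypothetical stand-in: the SimpleProcedure and ProcedureError names, the plain HashMap used as a vault, and the fact that revoking removes a record immediately, which is why garbage collection is not modeled.

```rust
use std::collections::HashMap;

/// Stand-in for a vault: record path to secret bytes, without any guarding.
type Vault = HashMap<Vec<u8>, Vec<u8>>;

#[derive(Default)]
pub struct Client {
    vaults: HashMap<Vec<u8>, Vault>,
}

/// Operations 1, 2 and 6 from the list, expressed as procedures.
pub enum SimpleProcedure {
    WriteToVault { vault: Vec<u8>, record: Vec<u8>, data: Vec<u8> },
    RevokeRecord { vault: Vec<u8>, record: Vec<u8> },
    /// Source and target are (vault path, record path) pairs.
    CopyRecord { source: (Vec<u8>, Vec<u8>), target: (Vec<u8>, Vec<u8>) },
}

#[derive(Debug, PartialEq)]
pub enum ProcedureError {
    RecordNotFound,
}

impl Client {
    /// Runs one non-chained procedure; there is no output to collect.
    pub fn execute_single_procedure(
        &mut self,
        procedure: SimpleProcedure,
    ) -> Result<(), ProcedureError> {
        match procedure {
            SimpleProcedure::WriteToVault { vault, record, data } => {
                self.vaults.entry(vault).or_default().insert(record, data);
                Ok(())
            }
            SimpleProcedure::RevokeRecord { vault, record } => self
                .vaults
                .get_mut(&vault)
                .and_then(|v| v.remove(&record))
                .map(|_| ())
                .ok_or(ProcedureError::RecordNotFound),
            SimpleProcedure::CopyRecord { source, target } => {
                let data = self
                    .vaults
                    .get(&source.0)
                    .and_then(|v| v.get(&source.1))
                    .cloned()
                    .ok_or(ProcedureError::RecordNotFound)?;
                self.vaults.entry(target.0).or_default().insert(target.1, data);
                Ok(())
            }
        }
    }

    /// Operation 4: the existence check lives on the client.
    pub fn vault_exists(&self, vault: &[u8]) -> bool {
        self.vaults.contains_key(vault)
    }
}
```

Note that CopyRecord reads from one vault and writes into another, which only works because the procedure runs at the client level rather than the vault level.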

Remote Primitives

Generally, the remote types mirror the functionality of the local types. This section focuses on the differences. This is an overview of the remote API:

#[cfg(feature = "p2p")]
impl Snapshot {
    async fn spawn_p2p(&mut self) -> ...;
    async fn generate_p2p_keypair(&mut self) -> ...;
    // ... other p2p management functions
}

impl RemoteSnapshot {
    pub async fn new(stronghold_id: PeerId, stronghold_addr: Multiaddr) -> Result<Self>;
    pub async fn client(&self, client_path: &[u8]) -> Result<RemoteClient>;
}

impl RemoteClient {
    async fn store(&self) -> RemoteStore;
    async fn execute_procedure(&self, procedure: ...) -> ...;
}

impl RemoteStore {
    pub async fn write(&self, key: Vec<u8>, payload: Vec<u8>, lifetime: Option<Duration>) -> ...;
    pub async fn read(&self, key: Vec<u8>) -> ...;
    pub async fn delete(&self, key: Vec<u8>) -> ...;
}

The advantage of having a separate RemoteSnapshot, instead of adding these methods to the existing Snapshot, is that sender and receiver can be clearly separated on a conceptual level, following the principle of single responsibility. As a side effect, it also makes the Snapshot API more manageable. Moreover, compiling the RemoteSnapshot does not require compiling the stronghold core. This is especially relevant for users who only want to use remote strongholds and thus do not want to compile the entirety of stronghold, to reduce the size of the resulting binary. This would most likely require quite a bit of feature-gating throughout the library.

This API also assumes that one PeerId can be mapped to multiple ClientIds on the receiver side. Adding this would mean that local and remote strongholds can be designed with similar storage layouts -- how data is stored in stronghold -- in mind. For example, in identity.rs each identity will be mapped to one client in a local stronghold. When operating on an identity, only a single client needs to be loaded from the snapshot, improving efficiency. If the same storage layout can be used in remote strongholds, the same efficiency improvement applies, and it makes it easier for users to share code between local and remote stronghold implementations. They may even become interoperable, e.g. a remote stronghold can later be used locally, or vice versa, without changes. Generally, it means local and remote strongholds can be used in very similar ways, reducing the differences users need to be aware of and accommodate.

A potential drawback of this approach is that the connections are one-directional. The receiver cannot send commands to the sender and instead would have to open a separate connection with a RemoteSnapshot, while the sender would have to start listening with a Snapshot. If that use case should be easier, this approach would have to be modified.

Drawbacks

This proposal creates flexibility at the cost of convenience. In particular, loading and storing of Snapshots and Clients becomes explicit. The user has precise control over when state is moved from one to the other. Thanks to other RFCs, the interface becomes easier to use in some respects, like not having to provide the password for every snapshot interaction. Another important aspect, building on the software transactional memory RFC, is that it will be easier to use a Client from multiple threads and tasks.

To a degree, the Store is duplicated in the local and remote types. Since one of the implementations is sync, while the other is async, a unification of both would essentially mean the local interface needs to become async, too. That seems undesirable. Moreover, the store interface is limited in scope, so the duplication is unlikely to become a large maintenance burden.

Rationale and alternatives

This design maps the concept of stronghold primitives to the interface, thereby making the interface easier to read and reason about. The alternative is to keep the current interface approach, which has the mentioned downsides of being complex to understand and use, since all primitives are controlled through a single type. Not adopting the proposed approach effectively means that further easy-to-use wrappers will be created for identity.rs and wallet.rs, as was already the case in the past.

A different flavor of the presented interface was also proposed.

let stronghold = Stronghold::create(Snapshot::named("known-location"), key_provider).await?;
let snapshot: &Snapshot = stronghold.get_snapshot(Snapshot::named("known-location")).await?;

This proposes to change the constructor to take a snapshot and key provider directly, to initialize a Stronghold type with a snapshot. Since Stronghold is in part a collection type, storing clients and snapshots, this does not seem to follow the convention of other collection types like Vec or HashMap, whose constructors create an empty collection.
The proposal here is also to use Snapshot itself as the identifier of a snapshot. It seems unintuitive to create a Snapshot to only use it as an identifier, and then return a &Snapshot.

Unresolved questions

Does it make sense to turn all simple operations on a vault into procedures?
Is there opposition to feature-gating the entire crate into roughly core and p2p parts, so they can be compiled separately?

Future possibilities

Separation into local and remote types, as well as splitting the interface into its primitives, means the interface can be extended naturally. For example, modifications to the remote aspects only require changes in the remote types. New primitives or extensions are added to both local and remote types.
