This repository has been archived by the owner on Oct 31, 2024. It is now read-only.

chore: updating after review
Signed-off-by: Simon Paitrault <[email protected]>
Freyskeyd committed Nov 9, 2023
1 parent 378f810 commit 3b532f0
Showing 4 changed files with 27 additions and 24 deletions.
38 changes: 20 additions & 18 deletions crates/topos-tce-storage/README.md
@@ -29,21 +29,21 @@ As an overview, the storage layer is composed of the following stores:
As illustrated above, multiple `stores` are exposed in the library using various `tables`.

The difference between a `store` and a `table` is that the `table` is responsible for storing
the data while the `store` is responsible for managing the data access and its behavior.
the data while the `store` manages the data access and its behavior.
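
As a rough sketch of this split (the types, fields, and keys below are hypothetical, not the crate's actual API):

```rust
use std::sync::Arc;

// A `table` only knows how to persist bytes in one region of the keyspace.
struct CertificateTable {
    db: Arc<rocksdb::DB>,
}

impl CertificateTable {
    fn put(&self, key: &[u8], value: &[u8]) -> Result<(), rocksdb::Error> {
        self.db.put(key, value)
    }
}

// A `store` owns the access pattern: encoding, validation, ordering.
struct CertificateStore {
    table: CertificateTable,
}

impl CertificateStore {
    fn insert_certificate(&self, id: &str, payload: &[u8]) -> Result<(), rocksdb::Error> {
        // The store decides *how* the data is written; the table just writes it.
        self.table.put(id.as_bytes(), payload)
    }
}
```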

Here's the list of the different stores and their responsibilities:

- The [`EpochValidatorsStore`](struct@epoch::EpochValidatorsStore) is responsible for managing the list of validators for each `epoch`.
- The [`FullNodeStore`](struct@fullnode::FullNodeStore) is responsible for managing all the persistent data such as [`Certificate`] delivered and associated `streams`.
- The [`IndexStore`](struct@index::IndexStore) is responsible for managing indexes in order to collect information about the broadcast and the network.
- The [`FullNodeStore`](struct@fullnode::FullNodeStore) is responsible for managing all persistent data, such as delivered [`Certificate`]s and associated `streams`.
- The [`IndexStore`](struct@index::IndexStore) is responsible for managing indexes and collecting information about the broadcast and the network.
- The [`ValidatorStore`](struct@validator::ValidatorStore) is responsible for managing the pending data that one validator needs to keep track of, such as the certificates pool.

For more information about a `store`, see the related doc.

Next, here is the list of the different tables and their responsibilities:

- The [`EpochValidatorsTables`](struct@epoch::EpochValidatorsTables) is responsible for storing the list of validators for each `epoch`.
- The [`ValidatorPerpetualTables`](struct@validator::ValidatorPerpetualTables) is responsible for storing the [`Certificate`] delivered and all the persistent data related to the Broadcast.
- The [`ValidatorPerpetualTables`](struct@validator::ValidatorPerpetualTables) is responsible for storing the delivered [`Certificate`]s and the persistent data related to the Broadcast.
- The [`ValidatorPendingTables`](struct@validator::ValidatorPendingTables) is responsible for storing the pending data, such as the certificates pool.
- The [`IndexTables`](struct@index::IndexTables) is responsible for storing indexes about the delivery of [`Certificate`]s such as `target subnet stream`.

@@ -52,33 +52,35 @@ Next, here is the list of the different tables and their responsibilities:
When using the storage layer, be aware of the following:
- The storage layer uses [rocksdb](https://rocksdb.org/) as the backend, which means you don't need an external service, as `rocksdb` is an embedded key-value store.
- The storage layer uses [`Arc`](struct@std::sync::Arc) to share the stores between threads. It also means that a `store` is only instantiated once.
- Some functions are batching multiple writes in one transaction. But not all functions are using it.
- Some storage methods batch multiple writes into a single transaction (see the sketch below).
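
A minimal sketch of that batching with the `rocksdb` crate (the path and keys are made up for illustration):

```rust
use rocksdb::{DB, Options, WriteBatch};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut opts = Options::default();
    opts.create_if_missing(true);
    // Embedded: the database is just a local directory, no external service.
    let db = DB::open(&opts, "/tmp/example-db")?;

    // Several writes applied atomically as one batch.
    let mut batch = WriteBatch::default();
    batch.put(b"certificate/1", b"payload-1");
    batch.put(b"certificate/2", b"payload-2");
    batch.delete(b"certificate/0");
    db.write(batch)?;

    Ok(())
}
```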

### Design Philosophy

The choice of using [rocksdb](https://rocksdb.org/) as a backend was made because it is a well known and battle tested database.
It is also very fast and efficient when it comes to write and read data.
The choice of using [rocksdb](https://rocksdb.org/) as a backend was made because it matches many of the conditions
we expected, such as being embedded and offering good performance when reading and
writing our data.

Multiple `stores` and `tables` exists in order to allow admin to deal with backups or
Splitting storage into multiple `stores` and `tables` allows admins to deal with backups or
snapshots as they see fit. You can pick and choose which `tables` you want to back up without having to back up the whole database.

By splitting the data in dedicated tables we define strong separation of concern
directly in our storage.
Strong separation of concerns is achieved directly at the storage level by splitting the data into
dedicated tables.

`RocksDB` is however not the best fit when it comes to compose or filter data based on the data
However, `RocksDB` is not the best fit when it comes to composing or filtering data based on the data
itself.

For complex queries, another database like [`PostgreSQL`](https://www.postgresql.org/) or [`CockroachDB`](https://www.cockroachlabs.com/) could be used as storage for projections.
The source of truth would still be [rocksdb](https://rocksdb.org/) but the projections would be stored in a relational database, allowing for more complex queries.
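
One hedged sketch of such a projection, using the `postgres` crate (the schema, certificate fields, and connection string are invented for illustration; nothing like this exists in the crate):

```rust
use postgres::{Client, NoTls};

// Project a delivered certificate (read from rocksdb, the source of truth)
// into a relational table that supports richer queries.
fn project_certificate(
    client: &mut Client,
    cert_id: &str,
    source_subnet: &str,
    target_subnet: &str,
) -> Result<(), postgres::Error> {
    client.execute(
        "INSERT INTO certificate_projection (cert_id, source_subnet, target_subnet)
         VALUES ($1, $2, $3)
         ON CONFLICT (cert_id) DO NOTHING",
        &[&cert_id, &source_subnet, &target_subnet],
    )?;
    Ok(())
}

fn main() -> Result<(), postgres::Error> {
    let mut client = Client::connect("host=localhost user=postgres", NoTls)?;
    project_certificate(&mut client, "cert-1", "subnet-a", "subnet-b")?;
    Ok(())
}
```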

As mention above, the different stores are using [`Arc`](struct@std::sync::Arc), allowing a single store to be instantiated once
and then shared between threads. This is very useful when it comes to the [`FullNodeStore`](struct@fullnode::FullNodeStore) as it is used in various places but need to provides single entrypoint to the data.
As mentioned above, the different stores are using [`Arc`](struct@std::sync::Arc), allowing a single store to be instantiated once
and then shared between threads. This is very useful when it comes to the [`FullNodeStore`](struct@fullnode::FullNodeStore) as it is used
in various places but should provide a single entry point to the data.
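
Concretely, that sharing is plain `Arc` cloning; a generic sketch (the store struct here is a stand-in, not the real `FullNodeStore`):

```rust
use std::sync::Arc;
use std::thread;

// Stand-in for a store; the real ones are built by the crate itself.
struct SharedStore { /* tables, locks, ... */ }

fn main() {
    // Instantiated once...
    let store = Arc::new(SharedStore {});

    // ...then cheaply cloned into every thread that needs it.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let store = Arc::clone(&store);
            thread::spawn(move || {
                // Every thread reads through the same shared instance.
                let _ = &store;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
```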

It also means that the store is immutable, which is a good thing when it comes to concurrency.

The burden of managing the locks is handled by the [`async_trait`](https://docs.rs/async-trait/0.1.51/async_trait/) crate when using the [`WriteStore`](trait@store::WriteStore).

The locks are responsible for preventing any other query to mutate the data currently in processing. For more information about the locks see [`locking`](module@fullnode::locking)
It also means that the store is immutable and thus can be shared easily between threads,
which is a good thing for concurrency.
However, some stores implement the [`WriteStore`](trait@store::WriteStore) trait in order to
insert or mutate data, managing locks on resources and preventing any other query from mutating
the data currently being processed. For more information about the locks, see [`locking`](module@fullnode::locking).
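
A minimal sketch of that per-resource locking, assuming `async_trait` and `tokio` (the trait and type names are simplified stand-ins for the crate's actual `WriteStore` and `locking` module):

```rust
use std::collections::HashMap;
use std::sync::Arc;

use async_trait::async_trait;
use tokio::sync::Mutex;

type CertificateId = String;

#[async_trait]
trait WriteStore {
    async fn insert_certificate(&self, id: CertificateId, payload: Vec<u8>);
}

struct Store {
    // One lock per resource: writers to the *same* certificate serialize,
    // while writers to different certificates proceed in parallel.
    locks: Mutex<HashMap<CertificateId, Arc<Mutex<()>>>>,
}

#[async_trait]
impl WriteStore for Store {
    async fn insert_certificate(&self, id: CertificateId, _payload: Vec<u8>) {
        // Fetch (or create) the lock guarding this certificate...
        let lock = {
            let mut locks = self.locks.lock().await;
            locks.entry(id).or_default().clone()
        };
        // ...and hold it for the duration of the write, so no other
        // query can mutate the same data while it is being processed.
        let _guard = lock.lock().await;
        // ... perform the actual rocksdb write here ...
    }
}
```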

The rest of the data mutations are handled by [rocksdb](https://rocksdb.org/) itself.

2 changes: 1 addition & 1 deletion crates/topos-tce-storage/src/lib.rs
@@ -72,7 +72,7 @@
//!
//! As mentioned above, the different stores are using [`Arc`](struct@std::sync::Arc), allowing a single store to be instantiated once
//! and then shared between threads. This is very useful when it comes to the [`FullNodeStore`](struct@fullnode::FullNodeStore) as it is used
//! in various places but need to provides single entrypoint to the data.
//! in various places but should provide a single entry point to the data.
//!
//! It also means that the store is immutable and thus can be shared easily between threads,
//! which is a good thing for concurrency.
2 changes: 1 addition & 1 deletion crates/topos-tce-storage/src/store.rs
@@ -91,7 +91,7 @@ pub trait ReadStore: Send {
limit: usize,
) -> Result<Vec<(CertificateDelivered, CertificateTargetStreamPosition)>, StorageError>;

/// Returns the list of source subnet that delivered certificates to a target subnet
/// Returns the list of source subnets that delivered certificates to a particular target subnet
fn get_target_source_subnet_list(
&self,
target_subnet_id: &SubnetId,
9 changes: 5 additions & 4 deletions crates/topos-tce-storage/src/validator/mod.rs
@@ -1,6 +1,6 @@
//! Validator's context store and storage
//!
//! The [`ValidatorStore`] is responsible for managing the different data that are required by the
//! The [`ValidatorStore`] is responsible for managing the various kinds of data that are required by the
//! TCE network in order to broadcast certificates. It is composed of two main parts:
//!
//! - a [`FullNodeStore`]
@@ -12,7 +12,7 @@
//! different pending pools and to manage them but also to access the [`FullNodeStore`] in order to
//! persist or update [`Certificate`] or `streams`.
//!
//! Pending pools and how they behave is decribed in the [`ValidatorPendingTables`] documentation.
//! Pending pools and their behavior are described in the [`ValidatorPendingTables`] documentation.
//!
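//! Read as plain Rust, that two-part composition might look roughly like the
//! following (field names are a guess based on the description above, not the
//! actual definition):
//!
//! ```ignore
//! pub struct ValidatorStore {
//!     pending_tables: ValidatorPendingTables,
//!     fullnode_store: Arc<FullNodeStore>,
//! }
//! ```
//!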
use std::{
collections::HashMap,
@@ -49,7 +49,8 @@ mod tables;
/// The [`ValidatorStore`] is composed of a [`FullNodeStore`] and a [`ValidatorPendingTables`].
///
/// As the [`FullNodeStore`] is responsible for keeping and managing data that are persistent,
/// the [`ValidatorStore`] is delegating many of the [`WriteStore`] and [`ReadStore`] to it.
/// the [`ValidatorStore`] delegates to it much of the [`WriteStore`] and [`ReadStore`]
/// functionality.
///
/// The key point is that the [`ValidatorStore`] is managing the different pending pools using a [`ValidatorPendingTables`].
///
@@ -105,7 +106,7 @@ impl ValidatorStore {
Ok(self.pending_tables.pending_pool.get(pending_id)?)
}

/// Returns the entier pending_pool
/// Returns the entire pending_pool
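/// A hypothetical usage sketch (assuming a `store: ValidatorStore` in scope):
///
/// ```ignore
/// let pending = store.get_pending_certificates()?;
/// for (pending_id, certificate) in pending {
///     // inspect or re-submit each pending certificate
/// }
/// ```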
pub fn get_pending_certificates(
&self,
) -> Result<Vec<(PendingCertificateId, Certificate)>, StorageError> {

